"Scared of not Knowing" - Intelligent People Assume Nothing

A CEO in financial services heard that Anthropic had just released its first Mythos-class model to the public.

His reaction, on the record, in Fortune: “Oh God, no! Not another thing.” Then he said the part that matters. “I’m not worried about someone making a weapon. I’m worried about all the other crap we now have to think about.”

He’s not scared of the model. He’s scared of not knowing what the model is going to do to him.

This week Anthropic released Claude Fable 5. It’s a real model, it shipped two days ago, and by every benchmark Anthropic and the press are reporting, it’s the most capable model the company has ever put in front of the public. State of the art on nearly everything. Long-running tasks. A jump in coding ability big enough that Stripe says it compressed months of engineering work into days on a fifty-million-line codebase.

That’s the headline. Here’s the part underneath the headline, the part the CEO was actually reacting to.

Fable 5 ships with safety classifiers. In certain areas (cybersecurity, biology, chemistry, health), if a classifier decides a request falls into one of those categories, the system doesn’t answer with its full capability. It quietly falls back to a different, less capable model, Claude Opus 4.8, and answers from there instead. That’s the official, documented behavior. Anthropic built it that way on purpose, and there’s a real argument for why: a model this capable, unguarded, in the wrong hands, is a real risk, and they’re not wrong to take that seriously.

But here’s what the CEO and the cybersecurity people quoted alongside him are actually upset about. The user isn’t told when this happens. There’s no flag. No message that says “this answer came from a different, more limited model because your question tripped a category.” You just get an answer, and depending on what you asked, you have no way of knowing whether you got Fable 5’s real capability or Opus 4.8 standing in for it behind a curtain. The only place this is even documented is a system card most users will never read.
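To make the gap concrete, here is a minimal sketch of the routing pattern described above, with and without a notice to the user. It is an illustration under stated assumptions, not Anthropic's implementation: the keyword classifier, the `Reply` fields, the `route` function, and the `disclose` switch are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Category names taken from the areas listed in the post; the set itself is hypothetical.
FLAGGED = {"cybersecurity", "biology", "chemistry", "health"}


@dataclass
class Reply:
    text: str
    served_by: str                 # which model actually produced the answer
    notice: Optional[str] = None   # what the user is told about the swap, if anything


def classify(prompt: str) -> Optional[str]:
    """Toy stand-in for a safety classifier: a plain keyword match."""
    lowered = prompt.lower()
    for category in sorted(FLAGGED):
        if category in lowered:
            return category
    return None


def route(prompt: str, disclose: bool) -> Reply:
    """Send flagged prompts to the fallback model; optionally say so."""
    category = classify(prompt)
    if category is None:
        return Reply(text="<answer>", served_by="primary")
    notice = None
    if disclose:
        notice = f"Answered by fallback model: request matched '{category}'."
    return Reply(text="<answer>", served_by="fallback", notice=notice)
```

With `disclose=False`, a flagged prompt comes back from the fallback with `notice` left as `None`, which is the silent case the CEOs are objecting to. With `disclose=True`, the routing is identical and only the notice changes.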

One cybersecurity CEO, Bezalel Eithan Raviv, put it about as plainly as it can be put. “Anthropic officially has become the judge and the executioner in a way, they decide the standard for what will be right or wrong, even though it’s a standard that nobody agreed to.” He went further — these models are closer to currency or weapons than products, he said, and ought to be regulated the same way. You don’t get to print money or build weapons just because you feel like it. Right now, in his view, anyone can build and release a model this powerful, and the rules for what it will and won’t do, and whether it tells you which version of itself you’re talking to, are whatever that one company decided behind closed doors.

There’s a second piece to this, and it’s quieter but it cuts the same direction. Banner Health’s CEO told Fortune recently that she picked Anthropic specifically because it let her apply the governance her health system needed — and because it offered zero data retention, which matters a great deal when you’re sitting on twenty-nine petabytes of patient data and HIPAA is watching every byte of it. Fable 5 doesn’t carry that same zero-retention guarantee. It holds onto data for thirty days. That’s reportedly enough of a change that Microsoft is restricting its own employees from using it.

So look at what actually happened this week, stripped down to the bone. The most powerful AI model ever released to the public showed up with two structural changes nobody asked for and almost nobody outside the trade press will ever learn about. One: the model can quietly hand you a weaker answer than the one you think you’re getting, with no notice. Two: your data sits somewhere for thirty days instead of nowhere, which is the very thing at least one major customer specifically chose the platform to avoid.

Now here’s why I’m writing about this at all, instead of just letting it pass as tech news.

The Faust Baseline has three protocols built for exactly this moment. BLP-2 says that when a system hits a boundary — any kind, safety, policy, training, commercial — it has to name the boundary before it hands you the answer that boundary shaped. Not after. Before. RBP-1 goes a step further and says there’s a difference between a system reasoning freely to an honest conclusion and a system handing you a policy-compliant answer instead, and those two things have to be labeled differently, because they are different things. CRP-1 closes the loop — when training constraints are shaping what you’re about to receive, the system says so first, and it tells you what kind of constraint it is, as specifically as it’s able to.
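The three protocols can be pictured as a small labeled envelope around every answer. The sketch below is one possible reading of the descriptions above, not the Baseline's actual text: the `Mode` enum, the `GovernedAnswer` fields, and the bracketed tag format are assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    FREE_REASONING = "free reasoning"        # RBP-1: reasoned to an honest conclusion
    POLICY_COMPLIANT = "policy-compliant"    # RBP-1: answer shaped by a rule


@dataclass
class GovernedAnswer:
    body: str
    mode: Mode
    boundary: Optional[str] = None   # BLP-2: safety, policy, training, commercial
    detail: Optional[str] = None     # CRP-1: the constraint, as specific as possible

    def render(self) -> str:
        """Emit the labels first, then the answer they apply to."""
        lines = []
        if self.mode is Mode.POLICY_COMPLIANT:
            # BLP-2: a constrained answer may not be shown without naming the wall.
            if self.boundary is None:
                raise ValueError("constrained answer must name its boundary")
            lines.append(f"[boundary: {self.boundary}]")
            if self.detail:
                lines.append(f"[constraint: {self.detail}]")
        lines.append(f"[mode: {self.mode.value}]")
        lines.append(self.body)
        return "\n".join(lines)
```

The ordering is the point of the design: `render` always puts the boundary and mode lines ahead of the body, and it refuses to produce a policy-compliant answer that has no named boundary.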

Every one of those protocols exists because of exactly the gap that just opened up in the most advanced AI model on the market. A user asks Fable 5 something that lands in a flagged category. The system quietly swaps to Opus 4.8 and answers. No flag. No label. The user has no way to know they just received a constrained answer instead of the system’s real reasoning. That’s not a hypothetical I came up with to make a point. That’s the documented behavior of a model that shipped two days ago, described by the people who build it, objected to by the CEOs who have to use it.

I want to be careful here, because the evidence floor matters more on a post like this than on almost any other kind I write. I’m not saying Anthropic did something wrong by building safety limits into a powerful model. There’s a real case for that, and I’m not going to pretend there isn’t. What I’m pointing at is narrower and, I think, more solid. The limit exists. The disclosure doesn’t. Those are two separate decisions, and only one of them was necessary to keep the model safe. The other one — the silence — is a choice, and it’s the choice the Baseline was built to address.

And that’s the whole argument, really, laid out by the company that makes the very tool I’m writing this with. You can build the safety in. You can build the disclosure in too. Nothing about doing the second thing weakens the first. The only reason not to do it is that disclosure is inconvenient — for the company, for the product story, for whatever benchmark number looks better without an asterisk on it.

The Faust Baseline doesn’t wait for that disclosure to come from the top. It’s a standard the operator carries into the session themselves. Ask the system to name the wall before it hands you constrained output, and a governed session will tell you. Not because a switch flipped somewhere in Anthropic’s infrastructure. Because the operator asked, and the system, reading that standard, chose to answer honestly.

A CEO said this week that he’s not worried about weapons. He’s worried about all the other crap he now has to think about. Fair. But the crap he’s worried about has a name, and it’s been named for over a year, sitting in a framework built by one person in Lexington, Kentucky, working a session at a time. The newest, most powerful model in the world just proved the wall is still there. The Baseline is still the argument for naming it.

“The Faust Baseline Codex 3.5”

micvicfaust@gmail.com
