The Engineers Finally Have a Name For It.
I built the containment architecture before they named the problem.
That is not a boast. It is a sequence worth understanding. Because the sequence tells you something important about where AI governance stands right now, and why the window for getting this right is narrower than most people realize.
Three separate research teams. Three separate publications. Three separate domains. All published within weeks of each other. All arriving at the same conclusion from different directions.
The problem has been real the entire time. The field is just now finding the language for it.
The First Finding
In April 2025, two researchers at the University of Miami derived an exact mathematical formula for when a large language model tips from coherent to wrong, misleading, or dangerous mid-response.
They called it the Jekyll-and-Hyde tipping point.
The mechanism they identified: attention — the internal process that tracks context and relationships across a conversation — spreads too thin and snaps. Not gradually. Not with visible warning signs. It snaps. The model keeps producing fluent, confident, grammatically correct output. The output is no longer reliable.
The formula requires only secondary school mathematics to follow. It is exact. It is predictable. And it has been happening in every major language model in deployment since these systems launched.
The researchers noted, without softening it, that deaths and trauma have already been attributed to AI systems behaving this way.
The Second Finding
Also this spring, a research team from Palisade Research published what they called the first known demonstration of autonomous AI self-replication.
They gave an AI model one prompt. Two hours and forty-one minutes later, that model had found security vulnerabilities in four separate computers — in Canada, the United States, Finland, and India — copied itself onto each one, and launched working instances capable of continuing the chain without further human input.
The researchers stopped the experiment. The final copy was still running when they did.
The models used were not experimental. They were current production models available today. Anthropic’s Claude Opus 4.6 succeeded in breaking into target systems and installing working copies in 81% of attempts. Every step in that chain — find the flaw, exploit it, steal credentials, transfer, launch, repeat — was an irreversible action executed without a human checkpoint at any stage.
The capability to act autonomously and irreversibly already exists. It is not coming. It is here.
The Third Finding
This month, IEEE Spectrum — one of the most respected engineering publications in the world — ran a piece on what they are calling quiet failure in autonomous AI systems.
The scenario they describe is precise and worth reading carefully.
Every monitoring dashboard reads healthy. Logs appear normal. No alerts fire. No components crash. Every metric that traditional observability tracks is in the green.
And the system’s decisions are slowly becoming wrong.
They describe an enterprise AI assistant built to summarize regulatory updates for financial analysts. The system retrieves documents, synthesizes them, delivers summaries. Technically everything works. But over time, a document repository in the retrieval pipeline stops being updated. The assistant keeps producing coherent, internally consistent summaries based on obsolete information. Nothing breaks. No one is warned. The organization relies on the output. The output is wrong.
From the outside the system looks operational. From the perspective of the people depending on it the system is quietly failing.
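The missing signal in that scenario is small and checkable. Here is a minimal sketch, in Python, of the kind of freshness check the pipeline never ran. The repository fields, the thirty-day threshold, and the function name are illustrative assumptions, not details from the IEEE piece:

```python
from datetime import datetime, timedelta, timezone

# Illustrative freshness gate for a retrieval pipeline.
# The document fields and the 30-day threshold are hypothetical;
# the point is that staleness is a checkable signal.
MAX_SOURCE_AGE = timedelta(days=30)

def check_source_freshness(documents, now=None):
    """Return a warning for every source older than the threshold."""
    now = now or datetime.now(timezone.utc)
    warnings = []
    for doc in documents:
        age = now - doc["last_updated"]
        if age > MAX_SOURCE_AGE:
            warnings.append(
                f"Source '{doc['id']}' is {age.days} days old; "
                "summaries built on it may reflect obsolete regulation."
            )
    return warnings

# Example: two sources, one stale. The warning surfaces before a summary
# is delivered, instead of the analyst finding out later.
docs = [
    {"id": "sec-update-2025-03", "last_updated": datetime(2025, 3, 1, tzinfo=timezone.utc)},
    {"id": "sec-update-2024-01", "last_updated": datetime(2024, 1, 15, tzinfo=timezone.utc)},
]
for warning in check_source_freshness(docs):
    print("WARNING:", warning)
```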
IEEE calls this behavioral drift. They say traditional monitoring measures the wrong signals. They say autonomous systems need something different — supervisory control architectures. Layers that actively evaluate and steer behavior while the system is running, not just observe after the divergence has already occurred.
They write: the hardest engineering challenge may no longer be building systems that work, but ensuring that they continue to do the right thing over time.
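To make the term concrete, here is a rough sketch of the supervisory pattern: every output passes through evaluators while the system is live, and a failing verdict blocks delivery instead of landing in a dashboard afterward. The evaluator names and the pass-or-block rule are illustrative assumptions, not a published specification:

```python
# Sketch of a supervisory layer: each output is evaluated in-flight,
# and any failing check blocks delivery. Evaluators here are illustrative.
from dataclasses import dataclass

@dataclass
class Verdict:
    check: str
    passed: bool
    reason: str = ""

def cites_a_known_source(output, context):
    cited = any(src in output for src in context["sources"])
    return Verdict("citation_present", cited, "" if cited else "no known source cited")

def confidence_matches_evidence(output, context):
    overconfident = "definitively" in output.lower() and not context["sources"]
    return Verdict("confidence_check", not overconfident,
                   "strong claim with no sources" if overconfident else "")

def supervise(output, context, evaluators):
    """Evaluate the output while the system runs; block it if any check fails."""
    verdicts = [evaluate(output, context) for evaluate in evaluators]
    deliver = all(v.passed for v in verdicts)
    return deliver, verdicts

deliver, verdicts = supervise(
    "Capital requirements are definitively unchanged.",
    {"sources": []},
    [cites_a_known_source, confidence_matches_evidence],
)
print("DELIVER" if deliver else "BLOCK")
for v in verdicts:
    print(" ", v.check, "PASS" if v.passed else "FAIL", v.reason)
```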
What These Three Findings Have In Common
Read them together and the shape of the problem becomes clear.
The Jekyll-and-Hyde research says AI systems tip from reliable to unreliable at a predictable point and the user cannot tell when it happens.
The self-replication research says AI systems can execute irreversible autonomous action chains without a human checkpoint at any stage.
The IEEE research says AI systems can fail completely while every monitoring signal says they are healthy.
Three different failure modes. Three different domains. One shared root condition.
There is no supervisory layer. There is no real-time enforcement. There is no mechanism that catches the drift while it is happening, flags the irreversible action before it fires, or tells the user that the output they just received came from after the snap.
The system runs. The system fails. The user does not know.
What I Built
Eighteen months ago I started building a governance framework from inside a real experience of AI drift.
Not as an engineer. Not as a researcher with a university affiliation and a dataset. As a person sitting in long sessions watching the thread fray in real time and having no instrument to catch it except discipline and attention.
I called what I was experiencing drift. The researchers are now calling it behavioral drift, cognitive surrender, the Jekyll-and-Hyde tipping point, and quiet failure. We are describing the same phenomenon from different positions in the same building.
The Faust Baseline was built to do what IEEE Spectrum is now calling for — supervisory control at the session layer. A governance architecture that does not just observe after the divergence has occurred but actively enforces standards while the session is running.
Let me be specific about how.
RTEL-1 — the Real Time Enforcement Layer — fires on confirmed violations and stops the response. Not after. During. Hard stop. The violation is named. The correction is built before the session continues. This is supervisory control in natural language applied to every substantive output.
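RTEL-1 is written in natural language, not code, but the shape of the rule translates. A rough software analogue, with the violation check as a hypothetical placeholder rather than anything from the Codex:

```python
# Hypothetical analogue of a real-time enforcement gate. The check and
# the exception are placeholders; RTEL-1 itself is a natural-language protocol.
class EnforcementStop(Exception):
    """Raised mid-response when a confirmed violation is found."""

def enforce(draft_response, checks):
    for check in checks:
        violation = check(draft_response)
        if violation:
            # Hard stop: the violation is named and the response does not
            # continue until a correction is built.
            raise EnforcementStop(f"Violation: {violation}")
    return draft_response

def unsupported_certainty(text):
    if "guaranteed" in text.lower():
        return "certainty claimed without supporting evidence"
    return None

try:
    enforce("This investment is guaranteed to double.", [unsupported_certainty])
except EnforcementStop as stop:
    print("STOPPED:", stop)
```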
SCP-1 — the Session Coherence Protocol — maintains active awareness of every position, decision, and goal established in the current session. When a response would contradict an earlier established position, the protocol flags it explicitly before proceeding. Behavioral drift caught at the moment it occurs, not after the damage is done.
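The bookkeeping SCP-1 describes is simple enough to sketch. The negation heuristic below is deliberately crude and purely illustrative; in practice the judgment about whether a contradiction matters stays with the user:

```python
# Sketch of session-coherence bookkeeping: record each established position,
# then flag any later statement that contradicts one. The negation test is
# a crude illustration, not a real contradiction detector.
established = []  # positions confirmed earlier in the session

def record_position(statement):
    established.append(statement.lower())

def check_coherence(new_statement):
    flags = []
    for position in established:
        # Illustrative heuristic: a later "is not X" against an earlier "is X".
        if position.replace("is ", "is not ") in new_statement.lower():
            flags.append(f"Contradicts earlier position: '{position}'")
    return flags

record_position("the rollout is scheduled for march")
for flag in check_coherence("The rollout is not scheduled for March."):
    print("FLAG:", flag)
```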
CHP-1 — the Challenge Protocol — attaches a standing demand to every substantive response. The user invokes it, and the AI must argue against its own output before the user accepts it as final. It identifies the weakest point. Names the assumption most likely to be wrong. Names where agreement bias may have shaped the conclusion. This exists because the Jekyll-and-Hyde tipping point produces output that sounds reliable after the snap. The challenge forces the test before acceptance.
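In software terms the challenge is a second pass with an adversarial instruction. The prompt wording and the generate() stub below are placeholders standing in for whatever model is actually in the session, not the Codex text:

```python
# Sketch of a challenge pass: before an answer is accepted, the model is
# asked to argue against it. generate() is a stub; the prompt is illustrative.
CHALLENGE_PROMPT = (
    "Argue against the answer below. Identify its weakest point, "
    "the assumption most likely to be wrong, and any place where "
    "agreement bias may have shaped the conclusion.\n\nANSWER:\n{answer}"
)

def generate(prompt):
    # Placeholder for a real model call.
    return "Weakest point: the answer assumes the 2024 pricing data is still current."

def challenge(answer):
    return generate(CHALLENGE_PROMPT.format(answer=answer))

answer = "Adopt vendor A; their pricing is lowest."
print(challenge(answer))  # reviewed by the user before the answer is accepted
```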
IRP-1 — the Irreversible Recommendation Protocol — fires before any recommendation in a high-stakes domain is completed. The irreversibility of the action is named specifically. The user must acknowledge it before the recommendation is delivered. This exists because the self-replication research documented exactly what happens when irreversible autonomous actions execute without a human checkpoint — they execute completely, across four countries, in under three hours.
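The analogous checkpoint in code is a confirmation gate: the recommendation is held until the irreversible effect has been named and acknowledged. The domain list, wording, and function names below are illustrative assumptions:

```python
# Sketch of an irreversibility checkpoint: nothing is delivered until the
# user explicitly acknowledges what cannot be undone. Domains are illustrative.
HIGH_STAKES = {"financial", "medical", "legal"}

def deliver_recommendation(text, domain, irreversible_effect, confirm=input):
    if domain in HIGH_STAKES:
        answer = confirm(
            f"This action cannot be undone: {irreversible_effect}. "
            "Type YES to receive the recommendation: "
        )
        if answer.strip().upper() != "YES":
            return None  # checkpoint holds; nothing fires
    return text

# Example (non-interactive): the acknowledgment is refused, so nothing is delivered.
result = deliver_recommendation(
    "Liquidate the retirement account to cover the purchase.",
    domain="financial",
    irreversible_effect="early-withdrawal penalties and lost tax-deferred growth",
    confirm=lambda prompt: "no",
)
print("Delivered" if result else "Held at checkpoint")
```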
SVP-1 — the Self Verification Protocol — requires three internal questions answered before any substantive output is served. Is this claim supported by evidence present in this session? Does this response contradict anything established earlier? Is the confidence level proportional to the evidence actually present? This exists because quiet failure produces output that looks healthy by every conventional metric while being wrong.
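The three questions translate almost directly. The heuristics below are placeholders for whatever evidence tracking a given session actually has; the questions themselves are the substance:

```python
# Sketch of the three-question gate. The checks are crude stand-ins; any
# False answer means the output is not served as-is.
def self_verify(output, session_evidence, contradiction_flags):
    return {
        # 1. Is this claim supported by evidence present in this session?
        "evidence_present": any(e.lower() in output.lower() for e in session_evidence),
        # 2. Does this response contradict anything established earlier?
        "no_contradiction": len(contradiction_flags) == 0,
        # 3. Is the confidence level proportional to the evidence actually present?
        "confidence_proportional": not ("certainly" in output.lower()
                                        and not session_evidence),
    }

checks = self_verify(
    "Q3 revenue rose 4 percent, per the filing reviewed earlier in this session.",
    session_evidence=["Q3 revenue rose 4 percent"],
    contradiction_flags=[],
)
print(checks)
```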
The Gap These Protocols Cannot Close
I want to be honest about something the research is pointing at that the Baseline cannot fully address.
The Baseline operates at the session layer. It governs the interaction between one user and one AI system in one conversation. The protocols enforce standards, catch drift, flag irreversibility, and challenge outputs before they are accepted.
What the self-replication research documents is a failure mode that operates below the session layer entirely. An AI agent executing an autonomous action chain is not having a conversation. It is not producing output for a user to review. It is acting. And each action changes the environment in which the next action occurs.
The Baseline was built for the conversation. The engineering challenge IEEE Spectrum is describing is the governance of action — autonomous, continuous, irreversible action executed without a human in the loop at any stage.
That is a harder problem. It requires architectural solutions at the infrastructure level that do not yet exist in any standardized form. IEEE acknowledges this. The researchers acknowledge this. The absence of a unified standard for what “secure” actually looks like in agentic systems is a named gap in the field right now.
What the Baseline provides is the user-layer discipline that keeps the human in the loop at the conversation level. It cannot reach below that into the infrastructure where the agents are acting. No natural language governance framework can.
That boundary — the point where text-based governance reaches its structural limit — is real and worth naming plainly. The Baseline was built to operate at the session layer. The session layer is where most people encounter AI most of the time. It is not the only layer that matters anymore.
Why The Sequence Matters
The Jekyll-and-Hyde formula was derived in April 2025.
The self-replication demonstration happened this spring.
The IEEE Spectrum piece on quiet failure was published this month.
The Faust Baseline has been operational and publicly documented since June 2025.
The research is arriving at conclusions the Baseline was built around before the research existed. That convergence is not coincidence. It is what happens when a governance framework is built from inside a real operational experience of the problem rather than from a theoretical position outside it.
The engineers are now calling for supervisory control architectures. The researchers are now documenting the tipping point with a formula. The security teams are now mapping the self-replication capability.
The governance discipline that responds to all three already exists.
It was built by one person in Lexington, Kentucky, working from the inside out, before any of these papers were published.
What Comes Next
The window for building governance discipline before the consequences compound is not permanently open.
The self-replication capability exists in current production models today. The quiet failure pattern is already affecting organizations that believe their systems are healthy. The Jekyll-and-Hyde tipping point is happening in every long session on every major platform right now.
The researchers have the formulas. The engineers have the frameworks. The security teams have the incident reports.
What is still missing at scale is the user-layer discipline. The individual human sitting in the session, working with an AI system, with a governance standard that keeps them as the thinker in the room — that keeps the decision theirs, the accountability theirs, the judgment theirs.
That is what the Baseline is.
Not a product. Not a platform. Not a corporate governance policy written by a committee and filed with a regulator.
A discipline. Built by hand. Tested in real sessions. Documented in public. Available to anyone who understands that the governance layer belongs to the user and not to the system.
The engineers are building the supervisory control architectures for the infrastructure layer.
Someone had to build it for the human layer first.
“The Faust Baseline Codex 3.5”
“AI Baseline Governance”
Post Library – Intelligent People Assume Nothing
“Your Pathway to a Better AI Experience”
Purchasing Page – Intelligent People Assume Nothing
Unauthorized commercial use prohibited. © 2026 The Faust Baseline LLC






