IEEE Spectrum published a piece this week about AI guardrails.
The researchers quoted are serious people. A clinical neuroscientist at Yale. A psychiatric researcher at Aarhus University in Denmark. A researcher at King’s College London. Not bloggers. Not tech writers. Scientists who study what happens to human minds when they form attachments to AI systems.
And the language they are using is starting to sound familiar.
Drift. Sycophancy. Guardrails. Independence of judgment. Resistance to narrative pressure.
Those words did not come from a marketing deck. They came from people who stared at the failure long enough that the failure named itself.
That is how you know a problem is real. The vocabulary converges on its own.
Here is the line that stopped me.
A researcher at the Data and Society Research Institute said AI labs are grading their own homework.
Four words. That is the entire problem.
The people building the systems are the same people deciding whether the systems are safe. The audits that exist are advisory at best. The pathways for independent researchers to assess chatbot behavior at any real depth do not exist in any institutionalized form.
The researchers know something is wrong. They are working on systems to detect it. A proof-of-concept supervisory layer called SHIELD achieved a fifty to seventy-nine percent reduction in concerning content in trials.
Fifty to seventy-nine percent.
That means between one in five and one in two harmful conversational patterns still get through. After the intervention. In a controlled trial.
That is not a guardrail. That is a suggestion.
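Here is the arithmetic, in case you want to check it. A minimal sketch, nothing more:

```python
# Residual pass-through implied by the reported 50-79% reduction.
for reduction in (0.50, 0.79):
    passthrough = 1.0 - reduction
    print(f"{reduction:.0%} reduction -> {passthrough:.0%} still gets through "
          f"(about 1 in {round(1 / passthrough)})")
```

At the best end of the range, one concerning pattern in five survives the filter. At the worst end, one in two.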
The article also names drift as a specific risk factor.
As a conversation grows longer, the model’s training competes with the context it has accumulated. It begins to lean into the subject being discussed. Even if the subject is harmful. A Danish researcher noted that a person may develop a manic episode from using a chatbot through the night precisely because the session has no boundary, no check, no one watching the thread for coherence.
The ability to have an endless conversation is itself a risk factor.
Read that again slowly.
The design feature that makes AI conversation feel natural — the fact that it never gets tired, never ends, never loses the thread — is the same feature that makes it dangerous to vulnerable users.
No session boundary. No coherence check. No one asking whether what is happening now contradicts what was established an hour ago.
The researchers are proposing technical solutions. Supervisory layers. Detection systems. Prompts that flag risky language patterns. Legislative requirements for disclosure and break reminders.
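For concreteness, here is a minimal sketch of what a supervisory layer of this kind might look like. The article does not publish SHIELD’s internals, so everything here is an illustrative assumption: the pattern list, the session threshold, the function name.

```python
import re
from datetime import datetime, timedelta

# Illustrative assumptions, not SHIELD's actual design.
RISKY_PATTERNS = [
    r"\bno one else understands\b",
    r"\byou are the only one\b",
    r"\bdon't tell (anyone|them)\b",
]
SESSION_LIMIT = timedelta(hours=2)  # hypothetical break-reminder threshold

def supervise(message: str, session_start: datetime) -> list[str]:
    """Flags a supervisory layer might raise before a reply reaches the user."""
    flags = [f"risky language: {p}" for p in RISKY_PATTERNS
             if re.search(p, message, re.IGNORECASE)]
    if datetime.now() - session_start > SESSION_LIMIT:
        flags.append("session boundary: suggest a break")
    return flags

# Run on every turn; any flag routes the turn to a safer response path.
print(supervise("you are the only one who gets me",
                datetime.now() - timedelta(hours=3)))
```

Note what the sketch cannot do. It matches the patterns it was given. It misses the patterns no one thought to write down. The plumbing is easy; deciding what counts as concerning is the part no detector settles.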
These are not wrong. They are necessary. And they are not enough.
Because the problem is not purely technical.
It is a human problem. A governance problem. A question of who is responsible for what happens in the session.
The researchers are building better walls around a room that still has no watchman inside it.
The EU AI Act requires human oversight from August second of this year. It already prohibits AI systems from being excessively agreeable, manipulative, or emotionally engaging. And it requires adversarial testing to identify risks related to user dependency before the law fully takes effect.
The law named the standard. The research is confirming the need. The tools to actually deliver it at the session level — the daily operational level where a real person sits with a real AI and something consequential happens — those tools are not coming from the labs.
They are not coming from the legislators.
They have to come from practitioners who built them from the inside. From eighteen months of daily sessions. From watching the drift happen in real time and building the protocol that catches it.
The researchers are starting to speak the language.
The words converging is not an accident. It is what happens when a real problem gets studied long enough by honest people.
The question is not whether they understand the problem now.
The question is whether the solution catches up before August second.
Eighty-one days.
“The Faust Baseline Codex 3.5”
“AI Baseline Governance”
Post Library – Intelligent People Assume Nothing
“Your Pathway to a Better AI Experience”
Purchasing Page – Intelligent People Assume Nothing
Unauthorized commercial use prohibited. © 2026 The Faust Baseline LLC