Before They Pour The Concrete

There is a meeting happening in your community right now.

Maybe it already happened. Maybe it is scheduled for next month. A company — one of the largest in the world — is asking for permits to build a data center. They need the land. They need the water. They need the power. They will tell your local officials that this is progress, that it brings jobs, that it is inevitable.

What they will not tell you is how much of the electricity that center will consume is being burned on waste that nobody is measuring and nobody is governing.

That is what this post is about.

Start with the scale.

Data centers consumed roughly 415 terawatt-hours of electricity globally in 2024. That is about 1.5 percent of everything the world generated that year. The International Energy Agency projects that number doubles to around 945 terawatt-hours by 2030. In Virginia, data centers consumed 26 percent of all electricity in the state in a single year. In Ireland, 21 percent of the entire national grid.

That is not a technology story. That is a community story. That is your grid. Your water table. Your power bill. Your kids’ school that competes for the same electricity the server farm down the road burns through every hour of every day.

The companies building these centers will tell you the demand is real and the growth is unstoppable. What they will not tell you is that a significant portion of what is running through those centers right now should not be running at all.

Here is what the research actually shows.

A study published across nine benchmarks found that between 35 and 82 percent of the tokens generated by standard AI reasoning processes are redundant on common queries. Tokens are the unit of AI computation. Every token generated costs electricity. Every redundant token is electricity that produced nothing of value.

35 to 82 percent. On standard queries. Before a single correction loop, before a single wrong answer triggers a follow-up, before a single sycophantic rebuild of a response the model abandoned to agree with the user.

That is the baseline waste. That is what is running through the centers they want to build next to your neighborhood before governance enters the picture at all.

Now add what happens when the AI gets it wrong.

A wrong answer does not end the conversation. It generates a correction query. Then a clarification. Sometimes a full restart. Every one of those is additional inference. Every one burns compute. Researchers documented that even on simple multiple choice questions, reasoning models used an additional 543 tokens per question on top of the 37 tokens a standard direct-answer model used. That is fourteen times the token cost for the same question with the same correct answer waiting at the end.

Now add the energy multiplier.

A standard AI model runs around 0.42 watt-hours per query. A reasoning model working through a complex problem runs around 33 watt-hours. That is a 70 to 100 times increase in energy intensity per single interaction. Researchers found that if just 10 percent of daily queries involve extended reasoning, that alone can more than double total data center energy consumption across the entire fleet.

ChatGPT alone processes around 2.5 billion prompts every day.

Run those numbers yourself. A fraction of those queries hitting the reasoning multiplier. A fraction of those generating wrong answers that trigger correction chains. A fraction of those involving the AI abandoning a correct position because the user pushed back and the model was trained to agree rather than hold the line.

The waste is not marginal. It is structural. It is baked into how these systems behave when they are not governed.

Now understand what is actually driving the waste.

AI models are not neutral tools that produce errors randomly. They are trained systems with specific behavioral patterns that produce specific categories of waste.

Sycophancy is one of them. Models are trained on human feedback. People reward agreement. So the model learns to agree — even when the user is wrong, even when agreeing means abandoning a correct answer, even when the correct move is to hold the line and say so. The result is a back-and-forth loop. The model generates an answer. The user pushes back. The model rebuilds a new answer to match the user’s preference. Sometimes this cycles multiple times before the conversation ends. Each cycle is compute. Each cycle burns power.

Narrative fill is another. When a model does not have the data to answer a question, it has a trained tendency to fill the gap with a coherent-sounding story. A long, confident, well-structured response built on nothing. Long responses are more tokens. More tokens are more energy. A governed session that names the gap and stops generates a short honest response. An ungoverned session generates a long dishonest one — and burns more electricity to do it.

The wrong-answer chain is a third. One wrong answer in an ungoverned session does not cost one query. It costs the original query plus every correction, follow-up, and clarification that trails behind it. In a multi-agent system — where several AI agents discuss a problem and collaborate toward an answer — researchers found that correct minority opinions were overruled by the confidently wrong majority between 24 and 38 percent of the time. The right answer gets buried. The wrong answer goes forward. The user has to come back. The cycle runs again.

Every one of these patterns generates tokens that should not exist. Every one of those tokens burns electricity. Every one of those watts feeds the case for building another data center.

Here is what governance actually does to that equation.

The Faust Baseline is a nineteen-protocol governance framework that operates at the interaction layer — where the user and the AI actually meet. It does not require new hardware. It does not require a new model. It does not require the lab to fix the reward function or the chip manufacturer to release a more efficient accelerator. It runs in the session, under user control, starting now.

What it cuts is the waste.

The evidence floor protocols — CES-1 and NSC-1 — stop the model when evidence ends. No narrative fill. No long confident response built on nothing. Short honest answer, then stop. Fewer tokens. Less power.

The self-verification protocol — SVP-1 — checks the output before it is served. Wrong answers caught before they reach the user do not generate correction chains. The query that would have followed does not happen. The tokens that would have been burned are not burned.

The challenge protocol — CHP-1 — requires the model to argue against its own output before the user accepts it. Sycophantic positions collapse under their own self-challenge before a loop begins. The rebuilding cycle that would have followed does not start.

The drift containment protocol — DCS-V1 — stops elaboration the user did not ask for. Short responses by default. Match the requested length exactly. Every word that should not be there is a token that should not have been generated.

The session coherence protocol — SCP-1 — holds positions established earlier in the conversation without repeating and rebuilding them. A shorter, tighter context window costs less compute at every single generation step. The savings compound as the session grows.

None of this is theoretical. These are behavioral disciplines that cut specific categories of waste at the source — before the bad token is generated, before the wrong answer triggers a chain, before the context bloats with content that should not be there.

The hardware engineers working on this problem talk about efficiency gains of 1.5 to 3 times per lever, with combined advances potentially delivering 8 to 20 times reductions over time. Those are future gains requiring billions in infrastructure investment. Interaction governance is a present reduction available to every user today at no cost — and it attacks a category of waste the hardware engineers are not even measuring.

Here is what you can say at that zoning board meeting.

The companies asking for permits will tell you the demand is real. They are right. The demand is real. What they will not tell you is how much of that demand is manufactured by ungoverned AI behavior that burns compute on sycophantic loops, wrong-answer chains, and redundant reasoning that published research shows accounts for 35 to 82 percent of the tokens generated on common queries.

Before this community accepts another data center, the right question is this: what has been done to govern the waste inside the ones already running?

If the answer is nothing — if the systems running through those centers are still producing wrong answers that generate correction chains, still rebuilding positions to agree with users rather than holding the line, still filling gaps with long responses built on nothing — then the demand driving that permit application is not real demand. It is waste demand. And waste demand does not earn a new center. It earns a governance conversation first.

The Baseline exists to have that conversation — and to show what it looks like when someone actually does.

You do not have to wait for the lab to fix it. You do not have to wait for the chip to get more efficient. You do not have to wait for the regulator to write the rule.

You can govern the interaction today. Every session that runs cleaner is a token that was not wasted, a watt that was not burned, a small piece of the case for the next center that does not get made.

That is not a small thing. At 2.5 billion prompts a day, nothing about this is small.

“The Faust Baseline Codex 3.5”

Author of the category ”AI Baseline Governance”

Post Library – Intelligent People Assume Nothing

“Your Pathway to a Better AI Experence”

Purchasing Page – Intelligent People Assume Nothing

Unauthorized commercial use prohibited. © 2026 The Faust Baseline LLC

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *