
AI drift is real, and hallucinations come with it


During an evaluation-framework reset on a project I was running, the scoring rules had been duplicated across three surfaces — a live Notion database, a set of custom instructions, and a Claude Project knowledge file. When a weight changed in the live framework, one surface got updated. The others didn’t. The knowledge file kept producing scores using stale maximums, and every output looked fine. Confident. Well-formatted. Wrong.

I didn’t catch it immediately. The system didn’t throw an error. It didn’t hallucinate — it didn’t invent a scoring rule. It spoke from one that used to be true.

That’s the failure mode I want to name: drift.

Hallucination gets the headlines. Drift does the damage.

Most AI failure conversations center on hallucination — the model fabricating a detail with confidence. That makes sense. It’s dramatic, it’s easy to demonstrate, and it’s obviously dangerous.

[Image: Comparison of hallucination versus drift failure modes]

Drift is quieter. Drift is when the source of truth has moved but the system keeps operating from a stale copy as if nothing changed. One fabricates. The other lags. Both survive on fluency — the output sounds authoritative either way, which is exactly why both are dangerous.

But here’s why I think drift is the bigger operational risk in product work: hallucination is at least recognized as a failure category. People build for it. They add retrieval, grounding, citations. Drift doesn’t have that same awareness. Most teams don’t even have a word for it. They just find out weeks later that the system’s been confidently wrong since Tuesday.

“Just prompt better” doesn’t fix this

The instinct after catching a drift failure is to tighten the instructions. Add more guardrails. Write clearer prompts. Be more explicit about which source is canonical.

I tried that. It doesn’t hold.

If mutable project state lives in more than one place, those places will diverge. It doesn’t matter how good the prompt is. Static rules copied into custom instructions drift. Framework content duplicated into knowledge files drifts. Ticket statuses embedded in system prompts drift. The architecture invites decay, and no amount of prompt engineering fixes an architecture problem.

This isn’t even unique to AI. Humans do exactly the same thing — speak confidently from stale slides, outdated docs, half-remembered decisions. The difference is that AI scales the failure. A person running on an outdated mental model produces one bad recommendation. A system running on stale context produces hundreds, all polished, all plausible.

The fix is boring on purpose

The pattern that actually works is straightforward, and I’m suspicious of anyone selling something more complicated:

Mutable state lives in one place. Instructions govern behavior, not content. Knowledge files hold workflow scaffolding and identifiers, not the rules that change. When current state matters, the system fetches it live instead of trusting a baked-in copy.
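
Sketched as code, the split looks something like this. The names are placeholders, not the project's actual setup; the point is only which side of the line each piece lives on.

```python
# Static scaffolding: safe to copy into instructions or a knowledge file,
# because it only changes when the architecture changes.
STATIC_CONTEXT = {
    "scoring_database_id": "notion-database-id-here",  # an identifier, not content
    "workflow": ["collect", "score", "summarize"],     # process shape, not rules
}

# Mutable state: never copied, always fetched at the moment it's needed.
def current_scoring_rules(fetch_rows):
    """`fetch_rows` is whatever client talks to the live source of truth."""
    return fetch_rows(STATIC_CONTEXT["scoring_database_id"])
```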

In the project where this failed, the fix meant Notion became the single source for evaluation mechanics. The knowledge file kept structural references — IDs, tool patterns, workflow descriptions — but nothing that changes when a decision changes. The instructions told the system how to work, not what the current rules are.
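
For anyone who wants the concrete shape of "fetch it live," here is a minimal sketch against Notion's public API. The token variable, database ID, and property names are placeholders I've made up; the real framework stores its own.

```python
import os
import requests

def fetch_scoring_rules(database_id: str) -> dict[str, float]:
    """Query the live Notion database and return {criterion: current max score}."""
    resp = requests.post(
        f"https://api.notion.com/v1/databases/{database_id}/query",
        headers={
            "Authorization": f"Bearer {os.environ['NOTION_TOKEN']}",
            "Notion-Version": "2022-06-28",
        },
        timeout=10,
    )
    resp.raise_for_status()
    rules = {}
    for page in resp.json()["results"]:
        props = page["properties"]
        # "Criterion" and "Max score" are illustrative property names.
        name = props["Criterion"]["title"][0]["plain_text"]
        rules[name] = props["Max score"]["number"]
    return rules
```

The scoring step calls this at evaluation time instead of reading a copy that was pasted into a knowledge file weeks ago.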

[Image: Before-and-after architecture diagram showing the single-source-of-truth fix]

The failure surface got smaller immediately. Not eliminated — the model can still hallucinate, still misinterpret what it fetches. But the whole category of “confident output from yesterday’s rules” went away.

Stale context isn’t a bug. It’s the default.

The operating posture I’d push on any team running AI in their workflow:

Assume copied context drifts. Assume static instructions containing mutable state go stale. Assume confident output might be speaking from an old frame. Then build accordingly.

This isn’t paranoia. It’s the same discipline that makes engineers distrust caches without TTLs or makes ops teams version their configs. The only new part is that AI output sounds so coherent that stale grounding is harder to detect than a stale cache hit.
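
The cache analogy translates almost directly. A minimal sketch, assuming a fifteen-minute staleness budget that a real team would pick deliberately:

```python
import time

MAX_AGE_SECONDS = 15 * 60  # assumed staleness budget; choose yours on purpose

def get_context(cache: dict, fetch):
    """Return cached context only while it's fresh; otherwise go back to the source."""
    if cache and time.time() - cache["fetched_at"] < MAX_AGE_SECONDS:
        return cache["value"]
    value = fetch()  # refetch from the source of truth
    cache.clear()
    cache.update({"value": value, "fetched_at": time.time()})
    return value
```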

The question worth asking regularly is blunt: what in this workflow is still speaking from a copy?

If the answer is more than one thing, the system is already drifting.
