Confidence scales. Evidence doesn't.
“Confidence is a feeling, determined mostly by the coherence of the story.” — Daniel Kahneman, Thinking, Fast and Slow
The most capable teams I’ve worked with share a failure mode: they’re too smart for their own evidence base.
Not stupid-smart. Actually smart. They see real problems, build coherent strategies, move quickly, and explain their reasoning with conviction. The work looks disciplined. The plan sounds sharp. And the whole thing can still be wrong in exactly the way that matters — because conviction got out ahead of proof and nobody noticed.
Kahneman called this the “illusion of validity” — the observation that our confidence in a judgment tracks how coherent the story feels, not whether the evidence underneath it is any good. A team that can construct a tight narrative feels right, even when the supporting data is thin. He found that even after learning their predictions were no better than chance, his Israeli Army evaluation team still felt confident rating the next batch of candidates. The feeling of knowing didn’t update with the facts.
This happens at every level. It happens in a boardroom when a product decision sails through because the people in the room are persuasive. It happens on a project board when a ticket gets marked Done because the work felt finished. Same failure, different altitude: confidence substituting for evidence, and the system making it easy.
The room full of smart people
I watched this play out on a product I was building.
The strategic logic was real. The market opportunity was defensible. The team had operating momentum. Work was moving. Then a structured validation review surfaced the gap: the validation contract was weaker than the momentum around it. Months of development, and there were still zero documented user or buyer conversations attached to the evidence base.
Nobody had been careless. The problem was that the internal reasoning was so coherent, so fluent, that it had quietly become a substitute for external signal. The team could explain why the product should work. What they couldn’t show was evidence that it would.
Philip Tetlock’s research on expert prediction puts a point on this: he found that acquiring more domain knowledge often makes people more overconfident, not more accurate. The person who knows the most builds the most convincing internal story — and becomes the least likely to question whether the story has been tested. That’s not an argument against expertise. It’s an argument against expertise operating without structure.
The board that agreed with everyone
The same pattern plays out at the task level, just less dramatically.
On a different project I inherited a board where ticket statuses had drifted from reality. Some tickets said Done but the artifact was missing. Others sat in Backlog even though the work was already shipped. The statuses had accumulated a kind of social credibility — people glanced at them, believed them, and kept building on top of whatever the label said.
A status field is a claim, not a fact. It tells you what someone said about the work. It doesn’t prove the work exists, is complete, or has been verified. But when an organization manages the labels instead of the evidence behind them, the board becomes a confidence layer — a place where optimism gets recorded and nobody stops to check.
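To make the distinction between a label and the evidence behind it concrete, here is a minimal sketch of the kind of board audit that exposed the drift. The ticket fields and the audit rules are hypothetical illustrations, not the schema of any particular tool:

```python
from dataclasses import dataclass, field

@dataclass
class Ticket:
    # Hypothetical ticket shape: the status is a claim someone made;
    # the artifact links are the evidence behind that claim.
    key: str
    status: str                                  # e.g. "Backlog", "In Progress", "Done"
    artifact_links: list[str] = field(default_factory=list)

def audit(tickets: list[Ticket]) -> list[str]:
    """Flag tickets whose label and evidence disagree."""
    findings = []
    for t in tickets:
        if t.status == "Done" and not t.artifact_links:
            findings.append(f"{t.key}: marked Done with no artifact attached")
        if t.status == "Backlog" and t.artifact_links:
            findings.append(f"{t.key}: has shipped artifacts but still sits in Backlog")
    return findings

# One ticket claims Done with nothing behind it; another has evidence
# but the label never caught up. Both are the same failure: the board
# records what people said, not what exists.
board = [
    Ticket("PROJ-12", "Done"),
    Ticket("PROJ-7", "Backlog", ["https://example.com/release-notes/v1.2"]),
]
for finding in audit(board):
    print(finding)
```

The point of the audit isn't the code; it's that the check compares the claim against an artifact, not against another claim.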
Both failures share an architecture
The pattern is the same whether you’re looking at product strategy or project boards.
Smart people generate momentum. Labels record that momentum. And neither one includes a mechanism for reality to interrupt the story. The boardroom version: persuasive reasoning substitutes for user evidence. The board version: a green status substitutes for verification. In both cases, confidence scales easily — another meeting, another status update, another sprint — but evidence doesn’t scale unless the system forces it.
Social psychologist Scott Plous called overconfidence the most prevalent and potentially catastrophic problem in human judgment. I think he was underselling it for teams specifically, because in groups the effect compounds. One confident person can be checked. A room full of confident people who agree with each other creates a narrative that feels like proof.
What evidence systems actually do
The fix isn’t humility. Humility doesn’t generate signal. The fix is structure that creates places where reality can interrupt confidence.
Gary Klein’s pre-mortem is one of the cleanest examples. Instead of asking a team “does anyone see any problems?” — which produces silence — you tell them the project has already failed and ask them to explain why. The psychology flips: people compete to surface the most worrying issue instead of suppressing doubts to preserve harmony. Klein’s research showed pre-mortems reduced overconfidence more effectively than standard critiques or pros-and-cons analysis. Kahneman himself endorsed the technique as one of the few structural interventions that actually works against planning bias.
That’s the shape of a real evidence system. Not “be humble” but “build a structure that forces uncomfortable information to surface before it’s too late.”
At the strategic level, that means making promotion criteria explicit before momentum builds. What evidence has to exist before the product advances? Where is the reasoning preserved so decisions can be inspected later instead of defended from memory? What signal from users or operations can still overturn the plan?
At the operational level, it means the status has to be downstream of evidence, not upstream of it. What artifact or verification earns the status change? What has to be true before the team is allowed to say “done”?
Both levels need the same thing: explicit gates where proof is required, not just confidence.
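Here is what "status downstream of evidence" looks like as a gate, sketched minimally. The field names and the two conditions are illustrative assumptions, not a prescription for any particular tracker; the shape is what matters: the status is computed from the evidence, never asserted directly.

```python
from dataclasses import dataclass

@dataclass
class WorkItem:
    # Illustrative fields: what evidence exists, not what anyone claims.
    key: str
    artifact_url: str | None = None      # link to the shipped thing
    verified_by: str | None = None       # who independently checked it
    status: str = "In Progress"

def promote_if_earned(item: WorkItem) -> str:
    """The gate: 'Done' is earned by evidence, not declared by the team."""
    evidence_exists = item.artifact_url is not None
    independently_verified = item.verified_by is not None
    if evidence_exists and independently_verified:
        item.status = "Done"
    return item.status

item = WorkItem("PROJ-42", artifact_url="https://example.com/build/42")
print(promote_if_earned(item))   # still "In Progress": nobody has verified it yet
item.verified_by = "reviewer@example.com"
print(promote_if_earned(item))   # now "Done": the status follows the evidence
```

The same shape works at the strategic level; the gate conditions just change from "artifact plus verification" to "documented user evidence plus recorded reasoning."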
The question that earns its keep
One diagnostic works at every altitude.
What in this system can challenge a smart team when it’s moving too confidently?
If the answer is “another smart person” — that’s just more confidence. If the answer is “a meeting where someone pushes back” — that’s confidence with theater. If the answer is “explicit evidence gates, documented reasoning, and signal from the edges of the system” — that’s a decision system.
The first two feel productive. The third one actually is.