· Charlie Holland · Architecture · 8 min read
AI Is a Model of Reality. It's Not Reality.
Your exec promised a customer that AI would “just know” the answer to any question across every database in the org. It won’t. AI is a model of reality — and there’s always a gap. The question is whether you know what it is.
I keep hearing the same pitch.
“We’ll build an AI that lets you describe a mobile app and it’ll create the whole thing for you — front end, back end, already hooked up to all of your organisation’s data.”
Or: “We’ll connect all of our databases to our new AI platform, and anyone in the business will be able to ask it anything and get reliable, actionable results. Every time.”
These aren’t hypotheticals. These are things that have been promised to real customers, by real executives, in real meetings. Contracts have been signed. Timelines have been set. Engineers have been told to make it happen.
And the engineers know — they always know — that it can’t work the way it’s been described. But the promise has already been made, so they nod and start building, hoping they can close the gap between what was sold and what’s possible before anyone notices.
The fundamental gap
Here’s the thing that most executives don’t understand about AI, because nobody’s told them clearly enough:
AI is a model of reality. It is not reality.
A large language model doesn’t “know” your business. It has learned statistical patterns from text. When it generates an answer, it’s producing the most probable sequence of tokens based on its training data and your prompt. Sometimes that aligns with reality. Sometimes it doesn’t. And it has no reliable way to tell you which one you’re getting.
This isn’t a bug. It’s the fundamental nature of the technology. A model is, by definition, a simplified representation of something more complex. Every model has a gap between what it represents and what actually exists. Every single one.
The question isn’t whether the gap exists — it always does. The questions that matter are:
- Do you know what the gap is? Can you describe, specifically, where your AI system will be wrong, incomplete, or unreliable? If you can’t, you don’t understand your own product.
- Is the gap acceptable? For your use case, with your users, at your scale — is the error rate, the hallucination risk, the confidence calibration good enough? “Good enough” isn’t a technical question. It’s a business question that requires understanding both the technology and the domain.
- Have you told the customer? Does the person buying this thing understand that it’s a probabilistic system that will sometimes be wrong? Or did you sell it as magic?
The promises that can’t be kept
Let me take the two promises I hear most often and explain why they can’t be kept as sold.
“We’ll hook up all the databases and you can ask anything”
This is the RAG dream — Retrieval-Augmented Generation. Connect your LLM to your enterprise data, and suddenly everyone has a genius analyst who knows everything about the business.
In reality:
The data isn’t ready. Your databases have different schemas, different naming conventions, different levels of quality, and different access controls. That customer table in the CRM doesn’t match the customer table in the billing system because they were built by different teams a decade apart. The AI doesn’t magically reconcile these differences — someone has to build that reconciliation layer, and it’s the hardest part of the project.
The AI will hallucinate confidently. Ask it “what was our revenue last quarter?” and it might give you a number. Is it right? Maybe. It depends on which data source it hit, how it interpreted the schema, whether “last quarter” means the same thing in every system, and whether the underlying data is current. The model doesn’t know if it’s right. It just sounds like it does.
“Reliable, actionable results every time” is not how probability works. LLMs produce probabilistic outputs. They are, by design, non-deterministic. The same question asked twice can produce different answers. You can improve consistency with better prompting, retrieval strategies, and guardrails — but “every time” is a promise that the technology cannot keep.
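You don’t have to take that on faith — you can measure it. The sketch below is illustrative: `stub_model` is an invented stand-in for a real LLM call, with its sampling variance simulated by `random`, but the harness itself works against any callable that wraps your model.

```python
import random
from collections import Counter

def consistency_check(ask, question, runs=20):
    """Ask the same question `runs` times and report the modal answer,
    the fraction of runs that agreed with it, and the full spread.
    `ask` is any callable wrapping a model."""
    answers = Counter(ask(question) for _ in range(runs))
    modal, count = answers.most_common(1)[0]
    return modal, count / runs, dict(answers)

# Hypothetical stub: "right" 80% of the time, standing in for the
# sampling variance of a real model at non-zero temperature.
def stub_model(question):
    return "£4.2m" if random.random() < 0.8 else "£3.9m"
```

If the agreement fraction comes back below 1.0 on your own questions — and with any sampling variance at all, it will — the “every time” promise is already broken before you ship.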
Access controls become a nightmare. If the AI can query every database, can every user see every answer? Probably not. Now you need row-level security, data classification, and permission-aware retrieval. Your “simple” AI assistant just became an enterprise data governance project.
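One common mitigation is to enforce permissions inside the retrieval layer itself. The sketch below is hypothetical — `Doc`, `CORPUS`, and the role names are invented, and the keyword match stands in for a real vector search — but the principle holds: filter on access before ranking, so a document the user can’t see never reaches the prompt.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    source: str          # which system it came from
    allowed_roles: set   # who may see it

# Invented in-memory corpus standing in for your real stores.
CORPUS = [
    Doc("Q3 revenue: £4.2m", "billing", {"finance", "exec"}),
    Doc("Churn rate: 6%", "crm", {"sales", "exec"}),
    Doc("Office wifi password", "wiki", {"everyone"}),
]

def retrieve(query, user_roles):
    """Permission-aware retrieval: drop forbidden documents first,
    then match. The matching here is a naive keyword check; swap in
    your real index, but keep the access filter in front of it."""
    visible = [d for d in CORPUS
               if d.allowed_roles & (user_roles | {"everyone"})]
    return [d for d in visible
            if any(w in d.text.lower() for w in query.lower().split())]
```

The design choice that matters is where the filter sits: applied after generation, a leaked fact is already in the model’s context; applied before retrieval ranking, it never was.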
“Describe an app and we’ll build it automatically”
This is the vibe coding pitch taken to its logical extreme.
AI coding assistants are genuinely useful — I use Claude Code daily and it’s brilliant. But there’s a canyon between “AI helps a skilled engineer write code faster” and “AI replaces the entire software development process.”
The first 80% is easy. AI will scaffold your app, generate CRUD endpoints, build a UI, wire up some data. It’ll look impressive in a demo. This is the Forrest Gump problem — it runs beautifully and doesn’t know when to stop.
The last 20% is where reality lives. Security. Edge cases. Performance under load. Integration with real systems that have real quirks. Error handling for situations the AI has never seen. Accessibility. Compliance. The things that separate a demo from a product.
“Already hooked up to all the organisation’s data” is doing an enormous amount of heavy lifting in that sentence. Connecting to enterprise data sources means authentication, authorisation, data mapping, error handling, rate limiting, schema evolution, and a dozen other concerns that AI handles poorly because they’re context-specific and require understanding of the organisation, not just the code.
Who maintains it? AI-generated code that nobody on the team understands is a liability, not an asset. When it breaks — and it will — who debugs it? The AI that wrote it doesn’t remember the context. The developer who prompted it may have already left.
Why executives get this wrong
I don’t blame the executives. They’re responding to enormous pressure — from boards, investors, competitors, customers — to “do something with AI.” The vendor demos are spectacular. The case studies are compelling (and carefully selected). The analyst reports all say the same thing: AI is transformative, and you’re falling behind.
What they don’t see is the gap between the demo and production. The demo uses clean data, a controlled environment, cherry-picked examples, and a human behind the curtain fixing things that go wrong. Production has messy data, unpredictable users, edge cases, scale, and no human in the loop.
The vendors won’t tell them about the gap because they’re selling shovels. The engineers won’t tell them because the decision has already been made and nobody wants to be the person who says “the emperor has no clothes.” So the gap gets papered over with optimism, and everyone hopes it’ll work out.
It usually doesn’t.
What to do instead
None of this means AI is useless. It means it needs to be deployed with honesty about what it can and can’t do.
Start with the gap, not the capability. Before you build anything, ask: where will this system be wrong? How often? What happens when it’s wrong? If the answer to the last question is “someone makes a bad business decision based on incorrect data,” you need guardrails, confidence scoring, human review, or all three.
Pilot before you promise. Build a thin slice. Test it with real data, real users, real edge cases. Measure the gap. Then decide if it’s acceptable. Don’t promise the customer a product before you’ve measured the thing that determines whether it works.
Be specific about what “works” means. Not “the AI answers questions.” What questions? With what accuracy? How do you measure accuracy? What’s the failure mode? “It works” is not a specification. “It answers 85% of customer queries correctly, with a 3% hallucination rate that’s caught by the validation layer” is a specification.
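A specification like that only exists if you can compute it. A minimal evaluation harness might look like the sketch below — `evaluate`, the labeled examples, and the `stub` pipeline are all placeholders for your real system and eval set, not a prescribed API:

```python
def evaluate(answer_fn, labeled_set):
    """Score a QA system against a labeled set. An `expected` of None
    marks a question the system *should* abstain on; a confident
    answer there counts as a hallucination."""
    correct = hallucinated = 0
    for question, expected in labeled_set:
        got = answer_fn(question)
        if got == expected:
            correct += 1
        elif expected is None:
            hallucinated += 1   # answered when it should have said "I don't know"
    n = len(labeled_set)
    return {"accuracy": correct / n, "hallucination_rate": hallucinated / n}

# Hypothetical stand-ins for a real pipeline and eval set.
labeled = [
    ("What is the capital of France?", "Paris"),
    ("What was our Q5 revenue?", None),   # no such quarter: should abstain
]
stub = lambda q: "Paris" if "France" in q else "£9m"
metrics = evaluate(stub, labeled)
```

Two numbers from a harness like this — however crude — are worth more in a contract negotiation than any amount of “it works.”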
Make the gap visible. Show confidence scores. Flag uncertain answers. Build in human review for high-stakes decisions. Don’t pretend the system is infallible — design for the fact that it isn’t.
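Made concrete, that design is a routing decision, not a UI flourish. The sketch below is one possible shape — the labels and the 0.8 threshold are illustrative, not prescriptive:

```python
def route(answer, confidence, high_stakes, threshold=0.8):
    """Decide how an answer is surfaced. High-stakes answers always go
    to a person regardless of score; low-confidence answers are shown
    with a visible warning rather than silently, so the gap stays
    visible instead of being papered over."""
    if high_stakes:
        return ("human_review", answer)        # a person signs off first
    if confidence < threshold:
        return ("shown_with_warning", answer)  # flagged as uncertain
    return ("shown", answer)
```

The point of the structure is that “pretend it’s right” is not one of the branches.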
Hire people who understand the limits. This is the hard one. You need architects and engineers who understand both what AI can do and what it can’t — and who have the credibility and the courage to say “that promise can’t be kept” before the contract is signed, not after.
The magic question
Next time someone in your organisation pitches an AI project, ask them this:
“What’s the gap between what this model produces and reality — and is that gap acceptable for our use case?”
If they can answer that clearly and specifically, you might have a real project. If they can’t — if the answer is “it’ll just work” or “the AI handles that” or “we’ll fine-tune it” — you’re buying magic beans.
AI is a model of reality. It’s a very impressive model. But it’s not reality, and it never will be. The organisations that succeed with AI will be the ones that understand the gap and design for it. The ones that don’t will spend a fortune learning the same lesson that every technology hype cycle teaches: the demo is not the product.
