AI Capabilities & Limitations
Understand hallucination as a structural property of LLMs and learn the three defences PMs must design in before shipping.
The fake court case that cost real money
A customer used your contract review tool to check a data-privacy regulation. The AI returned a confident two-paragraph analysis citing "Chen v. DataCorp (2024), §14(b)." The customer's legal team built a contract clause around it. Three days later, someone checked — the case doesn't exist. The AI invented it.
This isn't a bug. It's not a glitch. It's how LLMs work. The model predicted that the next most likely tokens after "as established in" would be a case citation — so it generated one that looks real but isn't. And it did it with zero hesitation and zero uncertainty.
Hallucination isn't going away. It's a structural property of text prediction. You can reduce it dramatically, but you can't eliminate it. As the PM, your job is to design the defences before launch — not after the first incident report.
Why LLMs hallucinate: three root causes
Before reading this section: You know hallucination is a problem. But why does it happen? Take 30 seconds and write down your best guess for one root cause — then see if it matches what follows.
Every hallucination traces back to one of three causes. Each cause has a specific product-level defence:
Cause 1 → Defence 1: RAG (Retrieval-Augmented Generation)
The problem: The model's training data has a cutoff date. Anything after that date — new laws, updated pricing, recent events — it doesn't know. But it won't say "I don't know." It'll generate something plausible.
The defence: Give the model real, verified documents to reference. Before it generates an answer, retrieve the relevant pages from your own database and inject them into the prompt.
Think of it like: A closed-book exam (without RAG) vs. an open-book exam (with RAG). Same student, dramatically different accuracy on factual questions.
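In code, the open-book pattern is just "retrieve, then inject." A minimal sketch, where the document store, the ID format, and the `build_rag_prompt` helper are illustrative assumptions rather than any specific library's API:

```python
def build_rag_prompt(question: str, documents: list) -> str:
    """Inject retrieved, verified documents into the prompt so the
    model answers 'open book' instead of from its training memory."""
    context = "\n\n".join(f"[{d['id']}] {d['text']}" for d in documents)
    return (
        "Answer using ONLY the documents below. If they do not "
        "contain the answer, reply 'I don't know.'\n\n"
        f"Documents:\n{context}\n\nQuestion: {question}"
    )

# In a real system, `documents` comes from a search over your own
# verified database; here it's hard-coded for illustration.
docs = [{"id": "gdpr-art-17", "text": "Article 17 grants a right to erasure."}]
prompt = build_rag_prompt("Can users demand data deletion?", docs)
```

The retrieval step (vector search, keyword search, or both) is where most of the engineering effort goes; the prompt assembly itself is this simple.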
Cause 2 → Defence 2: Require citations
The problem: The model can write confident, authoritative text that sounds like it's backed by sources — but isn't.
The defence: Require the model to cite its sources for every factual claim. "If you can't point to a specific document in the context, you can't make the claim." This turns invisible hallucinations into visible ones — a missing citation is a red flag you can catch.
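The "no source, no claim" rule can be enforced mechanically. A rough sketch, assuming citations appear as bracketed document IDs like `[gdpr-art-17]` (both the format and the helper name are assumptions for illustration):

```python
import re

def find_citation_problems(answer: str, valid_ids: set):
    """Return (uncited sentences, citations pointing at documents
    that were never provided). Either list being non-empty is a
    red flag worth blocking or sending to review."""
    uncited, fabricated = [], []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        cited = re.findall(r"\[([\w-]+)\]", sentence)
        if not cited:
            uncited.append(sentence)
        fabricated += [c for c in cited if c not in valid_ids]
    return uncited, fabricated

answer = "Erasure rights apply [gdpr-art-17]. Fines reach 4% of turnover."
uncited, fabricated = find_citation_problems(answer, {"gdpr-art-17"})
# The second sentence carries no citation, so it gets flagged.
```

A check like this can't verify that a citation actually supports the claim, but it makes the invisible failure (a missing or fabricated source) cheap to detect.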
Cause 3 → Defence 3: Human review gate
The problem: When the model processes very long contexts, it is worst at using information buried in the middle: the "lost in the middle" problem (Liu et al., "Lost in the Middle: How Language Models Use Long Contexts," 2023), whose severity varies by model and task. Critical facts can be right there in the prompt, and the model still misses them.
The defence: For high-stakes outputs (legal, financial, medical), require a human expert to review before the output reaches the customer. This catches errors that no automated system can — nuance, context, and "does this actually make sense?"
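A review gate is mostly routing logic. One way to sketch it, where the domain list and the confidence threshold are illustrative policy choices, not fixed values:

```python
HIGH_STAKES_DOMAINS = {"legal", "financial", "medical"}

def route_output(draft: str, domain: str, model_confidence: float) -> dict:
    """Send high-stakes or low-confidence drafts to an expert review
    queue; everything else goes straight to the customer."""
    if domain in HIGH_STAKES_DOMAINS or model_confidence < 0.8:
        return {"status": "pending_expert_review", "draft": draft}
    return {"status": "delivered", "draft": draft}
```

The code is trivial on purpose: the real product decisions are the gate's inputs (which domains count as high-stakes, what signal stands in for confidence, and who staffs the queue).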
There Are No Dumb Questions
"Will hallucination be fixed in future models?"
Reduced, yes. Eliminated, no — at least not with current architectures. Every model that predicts text can predict text that's false. Design your products to handle this reality, not wait for it to go away.
"How do I explain hallucination to my CEO?"
"The AI doesn't look things up — it predicts what answer sounds right. Usually it's correct, but sometimes it invents facts that sound completely real. We build safety nets so those mistakes never reach our customers."
Real example: Priya's contract review tool
Priya, a PM at a legal-tech startup, shipped a contract review tool in Q1 — without RAG. Here's what happened:
Weeks 1-6: Everything looked great
94% user satisfaction. The model handled established case law fluently. The team celebrated.
Week 8: The incident
A customer asked about a data-privacy regulation passed after the model's training cutoff. The model returned a confident analysis citing a case that doesn't exist. The customer's legal team missed it, drafted a clause around the fake ruling, and the error was caught three days later — after the contract reached the counterparty.
The customer's general counsel said: "I trusted it because it cited a case number. That's worse than if it had said 'I don't know.'"
The damage
Satisfaction score dropped from 94% to 71% in two weeks. One incident.
The fix
RAG backed by a verified legal database, updated weekly. The model now must cite a real document for every factual claim — if it can't cite, it can't state. Satisfaction recovered to 91% within six weeks.
The eval that would have caught it
Citation accuracy on post-cutoff laws. That eval didn't exist at launch. It does now.
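That eval can be as small as a labelled set of questions plus a scorer. A sketch, reusing a bracketed-ID citation format (everything here, including the IDs, is illustrative):

```python
import re

def citation_accuracy(answers: list, real_ids: set) -> float:
    """Share of answers that cite at least one source and whose
    every citation exists in the verified database."""
    def citations(a):
        return re.findall(r"\[([\w-]+)\]", a)
    good = sum(
        1 for a in answers
        if citations(a) and all(c in real_ids for c in citations(a))
    )
    return good / len(answers)

answers = [
    "See [chen-v-datacorp-2024].",  # fabricated case
    "See [gdpr-art-17].",           # real document
]
score = citation_accuracy(answers, {"gdpr-art-17"})  # 0.5
```

Run against a held-out set of post-cutoff questions, a score like this would have flagged the gap before launch rather than after the incident.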
The reliability spectrum: not all tasks are created equal
Here's a mental model for quickly assessing whether an LLM will be reliable for a given task:
| Task type | What the model does | Hallucination risk | Example |
|---|---|---|---|
| Transformation | Reshapes text it's given | Low | Summarise, rewrite, translate, classify |
| Pattern-matching | Recognises patterns in given data | Low-Medium | Categorise support tickets, extract names from contracts |
| Knowledge recall | States facts about the world | High | Legal citations, medical facts, pricing information |
| Reasoning | Draws conclusions from multiple facts | Medium-High | "Does this contract clause conflict with GDPR?" |
The PM rule: Start by shipping transformation tasks (low risk). Use those wins to build trust. Then move to pattern-matching. Save knowledge recall for when you have RAG and evals in place.
One more structural risk deserves the same treatment: prompt injection, where untrusted text in the model's input hijacks its instructions. As a PM, treat it as a design constraint, not an engineering afterthought: validate inputs, limit what the model can do autonomously, and never assume the system prompt is tamper-proof.
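Input validation for injection can start as simply as pattern screening. A rough first-pass sketch (the pattern list is illustrative and deliberately incomplete; no blocklist catches everything, which is why limiting the model's permissions matters more):

```python
import re

# Illustrative, deliberately incomplete blocklist of injection phrasing.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are now",
    r"reveal .*system prompt",
]

def screen_untrusted_text(text: str) -> list:
    """Return the patterns matched in untrusted input, as a
    first-pass signal to log, block, or escalate."""
    lowered = text.lower()
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, lowered)]
```

Treat a match as a signal to log and escalate, not as proof of attack, and pair the screen with narrow model permissions.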
Key takeaways
- Hallucination isn't a bug — it's structural. You can reduce it dramatically with RAG and citations, but you can't eliminate it. Design your products accordingly.
- Three defences: RAG (give it real docs), citation requirements (no source = no claim), and human review gates (for high-stakes outputs).
- Transformation tasks are low risk. Summarise, rewrite, classify — the source text is right there. Knowledge recall is high risk — the model is guessing from memory.
- Confident + wrong is worse than "I don't know." A fake citation that a customer acts on is more damaging than an honest admission of uncertainty.
- Prompt injection is a product risk, not just a security risk. Any feature that processes untrusted text can have its runtime behaviour hijacked. Treat input validation as a product requirement, not a security afterthought.
Knowledge Check
1. A user reports that the AI assistant gave a confidently wrong answer about a company policy. What failure mode is this, and which two product-level mitigations address it without retraining the model?
2. Rank these tasks from most to least reliable for a current LLM: open-ended creative writing, precise arithmetic, summarising a provided document, answering questions about real-time events.
3. What is prompt injection, and why is it a product concern rather than just a security concern?
4. When is "human-in-the-loop" the right product decision versus an admission that the AI isn't ready?