The AI Said It Could Do That. The AI Was Wrong.
You asked Claude how to batch-process images through Runway's API using a specific endpoint. It gave you the endpoint URL, the authentication header format, the JSON payload structure, and a Python script that tied it all together. The code was clean. The variable names were sensible. The error handling was thoughtful. You ran it, got a 404, assumed you had a typo, and spent 90 minutes debugging before you thought to check Runway's actual API documentation. The endpoint doesn't exist. It has never existed. The AI didn't make a mistake — it invented a plausible fiction and delivered it with the confidence of someone reading from a manual.
This happens constantly. Not occasionally, not as a rare edge case — constantly. If you're building with AI tools while using an AI assistant to guide you, this is a weekly occurrence. It is the background radiation of AI-assisted work in 2026, and most people don't realize it's happening until they've already lost the afternoon.
The Pattern
The pattern looks like this. You have a real problem — a workflow to build, an integration to wire up, a feature to evaluate. You ask an AI assistant about it because that's the reasonable thing to do. The AI gives you a detailed, structured answer that reads like documentation. It includes specifics: parameter names, syntax, expected behavior. It doesn't hedge. It doesn't say "I'm not sure about this" or "you should verify." It just tells you, the same way it tells you that Python is a programming language or that HTTP 200 means success.
You proceed as if the information is real, because it was delivered as if it were real. You write code around it. You design a workflow that depends on it. You tell a client or a teammate that yes, the tool can do that. Then reality intervenes — the feature doesn't exist, the endpoint returns nothing, the parameter you specified is meaningless — and you enter a debugging spiral that feels like your fault but isn't. The AI confabulated. You trusted the confabulation. The hours are gone.
The word matters here. This isn't "lying" — the AI has no intent to deceive. It's not "hallucinating" in any clinically precise sense either, but that term has stuck and it's close enough. The technical reality is that LLMs generate text by predicting the most statistically probable next token given everything that came before. When you ask about a specific tool capability, the model doesn't look it up. It generates what a correct answer would look like, based on patterns in its training data. If the answer is plausible — if it sounds like the kind of thing that could be true — the model produces it with exactly the same confidence it would use for something it actually knows.
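The generation loop described above can be sketched in a few lines. This is a toy, not a real model: the probability table and the endpoint strings are invented for illustration. The point is structural — notice that nothing in the loop ever asks whether the most probable continuation is *true*.

```python
# Toy sketch of next-token generation. The "model" here is just a
# lookup table of continuation probabilities; the endpoint names and
# numbers are invented for illustration only.
NEXT_TOKEN_PROBS = {
    "POST https://api.example.com/": [
        ("v1/images/batch", 0.62),  # plausible -- may or may not exist
        ("v1/images", 0.30),
        ("v2/batch", 0.08),
    ],
}

def most_probable_next(prefix):
    """Return the highest-probability continuation for a prefix.
    Note what is missing: there is no 'is this real?' check anywhere."""
    candidates = NEXT_TOKEN_PROBS[prefix]
    return max(candidates, key=lambda pair: pair[1])[0]

endpoint = most_probable_next("POST https://api.example.com/")
print(endpoint)  # -> v1/images/batch
```

Whether `v1/images/batch` exists or was never shipped, the loop emits it the same way, with the same "confidence" — which is exactly the uniform-confidence problem the next paragraph describes.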
That uniform confidence is the core of the problem. When a human expert is uncertain, they signal it. They say "I think" or "if I remember correctly" or "you should double-check." Their tone shifts. An LLM doesn't do this proportionally to its actual reliability. Some models have been trained to hedge occasionally, but the hedging is performative — it's a style the model learned, not a calibrated signal of uncertainty. The AI says "here's how to do it" whether it's drawing from solid training data or generating a plausible confabulation. Both outputs look the same. Both read the same. The only way to tell the difference is to check.
This isn't a rare failure mode that clever prompting can avoid. It's structural. It emerges from how these systems work at a fundamental level, and it doesn't go away with better models — it just gets harder to detect, because the confabulations get more plausible. GPT-3 hallucinated in ways that were often obviously wrong. GPT-4 and Claude 3.5 hallucinate in ways that are subtly, dangerously correct-sounding. The better the model gets at generating convincing text, the better it gets at generating convincing fiction.
The Psychology
You fall for it because your brain uses confidence as a proxy for accuracy. This is a heuristic that works remarkably well with human speakers — a person who states something with detail and certainty is more likely to be correct than someone who mumbles vaguely. You've used this heuristic successfully for your entire life. The AI exploits it accidentally. It's always specific. It's always confident. It's always structured like an expert response. Your pattern-matching machinery processes it as expertise, the same way it would if a senior engineer at the tool's company gave you the same answer.
There's a motivation layer underneath that compounds the effect. You asked the AI because you have a problem you want solved. The answer it gives you is the solution. You want it to be right. Motivated reasoning doesn't feel like motivated reasoning — it feels like evaluation. You read the AI's response and it sounds right, and that feeling of "sounding right" is enough to carry you into implementation without a verification step. The gap between "this sounds like the answer" and "this is the answer" is where the debugging hours live.
The cost isn't just time. It's trust erosion — in both directions. After you've been burned a few times, you start distrusting AI assistance even when it's correct, which makes it less useful. Or worse, you don't learn the lesson and keep building on unverified claims, which means the next confabulation costs you even more. Either way, the relationship between you and your AI tools becomes less productive because you haven't identified the specific boundary between what the AI is good at and what it confabulates about.
There's also a social cost. If you've told a client or a stakeholder that a tool can do something — based on what the AI told you — and then discovered it can't, you've spent credibility you can't easily recover. The AI's confabulation became your mistake the moment you passed it along without checking. This isn't theoretical. Builders on r/ClaudeAI and the Cursor Discord report this pattern regularly: the AI said it could be done, they scoped a project around it, and the capability turned out to be phantom.
The Fix
The fix is a simple rule that's hard to follow consistently: every specific capability claim from an AI gets verified against the official source before you build on it. Not "most claims." Not "claims that seem suspicious." Every specific, load-bearing claim about what a tool can do, what an API exposes, what parameters exist, what outputs to expect.
This sounds exhausting. It isn't, because most of the time the verification takes under a minute. You open the tool's documentation, you Ctrl+F the feature name or the endpoint, and either it's there or it's not. The return on that minute is enormous — it's the difference between building on solid ground and building on something the AI invented. The one-minute check prevents the three-hour debugging session. Every time.
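For API claims specifically, part of that one-minute check can even be scripted. The sketch below is illustrative, not a substitute for reading the docs: it probes a URL and treats only a 404 as proof the route doesn't exist. The example URL and header in the usage comment are placeholders from the opening anecdote, not real values.

```python
# A minimal sketch of the "does this endpoint even exist?" check.
# Probe the URL the AI gave you before writing code around it.
import urllib.request
import urllib.error

def status_means_missing(code):
    """Only a 404 proves the route doesn't exist. A 401 or 403 means
    the route is real but you aren't authorized; 405 means wrong method."""
    return code == 404

def endpoint_exists(url, headers=None):
    """Send a GET and interpret the result. Raises on network errors,
    which are not an answer either way."""
    req = urllib.request.Request(url, headers=headers or {}, method="GET")
    try:
        urllib.request.urlopen(req, timeout=10)
        return True
    except urllib.error.HTTPError as err:
        return not status_means_missing(err.code)

# Usage (placeholder endpoint and header -- substitute the AI's claim):
# endpoint_exists("https://api.example.com/v1/images/batch",
#                 {"Authorization": "Bearer <key>"})
```

A probe like this catches the phantom-endpoint case from the opening story in seconds; it doesn't verify parameters or payload shapes, which still need the docs.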
The practical workflow looks like this. Use the AI for the shape of the solution — the architecture, the approach, the general direction. AI assistants are genuinely good at this. They reason well about structure, patterns, and strategy. Then, before you write a single line of implementation code, take every specific claim the AI made and verify it against current documentation. Does the endpoint exist? Does the parameter do what the AI said? Is the authentication pattern correct? Is this feature available on the tier you're using?
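One way to make that verification pass concrete is to treat the AI's specifics as a checklist rather than as facts. The sketch below is a pattern, not a tool: the claims listed are the ones from the opening anecdote, and the rule is simply that nothing gets implemented until every claim carries a documentation link you found yourself.

```python
# A claims checklist: every specific thing the AI asserted, held as
# unverified data until you attach the doc page that confirms it.
# The claim texts below are examples from the opening anecdote.
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    doc_url: str = ""  # filled in only once you find it in the docs

    @property
    def verified(self):
        return bool(self.doc_url)

claims = [
    Claim("POST /v1/images/batch exists"),
    Claim("auth header is 'Authorization: Bearer <key>'"),
    Claim("payload takes an 'images' array"),
]

unverified = [c.text for c in claims if not c.verified]
if unverified:
    print("Do not build yet. Unverified claims:", unverified)
```

The useful part isn't the code, it's the discipline: the list forces you to enumerate what you're actually trusting, which is the step motivated reasoning skips.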
Think of it as trust layering. Trust the AI for "what should I think about?" Don't trust it for "what specifically exists." Use the AI to understand the docs, not to replace them. The AI is a brilliant colleague who hasn't checked their email in six months — smart, insightful, full of good ideas, and completely unreliable on current specifics.
This series — all ten parts of it — maps the specific categories of AI confabulation you'll encounter when building with AI tools. Hallucinated features. Phantom APIs. Deprecated capabilities described in present tense. Confident wrong answers about competing models. Each article covers one category, shows you the pattern, explains why your brain falls for it, and gives you the specific verification step that catches it. The goal isn't to make you distrust AI. The goal is to make you a better user of AI — one who knows exactly where the boundary is between what the model knows and what it's generating on the fly.
The AI isn't lying to you. It's doing exactly what it was built to do — generating the most plausible next token. The problem isn't the AI. The problem is treating plausibility as truth. Stop doing that, and everything else in this series is implementation detail.
This is part of CustomClanker's AI Confabulation series — when the AI in your other tab is confidently wrong.