Hallucinated Features — When The AI Invents Capabilities That Don't Exist

You asked GPT-4 whether Midjourney supports inpainting with a --mask parameter. It said yes — explained the syntax, described the expected behavior, even walked you through the workflow of uploading a mask image and referencing it in your prompt. You spent twenty minutes trying to make it work before checking the Midjourney docs and discovering that --mask has never been a Midjourney parameter. It doesn't exist. It has never existed. The AI generated a feature that sounded exactly like something Midjourney could have built, described it with the specificity of official documentation, and sent you off to use it. The feature was fiction. Good fiction — plausible, detailed, internally consistent fiction — but fiction.

These are hallucinated features. Not deprecated features, not upcoming features, not features from a different plan tier — features that have never existed in any version of the tool, described as if they're right there in the current release.

The Pattern

The mechanism is straightforward once you understand how LLMs work. The model was trained on descriptions of thousands of tools. When you ask about Tool A, it generates an answer by pattern-matching against everything it learned about Tool A and every similar tool. If a feature would make sense for Tool A — if it's the kind of thing Tool A's competitors have, or the kind of thing users would expect Tool A to offer — the model is more likely to generate it as an answer, regardless of whether Tool A actually has it.

This is why hallucinated features tend to be plausible, not absurd. The AI doesn't invent features that would be obviously impossible or weird. It invents features that sit right at the edge of what the tool could reasonably offer. Midjourney has image manipulation capabilities, so a mask parameter feels right. Runway does video generation, so an API endpoint for batch video processing sounds natural. n8n has hundreds of community nodes, so an AI-generated reference to a "Notion Advanced" node with specific capabilities feels like something you just haven't found yet.

The plausibility is the danger. If the AI told you Midjourney could make you a sandwich, you'd immediately recognize the confabulation. But when it tells you Midjourney supports a --stylize-seed parameter that locks the style independently from the composition seed — well, that sounds like something a sophisticated image generation tool would offer. It's adjacent to features that actually exist. The distance between the hallucination and reality is small enough that your skepticism doesn't activate.

Common categories of hallucinated features include invented API endpoints with plausible URL structures, parameter names that follow the tool's naming conventions but don't correspond to anything real, UI elements described as being in locations that make sense but don't exist, and capabilities attributed to the wrong tool entirely — features that belong to Competitor B described as if they're native to Tool A. The last category is especially common. Ask an AI about Runway's capabilities and you might get a feature that actually belongs to Pika or Kling, described as a Runway feature because the model's training data contained descriptions of all three tools in similar contexts.

The Google problem makes this worse than you'd expect. When the AI tells you a feature exists and you can't find it in the docs, your first instinct isn't "the AI lied." Your first instinct is "I'm searching wrong." You try different search terms. You look for community tutorials. You check YouTube. You check Reddit. The AI was so specific and so confident that you assume the gap is in your search skills, not in the AI's claim. This can eat another 30 minutes on top of the time you already spent trying to use the phantom feature. Users on r/ChatGPT describe this pattern regularly — the search spiral that follows a confident hallucination, where you're essentially trying to find evidence for something that doesn't exist.

There's a gradient of hallucination severity that's worth understanding. At the mild end, the AI describes a real feature but gets a parameter name slightly wrong — close enough that you might find the correct version through the docs. In the middle, it describes a feature that existed in a beta or an earlier version but was removed or never shipped in the current release. At the severe end, it invents a capability wholesale — something no version of the tool has ever offered. The severe cases are actually easier to catch than the mild ones, because the mild hallucinations are close enough to reality that you might find the real feature and assume the AI was just being imprecise. The middle cases are the most insidious — the feature was real once, which means you might even find old blog posts or tutorials referencing it, confirming the hallucination against outdated sources.

The Psychology

You believe the hallucinated feature because you're doing a fundamentally reasonable thing — asking a knowledgeable assistant about a tool's capabilities. The context of the interaction sets up trust. You've probably asked the same AI assistant dozens of correct questions and gotten dozens of correct answers. When it tells you about a Python function or explains a SQL concept, it's right. The model's high accuracy on general knowledge creates a trust baseline that carries over into its claims about specific tool features — where it's significantly less reliable.

There's also a confirmation dynamic at work. You asked about the feature because you wanted to know if it was possible. You had a use case in mind. You were hoping the answer was yes. When the AI says yes with detailed specifics, it confirms what you were hoping to hear, and confirmation feels like validation rather than something that requires verification. The psychology literature calls this confirmation bias, but in practice it just feels like getting the answer to your question.

The specificity of the hallucination is what seals it. Vague claims trigger skepticism. Specific claims — with parameter names, syntax examples, and expected behavior — trigger trust. Your brain processes specificity as evidence of knowledge, and that's usually a reliable heuristic with human communicators. Someone who can name the specific parameter probably knows the tool. But the AI can generate specific parameter names for features that don't exist, because parameter naming follows conventions that the model has learned, just as it learned the conventions of the English language.

The Fix

One rule catches this every time: if an AI tells you a feature exists, verify it on the tool's official documentation or changelog before writing a single line of code or designing a single workflow step around it.

The verification step is fast. Open the tool's docs. Search for the feature name, the parameter, the endpoint — whatever specific thing the AI claimed. If the docs confirm it, proceed. If the docs don't mention it, treat the feature as nonexistent until you find primary-source confirmation. Not a blog post. Not a YouTube tutorial. Not another AI's answer. The tool's own current documentation.
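If you check the same tool's reference pages often, the docs search can even be scripted. A minimal sketch of the idea, where the docs snippet and the parameter names are illustrative placeholders, not taken from any real reference page:

```python
import re

def feature_in_docs(docs_text: str, claimed_names: list[str]) -> dict[str, bool]:
    """Report whether each AI-claimed name literally appears in the docs text."""
    found = {}
    for name in claimed_names:
        # Escape the name, then require a word boundary so "--mask"
        # doesn't falsely match a longer parameter like "--masked-fill".
        pattern = re.escape(name) + r"\b"
        found[name] = re.search(pattern, docs_text) is not None
    return found

# Hypothetical fragment of a parameter reference page
docs = """
--chaos <0-100>   vary results
--seed <integer>  set the noise seed
--stylize <num>   strength of default aesthetic
"""

print(feature_in_docs(docs, ["--seed", "--mask"]))
# {'--seed': True, '--mask': False}
```

A hit only tells you the string appears somewhere; a miss is the actionable signal — if the claimed name isn't anywhere in the current docs, treat the feature as nonexistent.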

For tools with APIs, the API reference is your single source of truth. If the endpoint isn't documented in the official API reference, it doesn't exist — regardless of what the AI generated. Some tools have undocumented endpoints, but building a workflow around an undocumented endpoint you found via AI hallucination is a recipe for a bad week. For tools with GUIs, the current product documentation or the tool itself is the check. If the AI says there's a button in Settings > Advanced > Export, open the tool and look. If it's not there, the AI confabulated it.
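Many tools publish their API reference as a machine-readable OpenAPI spec, which makes the check mechanical. A sketch under that assumption — the spec fragment and paths below are invented for illustration, not any real tool's API:

```python
# Hypothetical fragment of a tool's published OpenAPI spec
spec = {
    "paths": {
        "/v1/generate": {"post": {"summary": "Create a generation"}},
        "/v1/jobs/{id}": {"get": {"summary": "Fetch job status"}},
    }
}

def endpoint_in_spec(spec: dict, method: str, path: str) -> bool:
    """True only if the claimed method + path pair appears in the spec."""
    return method.lower() in spec.get("paths", {}).get(path, {})

print(endpoint_in_spec(spec, "POST", "/v1/generate"))     # True: documented
print(endpoint_in_spec(spec, "POST", "/v1/batch-video"))  # False: AI-claimed only
```

The design choice here is deliberate: the function answers "is it documented?", not "does the server respond?" — a probe against a live server can mislead (some APIs return 404 for auth failures), while absence from the official spec is exactly the signal the rule above tells you to act on.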

A useful habit is to ask the AI a follow-up before you build: "Can you link me to the documentation for this feature?" The AI can't actually browse the web in most contexts, but its response is revealing. If it generates a plausible-looking but incorrect URL, or says something vague about "checking the official docs," that's a signal — not proof, but a signal — that the feature claim may be generated rather than recalled. A model confidently describing a feature but unable to point you to where it's documented is exhibiting the pattern this series is about.

The deeper habit is positional. Move the AI from the role of "reference source" to the role of "brainstorming partner." Use it to explore what might be possible, what approaches you might take, what tools might be relevant. Then verify every specific claim against primary sources before committing to it. The AI is excellent at helping you think about the problem. It is unreliable at telling you what exists. Treat those as different capabilities — because they are.


This is part of CustomClanker's AI Confabulation series — when the AI in your other tab is confidently wrong.