What Claude Can't Do: The Honest Limitations List
Every other article in this series explains what Claude does well and how to use it effectively. This one does the opposite. These are the limitations that Anthropic's marketing doesn't lead with, that the docs mention briefly in fine print, and that you discover the hard way during your second week of use. None of them are secrets. All of them matter. Understanding what a tool can't do is at least as important as understanding what it can, and the AI industry has a structural incentive to blur this line.
I'm not listing these to be negative about Claude. I use it daily and find it genuinely useful. But useful tools have edges, and the people who cut themselves are usually the ones who didn't know the edges were there. Here's where they are.
No Internet Access
Claude cannot access the internet. It cannot visit URLs, check current prices, look up today's weather, read a live webpage, or verify whether a company still exists. When you paste a URL into a conversation, Claude does not fetch the page. It sees the URL as a string of text. If it seems to know what's at that URL, it's because the content was in its training data — which means the information is from before the training cutoff and may be stale.
This is different from ChatGPT, which has web browsing built into the product, and from Google's Gemini, which has native access to Google Search. Claude's lack of internet access is a deliberate architectural choice, not a missing feature they haven't gotten to yet. According to Anthropic's documentation, tool use via the API allows developers to build internet access into Claude-powered applications, but the consumer chat product — claude.ai — does not include browsing.
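To make the tool-use path concrete, here is a minimal sketch of how a developer might declare a web-fetch tool through the Anthropic Messages API. The tool name and schema are illustrative, not built-ins: Claude never fetches the page itself — it emits a tool-use request, and your own code performs the fetch and returns the result.

```python
import os

# Hypothetical tool definition following the Anthropic Messages API
# tool-use schema (name / description / input_schema). The "fetch_page"
# name is made up for this example; the fetching logic is code you write.
fetch_page_tool = {
    "name": "fetch_page",
    "description": "Fetch the text content of a URL so Claude can analyze it.",
    "input_schema": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "The URL to fetch"},
        },
        "required": ["url"],
    },
}

if os.environ.get("ANTHROPIC_API_KEY"):
    # Requires `pip install anthropic`; sketch only, not tested live.
    import anthropic

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        tools=[fetch_page_tool],
        messages=[{"role": "user", "content": "Summarize https://example.com"}],
    )
    # If Claude decides to use the tool, response.content will contain a
    # tool_use block; your code fetches the URL and sends the result back
    # in a follow-up message.
```

The key point the sketch illustrates: internet access in a Claude application is something the developer supplies, not something the model has.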
In practice, this means Claude is useless for anything that requires current information. Stock prices, today's news, current documentation for a library that updated last month, whether a restaurant is still open — Claude either doesn't know or gives you an answer based on training data that may be months or years old. The training cutoff for Claude 3.5 Sonnet is early 2024 [VERIFY]. Anything after that date doesn't exist in Claude's world.
The workaround is manual: copy the content you want Claude to analyze and paste it into the conversation. This works for articles, documentation, code, and data. It doesn't work for tasks that inherently require live access — monitoring, real-time analysis, or fact-checking against current sources.
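If you do this often, the copy step can be semi-automated. A small script using only Python's standard library can strip a saved HTML page down to pasteable text — a hypothetical helper, not an official tool:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Strip tags from saved HTML so the text can be pasted into Claude."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth counter for script/style blocks

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)

print(html_to_text("<html><body><h1>Title</h1><p>Body text.</p>"
                   "<script>ignored()</script></body></html>"))
```

You still paste the output into the conversation by hand; the script just removes the markup noise that wastes context space.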
No Persistent Memory
I covered this in depth in the memory article in this series, but it belongs on the limitations list. Claude does not retain information between conversations. Every new chat starts from zero. There is no user profile, no preference bank, no accumulated knowledge about you. Projects provide pseudo-memory through custom instructions, and Claude Code has file-based memory, but the core product has no memory system.
This is a limitation that bites you more over time, not less. The first day you use Claude, statelessness is invisible. By the third month, you're tired of re-establishing context. The workarounds — Projects, context documents, session handoffs — work but require discipline. This is one area where ChatGPT has a clear, unambiguous advantage in user experience.
Hallucination
Claude hallucinates. Every large language model hallucinates. This is not a solved problem, it is not close to being a solved problem, and any marketing that implies otherwise is wrong.
What hallucination looks like in Claude specifically: it will generate plausible-sounding citations that don't exist. It will state facts with confidence that are partially or entirely wrong. It will describe features of software that the software doesn't have. It will attribute quotes to people who never said them. It will invent statistics that sound reasonable but were never published anywhere.
To Anthropic's credit, Claude hallucinates less aggressively than many competitors on certain benchmarks, and it's more likely to say "I'm not sure" or "I don't have information about that" than GPT-4 was at launch [VERIFY]. Claude's tendency to hedge — which some users find annoying — is partially a hallucination mitigation strategy. When Claude says "I believe" or "I think," it's signaling lower confidence, which is more honest than stating the hallucinated fact as certainty.
But "more honest about uncertainty" is not "honest." I've had Claude state fabricated facts without hedging, generate fake research papers with plausible titles and authors, and describe API endpoints that don't exist. Extended thinking — Claude's step-by-step reasoning mode — reduces hallucination on complex reasoning tasks [VERIFY, including which models support it], but it doesn't eliminate it. Nothing does.
The practical rule: never trust Claude's factual claims without verification. Treat it as a first draft that needs fact-checking, not as a source. Use it to generate text, structure arguments, and explore ideas. Do not use it as a reference. If you cite something Claude told you without checking, it will eventually be wrong, and it'll be your name on the byline.
The Refusal Problem
Claude over-refuses. This is the inverse of the hallucination problem and it's equally frustrating. Anthropic's safety training — a combination of RLHF and a technique they call Constitutional AI — makes Claude decline requests that are perfectly benign. I've been refused when asking Claude to write fiction involving mild conflict, explain how common security vulnerabilities work (information freely available in any cybersecurity textbook), discuss historical atrocities in analytical terms, and write persuasive text from a perspective Claude considers potentially harmful.
The pattern is consistent: anything that touches violence, sexuality, substance use, persuasion, or certain political topics can trigger a refusal. The refusal is usually polite — Claude explains why it can't help and offers an alternative — but the alternative is often so sanitized as to be useless. If you ask Claude to write a villain's monologue for a novel and get back a villain who sounds like a corporate HR department, the safety training has overcorrected.
This is a genuine limitation for creative writers, security researchers, educators, and anyone working in domains where discussing difficult topics is the entire point. Users on r/ClaudeAI have cataloged numerous cases where Claude refuses requests that any reasonable person would consider harmless. Anthropic has acknowledged the over-refusal problem and says they're working on calibration, but as of early 2025, it remains a noticeable friction point [VERIFY on current state].
The workaround is prompt framing. "Write a villain monologue" might get refused. "I'm writing a novel where the antagonist is a corrupt politician. Here's the scene context. Write his dialogue in this scene, maintaining his character voice" usually works. The more context and legitimate framing you provide, the less likely Claude is to refuse. But you shouldn't have to engineer around refusals for reasonable requests, and the fact that you do is a limitation worth naming.
No Image Generation
Claude can analyze images. It can describe them, extract text from them, interpret charts and diagrams, and answer questions about visual content. It cannot generate images. No DALL-E equivalent, no Midjourney integration, no image output of any kind. If you need an AI that both understands and generates images, Claude is only half the tool.
ChatGPT includes DALL-E image generation. Google's Gemini includes Imagen. Claude includes nothing. According to Anthropic's public statements, they're focused on text-based capabilities and haven't announced plans for image generation [VERIFY]. This is a straightforward gap in the product offering. If your workflow requires generating images — diagrams, illustrations, mockups, creative assets — you need a separate tool.
Training Data Cutoff
Claude's knowledge has a hard boundary. It was trained on data up to a specific date, and it knows nothing about events, publications, software releases, or cultural developments after that date. For Claude 3.5 Sonnet, this cutoff is approximately early 2024 [VERIFY]. Claude 3 Opus has an earlier cutoff. This means Claude doesn't know about software libraries released after the cutoff, recent API changes, current events, new research papers, updated regulations, or anything else that happened after its training data ended.
This is different from being wrong. Claude isn't misinformed about recent events — it simply doesn't have them. Ask about a library feature released last month and Claude will either say it doesn't know or — worse — hallucinate an answer based on what it would expect the feature to look like given older documentation. The second case is more dangerous because it sounds authoritative.
For coding tasks, this means Claude's knowledge of rapidly evolving frameworks — Next.js, React, Tailwind, Python ML libraries — degrades with every month since the cutoff. The code it suggests may use deprecated patterns, missing features, or outdated syntax. Always check generated code against current documentation.
Math Without Code Execution
Claude is better at math than early LLMs and worse at math than a calculator. Extended thinking helps significantly with multi-step reasoning, logic problems, and proofs. But for anything involving actual computation — arithmetic with large numbers, statistical calculations, financial modeling — Claude is unreliable without a code execution environment.
ChatGPT has Code Interpreter, which lets it write and run Python to compute answers. Claude has no equivalent in the consumer product. You can ask Claude to write code that performs the calculation, copy that code, and run it yourself. But Claude cannot run code and return results within a conversation on claude.ai. Through the API, developers can implement tool use for code execution, and Claude Code can execute code natively. But if you're using the web interface, you're limited to Claude doing mental math, which is approximately as reliable as a confident undergrad doing mental math — impressive when it works, wrong often enough to be dangerous.
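The "write the code, run it yourself" loop looks like this in practice. A hypothetical example of the kind of script Claude might hand back for a statistics question — you run it locally and trust the interpreter, not the model's mental math:

```python
# Hypothetical example: instead of asking Claude "what is the standard
# deviation of these monthly revenues?", ask it to write the calculation
# and run it yourself with Python's standard library.
import statistics

revenues = [48_250, 51_900, 47_100, 53_400, 49_800, 52_650]

mean = statistics.mean(revenues)
stdev = statistics.stdev(revenues)  # sample standard deviation

print(f"mean:  {mean:,.2f}")
print(f"stdev: {stdev:,.2f}")
```

The extra round trip is annoying, but the answer comes from the interpreter rather than from token prediction, which is the entire point.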
Speed
Claude is slower than GPT-4o for simple tasks. This isn't a subjective impression — time the responses. For straightforward questions, short writing tasks, and quick lookups, GPT-4o returns results noticeably faster. Claude's responses are often more thorough and nuanced, but thoroughness has a latency cost. Extended thinking makes this more pronounced — when Claude uses extended thinking, response times can stretch to 15-30 seconds for complex queries [VERIFY].
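If you want to time the responses rather than trust anyone's impression, a generic stopwatch is enough. The `ask_model` function below is a placeholder standing in for whatever client call you actually use:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) — a simple stopwatch
    for comparing model latency; wrap your real API call in fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Placeholder standing in for a real model call (hypothetical):
def ask_model(prompt):
    time.sleep(0.05)  # simulate network + generation latency
    return f"answer to: {prompt}"

answer, elapsed = timed(ask_model, "What is a monad?")
print(f"{elapsed:.2f}s -> {answer}")
```

Run the same prompt through each model a dozen times and compare medians; single samples are dominated by network noise.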
For tasks where speed matters more than depth — quick code snippets, simple reformatting, short answers — GPT-4o is a better tool. For tasks where quality matters more than speed — complex analysis, long-form writing, multi-step reasoning — Claude's additional latency buys you something. This is a trade-off, not a universal advantage for either model.
The Token Cost Question (API)
If you're using Claude through the API rather than the consumer product, cost is a limitation worth flagging. Claude 3.5 Sonnet costs $3 per million input tokens and $15 per million output tokens [VERIFY on current pricing]. Claude 3 Opus costs $15 per million input tokens and $75 per million output tokens [VERIFY]. For high-volume applications, these costs add up. GPT-4o is cheaper for most use cases [VERIFY on current pricing]. For individual users on Pro or Team plans, this doesn't matter — you're paying a flat subscription. For developers building applications, the per-token cost is a real factor in choosing between models.
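Using the per-token prices quoted above (verify against current pricing before budgeting — these figures are taken from this article, not from a live price sheet), a back-of-the-envelope monthly estimate looks like this:

```python
# Back-of-the-envelope API cost estimate using the per-token prices
# quoted in this article; verify current pricing before relying on it.
PRICES_PER_MTOK = {  # model: (input $/1M tokens, output $/1M tokens)
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-opus": (15.00, 75.00),
}

def monthly_cost(model, requests_per_day, in_tokens, out_tokens, days=30):
    in_price, out_price = PRICES_PER_MTOK[model]
    per_request = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return per_request * requests_per_day * days

# Example workload: 10k requests/day, ~2k input and ~500 output tokens each.
for model in PRICES_PER_MTOK:
    print(f"{model}: ${monthly_cost(model, 10_000, 2_000, 500):,.2f}/month")
```

At that workload the Sonnet-versus-Opus gap is roughly 5x, which is why model choice is usually a per-endpoint decision rather than a global one.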
The Honest Framing
Here's the thing about limitations: they're trade-offs, not failures. Claude doesn't have internet access because Anthropic decided that grounding and safety were more important than convenience — and because web browsing introduces a massive surface area for prompt injection and data exfiltration. Claude doesn't have persistent memory because statelessness is simpler, more private, and more predictable. Claude over-refuses because under-refusing is worse — a model that helps with genuinely dangerous requests is more harmful than one that occasionally refuses a harmless one.
These are design choices, and reasonable people can disagree about whether they're the right ones. But understanding them as choices rather than bugs changes how you use the tool. You don't get frustrated that your screwdriver isn't a hammer. You pick up the right tool for the job. Claude is genuinely excellent at reading, writing, analyzing, and reasoning about text-based information within a single session. It is not a real-time information service, a memory system, a calculator, or an image generator. Use it for what it does well, use other tools for what it doesn't, and verify everything. That's not a limitation of Claude specifically. That's literacy about AI in general, and it's the most important thing this entire series can teach you.
This article is part of the Claude Deep Cuts series at CustomClanker.