Claude Memory: What It Retains vs. What It Actually Forgets
Every week someone on r/ClaudeAI posts a version of the same complaint: "I told Claude my name three conversations ago and it doesn't remember." They're not wrong. They're just confused about what Claude is — and more importantly, what it isn't. Claude doesn't have memory the way ChatGPT markets memory. What it has is context, project instructions, and — if you use Claude Code — a file-based memory system that actually works. Understanding the difference between these things is the difference between being productive with Claude and being frustrated by it.
This matters because "memory" is doing a lot of heavy lifting in AI marketing right now. OpenAI made persistent memory a flagship feature. Google is building it into Gemini. The implication is that your AI assistant should know you the way a human assistant would — accumulating knowledge over time, building a model of your preferences. Claude takes a different approach, and whether that's a limitation or a feature depends entirely on what you're trying to do.
What The Docs Say
Anthropic's documentation is straightforward about this. Claude operates within a conversation context window. Per the API reference, Claude 3.5 Sonnet supports 200K tokens of context. Claude 3 Opus, same. Claude 3.5 Haiku, same. Everything you say in a conversation, and everything Claude says back, lives in that window. When the conversation ends, it's gone. New conversation, blank slate.
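As a rough mental model, everything in the conversation counts against a single budget. A toy sketch of that accounting (the 4-characters-per-token ratio is a common rule of thumb, not an exact tokenizer, and the function name is my own):

```python
# Toy context-budget check. The chars-per-token ratio is a rough
# heuristic; real token counts come from the model's tokenizer.

CONTEXT_LIMIT_TOKENS = 200_000  # documented limit for the Claude 3.x models
CHARS_PER_TOKEN = 4             # rule-of-thumb estimate, not exact

def estimated_tokens(transcript: list[str]) -> int:
    """Approximate tokens used by every message so far, yours and Claude's."""
    return sum(len(message) for message in transcript) // CHARS_PER_TOKEN

# Both sides of the exchange consume the same shared window.
transcript = [
    "Explain the bug in parser.py",
    "The bug is an off-by-one in the loop bound...",
]
assert estimated_tokens(transcript) < CONTEXT_LIMIT_TOKENS
```

When the conversation ends, that whole transcript is discarded; the budget resets to zero along with everything in it.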
There is no built-in cross-conversation memory in the consumer product. Anthropic's docs don't promise it, don't hint at it, and don't have a roadmap item for it that's publicly visible. What they do offer is Projects — available on Claude Pro and Team plans — which let you attach custom instructions and documents that persist across conversations within that project. According to Anthropic's documentation, project instructions can hold up to 4,000 characters [VERIFY], and you can upload files as project knowledge that Claude references in every conversation.
Claude Code, the CLI-based coding tool, has its own memory system. It reads from CLAUDE.md files: one at the project root (with optional additional files in parent or subdirectories) and a user-level one at ~/.claude/CLAUDE.md for global preferences. These are plain-text files that Claude Code loads at the start of every session. There's also an automatic memory feature: tell Claude Code to remember something and it writes a note to a memory file that persists across sessions.
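For concreteness, here is what a project-level CLAUDE.md might look like. Every entry below is a hypothetical example, not a required field; the file has no schema, it's just markdown that gets read into context:

```markdown
# CLAUDE.md — project memory (hypothetical example)

## Stack
- TypeScript, React, Tailwind CSS
- Package manager: pnpm

## Conventions
- Prefer named exports over default exports
- Run `pnpm test` before committing

## Notes added over time
- The API client lives in src/lib/api.ts
```

Because it's plain markdown, version control works on it like any other file: you can diff it, review changes to it, and share it with teammates.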
What Actually Happens
I tested each of these systems for two weeks, and the practical reality maps to the docs more closely than you'd expect — which is unusual for AI products.
Within a single conversation, Claude's context handling is genuinely good. I ran sessions that pushed past 50K tokens — long debugging sessions, multi-document analysis, extended writing work — and Claude maintained coherent references to things I'd said early in the conversation. It remembered variable names, stylistic preferences I'd stated, constraints I'd outlined. This is not memory. This is context. But it behaves like memory within the bounds of a single session, and for most tasks, a single session is all you need.
Across conversations, it's exactly what Anthropic says: nothing persists. I told Claude my name, my preferred programming language, and my code formatting preferences in one conversation. Opened a new one. Gone. Not degraded, not fuzzy — gone. The new conversation has zero knowledge of the previous one. This is jarring if you're coming from ChatGPT, where the model might greet you by name and remember that you prefer TypeScript over JavaScript. But it's also predictable. You never get a situation where Claude hallucinates a preference you mentioned six months ago in a different context and applies it incorrectly.
Projects as pseudo-memory are the most underused feature in Claude's product lineup. I set up a writing project with custom instructions — voice guidelines, formatting rules, a list of terms to avoid — and uploaded three reference documents. Every new conversation in that project started with Claude already knowing the style rules. It's not dynamic memory; it's a static prompt injection. But for recurring workflows, it's effective. The limitation is that Claude can't update project instructions on its own. If you want to add a new preference, you edit the instructions manually. It's a config file, not a learning system.
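Mechanically, a project behaves like a static system prompt prepended to every new conversation. A minimal sketch of that behavior, modeling the idea rather than Anthropic's actual implementation (the instruction text and function name are hypothetical):

```python
# Sketch of how project instructions act as static prompt injection.
# The instruction text and function names are illustrative assumptions.

PROJECT_INSTRUCTIONS = (
    "Voice: plain, direct sentences. "
    "Avoid the terms 'leverage' and 'utilize'. "
    "Format code samples as fenced blocks."
)

def start_conversation(user_message: str) -> dict:
    """Every new conversation begins with the same static instructions.

    Nothing carries over between calls: the 'memory' is this constant
    being re-injected, not state accumulated from past conversations.
    """
    return {
        "system": PROJECT_INSTRUCTIONS,  # same text, every conversation
        "messages": [{"role": "user", "content": user_message}],
    }

# Two separate conversations start from identical context.
a = start_conversation("Draft an intro paragraph.")
b = start_conversation("Review this function.")
assert a["system"] == b["system"]
```

The design choice is visible in the sketch: updating a "remembered" preference means editing the constant, which is exactly why project instructions can't evolve on their own.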
Claude Code's file-based memory is the closest thing to real persistent memory in the Claude ecosystem, and it works because it's dead simple. CLAUDE.md is a markdown file. Claude Code reads it. If you tell Claude Code to "remember that we use Tailwind CSS in this project," it writes that to a memory file. Next session, it reads the file, knows about Tailwind. The mechanism is transparent — you can open the file, read it, edit it, delete lines you don't want. There's no black box. I've been using this for a month with a mid-size TypeScript project and the memory file has grown to about 40 lines of project-specific context. It works well. It's also exclusive to Claude Code, which means it's limited to developers comfortable with a CLI tool.
The ChatGPT Comparison
This is the elephant in the room, so let's address it directly. ChatGPT has persistent memory. You can tell it things about yourself and it stores them in a memory bank that persists across all conversations. It remembers your name, your job, your preferences, your projects. OpenAI markets this aggressively and users genuinely like it.
Claude has none of this in the consumer product. The trade-offs are real on both sides. ChatGPT's memory occasionally hallucinates — users on r/ChatGPT report instances where the model "remembers" things they never said, or applies a remembered preference in the wrong context. Memory also creates a privacy surface area: everything ChatGPT remembers about you is stored on OpenAI's servers, associated with your account, and presumably available for model training unless you opt out. There have been documented cases where memory items are difficult to fully delete [VERIFY].
Claude's approach — no persistent memory by default — is less convenient and more predictable. You never get a conversation that starts weird because of some half-remembered fact from three months ago. You also never have to wonder what the model "knows" about you that you didn't explicitly tell it in this session. Every conversation is self-contained. The downside is real: you repeat yourself. A lot. If you use Claude daily, you're pasting the same context blocks into new conversations regularly, and that's friction.
When No Memory Might Be Better
There's a case to be made — and I think Anthropic is making it implicitly — that stateless conversations are a feature, not a limitation. Three arguments hold up under scrutiny.
First, predictability. When you start a new Claude conversation, you know exactly what context it has: none, plus whatever project instructions you've set up. There's no hidden state. No accumulated drift. No situation where the model applies something it learned from a conversation you had in a different context entirely. For professional use, especially anything involving sensitive data, this is meaningful. You can reason about what Claude knows because you can enumerate it completely.
Second, privacy. Claude's statelessness means there's no growing profile of you on Anthropic's servers. Your conversations are stored for the retention period, but there's no separate "memory" database that accumulates personal facts. If you're working with client data, legal documents, medical information — the absence of persistent memory is a compliance feature, not a missing feature.
Third, output quality stability. ChatGPT users have reported that accumulated memory sometimes degrades output — the model over-indexes on remembered preferences and applies them in situations where they don't fit. Claude doesn't have this problem because it can't have this problem. Each conversation's output depends only on that conversation's input. The quality function is stateless.
Workarounds That Actually Work
If you need persistence with Claude, here's what works in practice, ranked by effort-to-value ratio.
Projects (low effort, high value for recurring tasks). Set up a project for each major workflow. Put your preferences, context, and reference docs in the project. Every conversation in that project starts with that context loaded. This is the closest thing to memory for most users and it takes five minutes to set up.
Context documents (medium effort, high value for complex work). Maintain a markdown file with your current project state, preferences, and relevant background. Paste it at the start of new conversations or upload it. This is manual but gives you complete control over what Claude knows. I keep a claude-context.md file for each active project and update it at the end of working sessions.
Claude Code's CLAUDE.md (low effort, high value, developer-only). If you use Claude Code, this is handled for you. Let Claude Code write to its memory file and it accumulates useful context automatically. Check the file occasionally to prune irrelevant entries.
Session handoff (medium effort, medium value). At the end of a long conversation, ask Claude to summarize the current state — decisions made, preferences established, work completed, next steps. Copy that summary into the start of your next conversation. This is the most manual approach but it works for any Claude plan, including free.
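The handoff step above is scriptable if you keep the summary as plain text between sessions. A minimal sketch under that assumption — the prompt wording and function names are my own conventions, not anything Anthropic specifies:

```python
# Sketch of a manual session-handoff loop. All names and prompt
# wording here are illustrative assumptions, not an official API.

HANDOFF_REQUEST = (
    "Summarize this session for handoff: decisions made, preferences "
    "established, work completed, and next steps."
)

def closing_prompt() -> str:
    """What you send at the end of a long conversation."""
    return HANDOFF_REQUEST

def opening_message(summary: str, new_task: str) -> str:
    """Paste the saved summary ahead of the next session's first task."""
    return (
        f"Context from my previous session:\n{summary}\n\n"
        f"Today's task: {new_task}"
    )

summary = "Chose Tailwind for styling; next step is refactoring the auth flow."
msg = opening_message(summary, "Refactor the auth flow.")
assert msg.startswith("Context from my previous session:")
assert summary in msg
```

This is the same trick Projects automate, done by hand: you are the persistence layer, which is why it works on any plan, including free.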
The Bottom Line
Claude doesn't remember you. This is a design choice, not a bug, and it has real advantages alongside real inconveniences. If you need an AI that knows your name and your favorite programming language without being told, ChatGPT does that. If you want predictable, stateless conversations where you control the context completely, Claude does that. The workarounds — Projects, context docs, Claude Code memory — bridge the gap well enough for most professional use cases. The people who get frustrated are the ones expecting memory they were never promised. The people who get value are the ones who understand the system they're actually using.
This article is part of the Claude Deep Cuts series at CustomClanker.