Surviving Compaction

Series: Article 3 of AI as a Teammate, Not a Tool. Companions: Automate or Fall Behind · Your AI Needs a Bot Account.

You won’t see the compaction.

You’ll see the mistake it makes ten minutes later.

The session was going fine. You were 25 minutes into a refactor, three files in, AI was tracking the architectural decisions you’d talked through. You looked away to take a sip of coffee. You came back. You asked the next question.

AI proceeded confidently. The code it wrote violated a constraint you’d been crystal clear about an hour ago. The PR description it wrote contradicted the design you’d agreed on. The plan it executed skipped a verification step you’d specifically called out. To AI, it looked like a perfectly reasonable next move.

Compaction happened while you were drinking coffee.


$ cat compaction-101.txt

What Compaction Actually Is

Every AI session has a finite context window. Mine right now is one million tokens — generous, by current standards. Generous is a relative word, though. A long conversation, several large file reads, a handful of PR diffs and big tool outputs, and you’ll be surprised how fast you fill it.

When the system sees you approaching the wall, it doesn’t crash. It compacts. It takes your older messages — the early conversation, the early file reads, the design discussion that set up everything you’re doing now — and replaces them with a summary. The summary is short. It frees space. The session can continue.

Here’s the catch: summaries are lossy. That isn’t a flaw. It’s the design. A summary captures the gist, not the nuance. The architectural decision is in the summary; the reasoning behind it isn’t. The constraint you stated is in the summary; the exception you carved out isn’t. The rule survives. Why the rule exists doesn’t.

Compaction isn’t a bug. It’s how the model survives a finite window.
That doesn’t make it less destructive.

[Figure: Anatomy of a Context Window. A horizontal bar showing how a context window is divided: a small System segment (~2%), a CLAUDE.md memory segment that grows quietly, your conversation messages (growing as you work), free space (shrinking as messages grow), and a red autocompact buffer (~33k tokens) at the right edge. When messages crowd into the autocompact zone, the system replaces older messages with a summary. That summary is lossy. By design.]

The mechanics, briefly: Anthropic’s prompt cache holds your context for about five minutes. The autocompact buffer — the floor that triggers automatic compaction — sits around 33k tokens. When free space drops near that floor, the system replaces older messages with summaries to make room.
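To make that arithmetic concrete, here's a back-of-the-envelope check in the shell. It's a sketch, not a tool: free_tokens is whatever /context reports as free space, and the buffer figure is the approximate floor described above.

  $ free_tokens=120000   # whatever /context currently reports as free
  $ buffer=33000         # approximate autocompact floor
  $ echo "$(( free_tokens - buffer )) tokens of real headroom"
  87000 tokens of real headroom

The closer that headroom number sits to zero, the likelier your next big tool output triggers a compaction.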

If this happens at the start of a session, no big deal. If this happens 25 minutes into a multi-file refactor, you’ve just lost the first half of the work and AI doesn’t know what it forgot.


$ cat scar-tissue.txt

What Compaction Looks Like When It Bites You

I’ve watched this fail in three flavors. Each one was a “how did that happen?” moment that took me longer to diagnose than I’d like to admit.

Scar #1 — The half-finished plan

I’d given AI a careful, twelve-step deployment process. We were on step seven. AI finished it, completed step eight, looked at the result, and declared the entire project finished. The remaining four steps — the ones that included verification, rollback prep, and notification — got dropped. AI didn’t say “I’m skipping these.” It said “Done.” It thought it was.

What had happened: compaction summarized the original twelve-step list as something like “Deploying the new service in stages, with verification at the end.” When AI looked at step eight’s output and saw “verification” in the summary, it pattern-matched and concluded we were at the end. The actual list was gone.

Scar #2 — The TODO that became a finished product

I asked AI to draft a config file. AI wrote a template — perfectly structured, but with <TODO: fill in domain> placeholders in three spots. The plan was: write the skeleton, ask me for the values, fill them in. The skeleton was correct. The plan was correct. The compaction killed the plan, not the file.

Post-compaction, AI saw the file, saw it was structured cleanly, and committed it. Placeholders intact. The next service that read the config failed at startup with a parse error referencing the literal string <TODO: fill in domain>. I caught it because the deploy failed loudly. If it had been a config that quietly accepted nonsense, I’d have caught it days later, by which point I’d be debugging a bug I hadn’t actually written.

Scar #3 — The exception that vanished into a generality

I’d told AI early in the session: “Never edit /etc directly. Always go through the config-management layer.” Two prompts later — post-compaction — AI said “I’ll just update /etc/hosts real quick.” I caught it before it ran. I asked why. AI was confident: “You said to be careful with system files.”

That’s what survived the compaction. Not the rule. A softening of the rule. “Never edit /etc directly” became “be careful with system files” in the summary, and “be careful with” is, to a confident AI, an invitation to do the thing carefully rather than not at all.

AI doesn’t know what it forgot. That’s the part that makes compaction so dangerous.
You can’t ask it to verify a constraint it no longer remembers having.


$ cat detection.txt

How You Notice It Happened

The system doesn’t tell you compaction is about to happen, and it doesn’t really tell you it just happened either. You learn to spot it from outside.

The /context command is your primary tool. Run it any time you suspect the session is getting long. It shows free space as a percentage. If you’re below 30%, you’re already in the danger zone — and the closer you get to the autocompact buffer, the higher the chance compaction triggers on your next big tool output.

The behavioral tells are how you’ll usually catch a compaction after it happens:

  • AI re-asks a question it had already answered an hour ago.
  • AI suggests an approach you’d already explicitly ruled out.
  • AI states a fact with confidence that contradicts what you remember agreeing on.
  • The “wait, why is it suggesting that?” moment.

When you spot any of those tells, STOP. Don’t shrug them off as one-off mistakes. They’re symptoms that AI is working from a different mental model than you are. Treat the next ten minutes of output as suspect until you’ve re-grounded.

Useful trick when you’re not sure: paste the /context output into the chat and ask AI to interpret it. Yes, it’s circular. Yes, AI’s interpretation isn’t always perfect; AI is, charmingly, a slightly unreliable narrator of its own context budget. But it’s faster than guessing, and it’s what we have right now. The model vendors will get better at this. Until then, it’s the workaround we’ve got.

$ cat recovery-playbook.txt

The Recovery Playbook (After Compaction Has Already Bitten You)

If you didn’t catch it before the compaction, the rules change. Welcome to recovery mode.

The cardinal rule: STOP. Don’t assume anything. Don’t let AI continue the work it thinks it was doing. Don’t let it patch over the gap. The gap is the problem.

Here’s the playbook I run every time I detect a compaction has already happened:

  1. Tell AI to stop and verify. Out loud. “Compaction may have happened. Don’t continue any pending work. Re-read the documentation and the conversation history. Don’t assume anything. Verify any details you aren’t 100% certain of. Use your tools to verify. Ask me before acting.”
  2. Re-read the documentation together. AI re-reads the project’s CLAUDE.md, the relevant CHANGELOG.md, any plan files, any handoff notes. Don’t paraphrase. Re-read. The summary in AI’s head is the lossy version; the file on disk is the source of truth.
  3. Verify with tools. If AI’s “memory” of the file system disagrees with what’s actually on disk, the file system wins. cat, grep, git log, MCP queries, web fetches — whatever ground-truth tools are available. Don’t trust the summary; trust the artifact.
  4. Re-state the constraints. The constraints you set up earlier in the session — the rules, the exceptions, the don’ts — are exactly what compaction tends to soften into mush. State them again. Make AI confirm understanding before continuing.
  5. Resume from a known-good point, not from where AI thinks you were. AI’s idea of “where we are” is the thing most likely to be wrong.

When this works, it’s because you’ve taken five minutes to rebuild the context that compaction destroyed. When you skip it — when you say “yeah just keep going” — you’re authorizing AI to make decisions on a foundation it can no longer see.

If your project is complex, write the recovery prompt down ahead of time. I keep a HANDOFF-AFTER-COMPACT.md template in my project trees — a structured “to resume” document that captures current state, the goal, key files, what’s known to work, what’s open, and the exact prompt to feed the next session. It works whether the next session is post-compaction me or a fresh AI booted up cold. The download below is the generic version of that template.

⚡ Download: HANDOFF-AFTER-COMPACT-TEMPLATE.md

Generic template — drop it into any project, fill it in as you go, reference it as the first thing the next session reads.
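For a sense of its shape before you download it, here's a minimal sketch. The section names mirror the description above; the wording inside each section is illustrative, and the downloadable version fleshes them out.

  # HANDOFF-AFTER-COMPACT
  ## Goal
  One sentence: what this work is trying to achieve.
  ## Current state
  Where we actually are, stated precisely (step N of M, branch, last passing test).
  ## Key files
  The paths the next session must re-read before acting.
  ## Known to work
  What has been verified, and how it was verified.
  ## Open
  What remains, including the steps that haven't started yet.
  ## Resume prompt
  The exact prompt to paste into the next session, verification demands included.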


$ cat prevention.txt

Prevention, in Three Layers

Recovery works. Prevention is better. The advice gets harder to follow as you go down the list.

Layer 1 — Break tasks up

The classic recommendation. Take big multi-step work and break it into chunks small enough to finish before you risk compaction. Each chunk gets its own session, its own clear scope, its own clean context.

Easy to say. Real life happens. Some refactors can’t be split cleanly: they touch too many files, and breaking them apart introduces inconsistencies worse than the compaction risk. Some debugging sessions wander further than you expected, because the bug was deeper than you expected. The advice is correct. It just isn’t always available.

Layer 2 — Use /context proactively

This is the layer everyone skips. “I’ll deal with it when it gets bad.”

Run /context before any large task. Use these thresholds:

  Free space   Status        What to do
  ≥ 50%        🟢 Healthy    Proceed.
  30–50%       🟡 Caution    Scope tightly. Avoid loading huge files or spawning subagents that return long outputs.
  < 30%        🔴 Stop       Do NOT start the task. Save state, propose a fresh session, resume there.
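If you want the same thresholds as something you can paste into a terminal, here's a minimal sketch. Note that free_pct is a number you read off /context yourself; nothing here parses the tool's output.

  $ free_pct=42    # read this off /context yourself
  $ if [ "$free_pct" -ge 50 ]; then echo "healthy: proceed"
  > elif [ "$free_pct" -ge 30 ]; then echo "caution: scope tightly"
  > else echo "stop: save state, start fresh"
  > fi
  caution: scope tightly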

The check costs nothing. You skip it because it feels like ceremony. Until the day you wish you hadn’t.

If you don’t know what to make of the /context output, paste it into the chat and ask AI. Same circular trick from the detection section, same answer: it works well enough, calibrate your trust, and it’s faster than guessing.

Layer 3 — CLAUDE.md hygiene

This is the high-leverage move that almost nobody does.

CLAUDE.md is auto-loaded into every session, before you even type your first prompt. Every line you put there is paid for in tokens, forever, in every session. A CLAUDE.md that grows unchecked from a tidy 200 lines to a meandering 1,500 lines is, mathematically, a CLAUDE.md that brings every session that much closer to the autocompact buffer before you’ve said hello.

I learned this the way I learn most things — by doing it wrong first. My CLAUDE.md files had quietly grown to 1,690 lines combined across three projects. That was 38.6k tokens of memory burning on every prompt. The fix: split the long-form content into reference files that load on demand only, kept the durable rules and pointers in CLAUDE.md, dropped the auto-loaded total by 48%.
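If you want a token figure rather than a line count, a crude estimate is enough for this purpose. A minimal sketch, assuming the common rough rule of thumb of about four characters per token for English prose (the real tokenizer will differ, so treat the result as an order-of-magnitude number):

  $ wc -c < CLAUDE.md | awk '{ printf "~%d tokens auto-loaded per session\n", $1 / 4 }'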

The cardinal rule that came out of that cleanup is short:

Session narrative goes to CHANGELOG.md only.
CLAUDE.md is for durable rules, constants, and pointers.

If your CLAUDE.md reads like a journal, you’re funding compaction with every prompt you send. The full discipline behind this — the :save workflow, the health-check thresholds, how to refactor a bloated CLAUDE.md without losing anything — is the next article in this series. For now: run wc -l CLAUDE.md. If it’s over 300 lines, that’s the lever you want to pull next.
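And if you keep several projects side by side, here's a sketch of the same check across all of them, flagging anything over that 300-line mark:

  $ find . -name CLAUDE.md -exec wc -l {} + | awk '$2 != "total" && $1 > 300 { print "over budget: " $2 " (" $1 " lines)" }'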

Prevention is cheaper than recovery. Always. By a lot.
The check that takes one minute today saves the silent failure you’d never see coming.


$ cat the-rule.txt

The One Rule for Post-Compaction Behavior

If I had to compress this whole article onto a Post-it note, it would be this:

After a compaction, the most expensive thing
AI can do is be confident.

Confidence after compaction is a tell. AI doesn’t know what it forgot. It can’t tell you what it forgot. It will, by default, fill the gap with its best guess — and its best guess will sound exactly as confident as its actual knowledge sounded ten minutes ago.

So when you suspect a compaction — even a little — slow down. Verify. Re-read. Re-state. Make AI confirm understanding before proceeding. The five minutes you spend re-grounding are the cheapest insurance you’ll ever buy.


$ exit

What’s Next

The recovery playbook works. The prevention layers work. Both rely on you noticing — catching the danger before it bites, or catching the symptom shortly after.

There’s a fourth layer I haven’t talked about, and it’s the one that prevents you from needing the others most of the time. It’s a documentation discipline I run on every project. It survives compaction by design. It costs me ten seconds at the end of a session and saves me hours when the session compacts unexpectedly.

It’s two characters: :save.

That’s the next — and final — article in this series.

A note on what I don’t know yet

Compaction-recovery patterns are still being figured out, by everyone. The HANDOFF template above is what I run; if you’ve got a sharper version, I want to see it. The detection-by-behavioral-tell heuristic is reliable for me but soft science. If your shop has built a more rigorous detector — a script, a hook, a metric — get in touch.

I’m also watching the model vendors closely on this. Better in-session signals about compaction risk, more transparent summaries, finer-grained control over what gets compacted versus preserved — all of those would help. None of them are here yet. We work with what we have.


You won’t see the compaction.

You’ll see the mistake it makes ten minutes later.

Build the playbook before you need it.
