~/CLAUDE.md

Every Rule in My CLAUDE.md Is a Mistake I Made Once

Six months ago, my CLAUDE.md was six lines long. My name, my projects, my preferred text editor. The kind of thing you write in five minutes because the docs say you should.

Today it’s 2,487 lines across five files. And it reads like an incident response playbook — because that’s what it is. Every rule started as a failure. A deployment that went to the wrong branch. A public article that almost contained my private IP addresses. An AI that confidently cited documentation that hadn’t been accurate in three months.

This article is the field guide to those failures — what broke, what I wrote down so it wouldn’t break again, and why your AI context files are governance, not documentation.

Here’s what one section of my CLAUDE.md actually looks like

When drafting a multi-item plan another Claude session will implement, these rules are non-negotiable:

  1. Pre-audit each item against current code before writing it. Read+Grep yourself, or spawn parallel Explore subagents at “very thorough” depth.
  2. Same-session contradiction sweep. If your draft contains hedging language (“or click anyway,” “except when N,” “unless”), stop — that’s a tell you’re contradicting something you just shipped.
  3. Confidence labels per item. Each item carries an explicit tag: verified, plausible — needs audit, or speculative — discuss first.
  4. Batch limit. Plans larger than 5 items must be drafted in batches of ≤3, each pre-audited.
  5. First-conflict trigger. If one item’s wrong, audit the entire remaining plan immediately.
  6. Handoff briefing requirement. The next session must be told: “audit each remaining item against current code before acting; report conflicts.”

That’s not a style guide. That’s not a list of preferences. It’s an incident response playbook: six rules, each written the morning after something went sideways.

I wrote about treating AI context like infrastructure a while back. That was the theory. This article is what the infrastructure looks like after six months of load. It’s not pretty. It’s full of scars. That’s how you know it’s working.


$ wc -l ~/CLAUDE.md

Week One vs. Month Six

Nobody plans to write a 2,500-line context system. You start with six lines and then every time something breaks in a way that surprises you, you add a rule that would have caught it. Six months later you have an incident response playbook.

I’m reconstructing from memory — I didn’t have version control on my context files back then. That’s a different article. But my first CLAUDE.md probably looked like this:

# CLAUDE.md
I'm John. I work in InfoSec.
Here are my projects:
- Homelab (Proxmox, Docker, DNS)
- TheDeLay.com (WordPress blog)
- Pinball leaderboard (Flask app)
Preferred editor: vi


[Chart: CLAUDE.md growth, September 2025 to April 2026. A step chart rising in bursts from 6 lines to 2,487, one step per incident: the git branch incident, the sanitization near-miss, the multi-item plan rules.]

The actual table of contents of my homelab CLAUDE.md

Here are the actual top-level section headers from my homelab root CLAUDE.md, in order. Just the headers — no body text. The point isn’t the contents. The point is the scope a context file has to cover when you’re using AI for serious ongoing work:

## User Preferences
## CRITICAL: Secrets Handling Policy
## CRITICAL: Sanitization Review Before Publishing
## Working with Claude on Multi-Item Plans
## Quick Reference Tables
## Topic Router - What To Read When
## Network Architecture (Quick Facts)
## Key Conventions
## MCP Integrations
## Related Projects
## Folder Structure
## Services on Primary Docker Host
## Gotchas & Lessons Learned
## Change History

Fourteen sections. Two of them flagged CRITICAL. One topic router (a table that tells the AI which other doc to read for which problem). One full section dedicated to gotchas and lessons learned.

This is what an AI’s working memory has to look like when it’s doing real work in a real environment. Preferences. Security policy. A failure-mode-prevention playbook. A map of the network. A directory of where to read deeper. A list of the scars.

Documentation describes a system. This describes how to behave in one.

Big difference.

I’m going to walk through the categories of incident those rules came from. Five of them. Each one is a story about a different way I learned not to trust something — including, eventually, myself.


Five categories of CLAUDE.md guardrails:

$ tree guardrails/
guardrails/
├── 01_dont_trust_the_docs/
│   └── "Trust the live system, not the docs."
├── 02_dont_trust_the_ais_confidence/
│   └── "Uniform-confidence prose is dishonest."
├── 03_dont_trust_yourself_at_speed/
│   └── "Luck is not a security control."
├── 04_dont_trust_the_first_session/
│   └── "Ask, don't assume."
└── 05_the_scar_tissue/
    └── "Once was not enough."


$ verify --against=live-system

Don’t trust the docs

The pinball ingestion pipeline runs every machine in the bar. League nights, scores stream automatically into a PostgreSQL database, get ranked, and show up on the kiosk display before the player even racks their balls. I wrote a whole article about it — the dual-trigger n8n workflow design, the IScored polling, the whole pipeline.

When I sat down with Claude to draft that article, my project documentation said one thing. The live system said another.

  docs:   8 machines  |  5-min polling  |  fixed 60-min schedule
  live:  25 machines  | 10-min polling  |  dual-trigger adaptive

Three for three. The docs had drifted — written when the system had eight machines, polled differently, scheduled differently. Nobody updated them because nobody had a reason to, until the AI started reading them and confidently restating them as fact.

I caught it because I was about to publish a public article with those numbers. I cross-checked against the database. The first query came back with 25 rows. I had not queried the wrong table. That near-miss spawned an entire workflow document — the Article Development Guide — whose Phase 1 is now a mandatory fact-verification step. The lesson fits in seven words:

Trust the live system, not the docs.

That phrase now lives in three different CLAUDE.md files. Because the moment you stop saying it out loud, the AI defaults back to confidently citing whatever it found in the README.
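The fix generalizes: keep the documented claims in a machine-readable place and diff them against the live system before citing either. A minimal sketch of that check in Python — the fact names, numbers, and the stubbed "live" lookup below are hypothetical stand-ins for illustration, not my actual schema:

```python
# Compare documented claims against live-system answers before publishing.
# Both sides here are hypothetical stand-ins; in the real pipeline the
# live_facts() call would be a SQL query or API call.

DOCUMENTED = {"machines": 8, "polling_minutes": 5, "schedule": "fixed 60-min"}

def live_facts():
    # Stubbed so the sketch is self-contained; imagine
    #   SELECT count(*) FROM machines;
    return {"machines": 25, "polling_minutes": 10, "schedule": "dual-trigger adaptive"}

def drift_report(documented, live):
    """Return the facts where the docs and the live system disagree."""
    return {
        key: {"docs": documented[key], "live": live[key]}
        for key in documented
        if documented.get(key) != live.get(key)
    }

report = drift_report(DOCUMENTED, live_facts())
for fact, values in sorted(report.items()):
    print(f"DRIFT {fact}: docs say {values['docs']!r}, live says {values['live']!r}")
```

Run it before drafting the article, and "three for three" becomes a printed report instead of a near-miss.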


$ audit --confidence

Don’t trust the AI’s confidence

This is the category that produced the playbook at the top of the article.

The failure mode is subtle. Claude is, by default, a uniform-confidence narrator. Whether it just verified a fact by reading the code, or it’s making a plausible inference, or it’s frankly guessing — the sentences come out at the same register. Same length. Same tone. Same level of certainty.

That’s fine in a chat. It’s not fine when the output is a multi-item plan that another Claude session is going to pick up tomorrow morning and start implementing. The next session has zero context. It reads the plan. Every item looks equally trustworthy. So it goes and implements them. And about 30% of them turn out to be based on assumptions that were never checked.

I had this happen enough times to write the rules — the ones quoted at the top of this article. The closing paragraph of that section is the part that hits hardest:

The failure mode this prevents: drafting feels lower-stakes than implementing because no code is being changed, so the audit gets skipped. But the plan is the artifact other sessions trust — its quality matters as much as code quality. Treat plan-drafting with the same rigor as code-shipping.

Plans feel like sketches. They aren’t. They’re contracts you’re handing to a future stranger — another instance of Claude that won’t remember any of the context you had when you wrote it. Say “we should probably refactor the cache layer” without checking whether someone already did last week, and the next session will refactor it again. Now you have two cache layers, and one of them was perfectly fine.

The full Multi-Item Plans section, with commentary on each rule

Here’s the full text, exactly as it appears in my homelab CLAUDE.md:

When drafting a multi-item plan another Claude session will implement
(UX_POLISH, REFACTOR, MIGRATION docs, queued cleanup lists, TEST_PLAN
backlogs), these rules are non-negotiable:

1. Pre-audit each item against current code before writing it.
   Read+Grep yourself, or spawn parallel Explore subagents at "very
   thorough" depth. Skipping this turns minutes of audit into hours
   of cross-session conflict resolution.

2. Same-session contradiction sweep. Before finalizing a draft,
   scan recent CHANGELOG entries from this session. If your draft
   contains hedging language ("or click anyway," "except when N,"
   "unless"), stop — that's a tell you're contradicting something
   you just shipped.

3. Confidence labels per item. Each item carries an explicit tag:
   verified (read the code), plausible — needs audit before
   implementing, or speculative — discuss first. Uniform-confidence
   prose is dishonest when items vary in how thoroughly you checked.

4. Batch limit. Plans larger than 5 items must be drafted in
   batches of ≤3 items, each pre-audited, not dumped all at once.

5. First-conflict trigger. If a sibling session catches one
   conflict in a plan you wrote, audit the entire remaining plan
   immediately rather than discovering conflicts serially. Use
   parallel Explore agents.

6. Handoff briefing requirement. When passing a plan to another
   Claude session for implementation, the briefing must include:
   "audit each remaining item against current code before acting;
   report conflicts."

The failure mode this prevents: drafting feels lower-stakes than
implementing because no code is being changed, so the audit gets
skipped. But the plan is the artifact other sessions trust — its
quality matters as much as code quality. Treat plan-drafting with
the same rigor as code-shipping.

A quick gloss on each one — what failed, why the rule is shaped the way it is.

Rule 1 — pre-audit. Born from the most common failure mode: drafting a plan against an outdated mental model of the code. The plan reads fine in isolation. It contradicts last week’s commits. Reading and grepping (or spawning parallel Explore subagents) is cheap. Cross-session conflict resolution is not.

Rule 2 — contradiction sweep. This one’s a tell. The hedging words give it away. If a plan contains “or click anyway” or “except when N,” I just shipped something that broke an assumption and the plan is trying to paper over it. Stop, read the CHANGELOG, fix the plan or fix the code.

Rule 3 — confidence labels. This is the most important one and the easiest to skip. Three labels: verified, plausible — needs audit, speculative — discuss first. Once items are tagged, the next session reading the plan can prioritize correctly. Without tags, every item gets the same trust, which means the speculative ones get implemented first (they’re usually the easiest-sounding) and produce the most rework.

Rule 4 — batch limit. Plans of more than 5 items must be drafted in batches of ≤3, each pre-audited. Why? Because the audit step in rule 1 has a cost, and humans (and AIs) dodge that cost when the list gets long. The batch cap keeps every batch small enough that the audit actually happens on every item.

Rule 5 — first-conflict trigger. If one item in a plan turns out to be wrong, the rest are statistically suspect. Don’t fix the conflict and proceed; audit the entire remainder. One bad item means the audit step was probably skipped, which means the rest is also unaudited.

Rule 6 — handoff briefing. When you hand a plan to another session, the briefing has to say “audit each remaining item against current code before acting; report conflicts.” Without that explicit instruction, the receiving session trusts the plan at face value. Every Claude session starts with zero institutional knowledge. Tell it not to.

Plans aren’t the lightweight thing. They’re the contract. Treat them like code.
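Rules 2 through 4 are mechanical enough to lint. A sketch of what that linter might look like — the bracket-tag syntax and hedge-word list are my illustration, not anything Claude enforces natively:

```python
import re

# Lint a drafted plan before handoff: every item needs a confidence tag
# (rule 3), hedging language is a contradiction tell (rule 2), and batches
# stay small (rule 4). Tag syntax and word list are illustrative.

CONFIDENCE_TAGS = ("[verified]", "[plausible]", "[speculative]")
HEDGE_WORDS = re.compile(r"\b(unless|except when|or click anyway)\b", re.IGNORECASE)
BATCH_LIMIT = 3

def lint_plan(items):
    """Return a list of human-readable problems with a drafted plan batch."""
    problems = []
    if len(items) > BATCH_LIMIT:
        problems.append(f"batch of {len(items)} items exceeds limit of {BATCH_LIMIT}")
    for i, item in enumerate(items, 1):
        if not any(tag in item for tag in CONFIDENCE_TAGS):
            problems.append(f"item {i} has no confidence tag")
        if HEDGE_WORDS.search(item):
            problems.append(f"item {i} contains hedging language - possible contradiction")
    return problems

plan = [
    "[verified] Rename the nightly backup job to match the new host",
    "Refactor the cache layer, or click anyway if the test fails",
]
for problem in lint_plan(plan):
    print(problem)
```

The point isn't the code; it's that every rule in the playbook is checkable before the next session ever reads the plan.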


$ sanitize-review --strict

Don’t trust yourself at speed

There’s an article on this site about this one. The short version: AI-assisted content creation almost embedded fingerprints of my homelab — real hostnames, private IP addresses, internal domain names — into a public-facing article. Not because the AI was malicious. Because I was moving fast, and the AI was helpfully pulling context from the most relevant files it could find, and those files contained everything I didn’t want public.

I caught it. Barely.

Luck is not a security control.

So now there are rules.

They show up in three different CLAUDE.md files. The Articles workspace says it most plainly:

Every article must pass a sanitization review before it goes to WordPress. This is not optional and not a “for sensitive articles” thing — it runs on every single piece of content.

The sanitization review is its own skill — a separate file in ~/.claude/skills/ that activates automatically when I say things like “ready to publish,” “let’s ship this,” or “paste this into WordPress.” The skill auto-triggers on those phrases because by the time I’m saying them, I’m not in audit mode anymore. I’m in publish mode. The whole point is for the brake to engage when I’ve stopped pumping it myself.
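The trigger mechanism is just phrase matching over what I say. The actual skill is plain English in a markdown file; this sketch only shows the logic it describes, with an illustrative phrase list:

```python
# The "brake engages when I stop pumping it" idea, reduced to its mechanism:
# a phrase match over the user's message. Phrase list is illustrative.

PUBLISH_INTENT = ("ready to publish", "let's ship this", "paste this into wordpress")

def should_run_sanitization_review(message):
    """True if the message signals publish mode rather than audit mode."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in PUBLISH_INTENT)

print(should_run_sanitization_review("OK, ready to publish"))
print(should_run_sanitization_review("let's keep drafting"))
```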

The push-back-once rule

  • If the user says “it’s fine, just publish it” — name the specific risk once, then defer if they override.
  • Do not nag.
  • The point is to surface the question, not to block the user.

The skill is a backstop, not a gatekeeper. It exists to make sure the question gets asked. The other guardrails protect against the AI’s failure modes. This category protects against mine.


$ git branch --show-current

Don’t trust the first session

Every Claude Code session starts with zero memory of the previous one. The CLAUDE.md gets re-read; recent commits get scanned; the rest is a blank slate.

That blankness is dangerous. The first thing a fresh session does is reach for the most obvious solution to whatever I just asked. It hasn’t tripped over the wires yet. So my CLAUDE.md files have a recurring pattern: stop the session before it acts. Make it pause and consider what else is connected.

The pinball project’s git workflow rule is a perfect example. I wrote about that kiosk — the deployment pipeline lives or dies on two branches. development is staging. main is the live screen at the bar three hours away. So:

🚨 CRITICAL PRE-COMMIT SAFEGUARD 🚨

BEFORE making ANY commits: run git branch --show-current. Always verify you’re on the development branch. Never push directly to main. Always test on development first.

That’s there because once, in the early days, a commit went to main. Probably mine, probably Claude’s, doesn’t matter — it slid through and an untested change went straight to the kiosk. Now it’s the loudest section in the file. ALL CAPS. Sirens. The first thing any new session sees when it scrolls through.

The website CLAUDE.md has a different version of the same idea, called Ecosystem Awareness. Before any change, the new session has to ask:

What else might this affect?

And then run through a checklist: WordPress core, theme, child theme, database, wp-config, Cloudflare, GCP, SMTP, file permissions. Eight separate systems, all interconnected, where a change to one can silently break another.

The list grew over time. A theme tweak that broke email got SMTP added. A DNS change that took down a tunnel got Cloudflare added. A Cloudflare Access misconfiguration that silently killed analytics got… well, we’ll get to that one. The rule that closes the section is the most honest thing in that file:

The user would rather answer a question than debug a broken site.

Convergent evolution. Every single one of my CLAUDE.md files independently evolved a rule that says: ask. Don’t assume. When in doubt, stop and check. Not because I told them to. Because every single project ran into the same failure mode — Claude filling a small gap with a confident-sounding guess, that guess being slightly wrong, that wrongness compounding — and the fix in every case was the same. Different projects. Different domains. Same scar.

That convergence is the article in miniature: these aren’t preferences. They’re selection pressure.


$ tail -n 100 scars.log

The scar tissue

Quick hits.

Each of these is a real rule in a real CLAUDE.md, born from a specific failure I’d rather not repeat.

Auto-increment primary keys. Claude generated INSERT statements with score_id = 0 for the high scores table. The first row went in fine. Every row after that hit a duplicate key violation, because PostgreSQL was trying to assign the same auto-incremented value Claude had already hardcoded. The rule now appears three times in the pinball CLAUDE.md — once inline, once in a callout, once in its own dedicated section with a “wrong vs. correct” example. Once was not enough.
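The real table is PostgreSQL with a SERIAL column, but the failure reproduces in SQLite (where INTEGER PRIMARY KEY behaves analogously), which makes it easy to demonstrate in a self-contained sketch:

```python
import sqlite3

# Reproducing the auto-increment collision in SQLite. The real schema is
# PostgreSQL SERIAL; table and column names here mirror the story only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE high_scores (score_id INTEGER PRIMARY KEY, score INTEGER)")

# Wrong: hardcoding the key. The first insert works; the second collides.
db.execute("INSERT INTO high_scores (score_id, score) VALUES (0, 1200)")
try:
    db.execute("INSERT INTO high_scores (score_id, score) VALUES (0, 3400)")
except sqlite3.IntegrityError as exc:
    print(f"duplicate key: {exc}")

# Correct: omit the key column and let the database assign it.
db.execute("INSERT INTO high_scores (score) VALUES (3400)")
rows = db.execute("SELECT score_id, score FROM high_scores ORDER BY score_id").fetchall()
print(rows)
```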

Shell escaping with !. The exclamation point in bash has a special meaning (history expansion). When you pipe a command through SSH inside double quotes, it gets escaped as \!, which mangles anything that legitimately contains it — like API tokens for a backup server. The fix is heredocs with single quotes, or script files on the target, or $'string' syntax. The rule lives in the homelab CLAUDE.md because the Stack Overflow answer for this took me too long to find.
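One way to sidestep the quoting minefield entirely is to stop hand-writing double-quoted shell strings and let a library do the quoting. A sketch using Python's `shlex.quote`, with an obviously fake token:

```python
import shlex

# History expansion mangles "!" inside double quotes in an interactive shell.
# shlex.quote wraps values in single quotes, which bash never expands.
# The token below is an illustrative fake.
token = "backup!2024!secret"

safe = shlex.quote(token)
print(safe)

# Building a remote command this way keeps the "!" intact end to end:
remote_cmd = "curl -H " + shlex.quote("Authorization: Bearer " + token)
print(remote_cmd)
```

Same principle as the heredoc fix: get the value into single quotes by construction, not by manual escaping.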

The Kadence padding stack. On mobile, articles were rendering at 121 pixels of usable width — six characters per line. Five nested container divs, each adding 24-32px of horizontal padding. None obviously wrong. All of them combining to murder the layout. The fix was one media query. Finding it took an hour of DevTools archaeology. The full post-mortem — every container, every padding value — lives in the website CLAUDE.md so the next session doesn’t have to dig.

The Umami silent failure. Analytics for the blog stopped working. No errors. No alerts. Just a dashboard that showed “no traffic,” which I took to mean “nobody’s reading the blog.” Three weeks later I figured out what had actually happened: a Cloudflare Access policy had been applied to the analytics subdomain, and every single tracking request was being silently redirected to a login page before reaching the analytics server. The browser didn’t show an error. The dashboard didn’t show an error. The redirect just happened. Silent failures are the worst failures, and that sentence is now a section header.
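A silent redirect looks exactly like success unless something explicitly checks for it. A minimal classifier for what a health check could have flagged — status codes and a hypothetical login URL only, no network calls:

```python
# Classify an analytics-endpoint response. A 302 to a login page is the
# Cloudflare Access failure mode: no error anywhere, just a redirect.
# Endpoint and URL below are illustrative.

def diagnose(status_code, location=None):
    """Return a human-readable verdict for a tracking-endpoint response."""
    if status_code in (301, 302, 303, 307, 308):
        if location and "/login" in location:
            return "BLOCKED: redirected to a login page (access policy in front?)"
        return f"WARNING: unexpected redirect to {location}"
    if status_code == 200:
        return "OK"
    return f"ERROR: HTTP {status_code}"

# What the tracker was actually getting back, vs. what it should get:
print(diagnose(302, "https://analytics.example.com/login"))
print(diagnose(200))
```

Three weeks of "no traffic" versus one line of "BLOCKED" in a cron email.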

More scar tissue: token permissions, webhook 404s, VRRP plaintext auth

A few more rules, in quick-hit format. Each one is its own little post-mortem in someone’s CLAUDE.md, told just verbosely enough to spare the next session from learning it again.

Backup-server API token permissions. The backup server’s web UI lets you create API tokens with checkboxes for permissions. There’s a “Privilege Separation” toggle. Unchecking it is supposed to give the token the permissions of the user. It does not, exactly. Even with privilege separation off, the token still needs an explicit ACL entry for the path it’s trying to operate on. The UI says “this token has the user’s permissions”; the system enforces “this token has whatever ACLs are explicitly granted to it.” The token I created looked correct in every UI panel, and silently failed the moment it tried to read the actual datastore path. The rule in CLAUDE.md is now: every backup token needs an ACL entry, full stop. Trust the failure modes, not the dropdowns.

Workflow webhooks created via MCP. I use an automation tool that exposes webhook endpoints. When I create a workflow through the regular UI, the webhook gets a webhookId field automatically. When I create the same workflow programmatically through an MCP server, that field is sometimes missing. The workflow looks active. The endpoint returns 404. There’s no error message anywhere. The fix is dumb — open the workflow, toggle it off and on through the UI, the missing field gets populated. The rule: API-created resources don’t always match UI-created resources. If something is silently 404-ing, check whether it was created through the MCP shortcut and whether all the metadata fields the UI would have added are actually present.
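The rule is checkable before the 404 ever happens: compare an API-created workflow against the fields the UI would have populated. A sketch, with n8n-style field names taken from the story above but treated as illustrative:

```python
# Flag webhook nodes that are missing metadata the UI would have added.
# Field names (webhookId, path) follow the n8n-style example; illustrative.

REQUIRED_WEBHOOK_FIELDS = {"path", "webhookId"}

def missing_webhook_fields(workflow):
    """Return {node_name: [missing_fields]} for webhook nodes in a workflow dict."""
    problems = {}
    for node in workflow.get("nodes", []):
        if node.get("type") == "webhook":
            missing = REQUIRED_WEBHOOK_FIELDS - node.get("parameters", {}).keys()
            if missing:
                problems[node.get("name", "?")] = sorted(missing)
    return problems

workflow = {
    "nodes": [
        {"name": "Incoming scores", "type": "webhook",
         "parameters": {"path": "scores"}},  # webhookId never populated via MCP
    ]
}
print(missing_webhook_fields(workflow))
```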

VRRP/keepalived authentication. My DNS layer is highly available — two name servers behind a virtual IP, with keepalived deciding which one holds the VIP at any given moment. The default authentication mode is auth_type PASS, which transmits the auth password in plaintext on the local network. On a home network this is theoretical; on any environment with even slightly hostile WiFi, it’s a vector. The rule is in a planning doc rather than CLAUDE.md proper, because the fix is non-trivial (migrate to a tunneled control plane, probably WireGuard) — but the awareness lives in CLAUDE.md, so any session touching the HA setup at least knows what’s underneath it.

The pattern across all three: it’s not the primary failure mode that hurts. It’s the secondary one. The token isn’t broken — the ACL is missing. The webhook isn’t broken — a metadata field is missing. The HA isn’t broken — the auth on the protocol you didn’t think of is plaintext. CLAUDE.md exists to keep that secondary layer in front of the next session’s eyes.

Each of these is three or four sentences in a CLAUDE.md somewhere. They look small. Cumulatively, they’re most of the file.


$ :save

The :save command

When I finish a session — feature shipped, bug fixed, article drafted, anything — I type two characters: :save. Claude does a structured pass through every relevant doc in the project. Update CLAUDE.md if anything changed. Append to CHANGELOG.md with what was done. Update SERVER_SETUP.md if a server config changed. Update FUTURE_PLANS.md if items got finished or new ones got found.

And then it asks me three questions.

  • Are there any details from this session I should capture that I might have missed?
  • Did anything we discussed change your plans or priorities?
  • Should any of this be added to the parent HomeLab CLAUDE.md as well?

That’s how new guardrails get into CLAUDE.md. Not because I planned them. Because at the end of every session, I get prompted to ask “what did we just learn?” — and sometimes the answer is “we learned that auto-increment keys cause duplicate violations and I’d like to never relearn that.” So I write it down. And it goes into the file. And the next session reads it before doing anything.

The full :save workflow — every file it touches and every question it asks

When I type :save, Claude does a structured pass through five files. Five self-audit questions. Three user-facing questions. Then it commits to memory.

The five files it touches

CLAUDE.md

New server details, credentials, configuration changes, new known issues or gotchas, and the “Last Updated” date at the top. This is the long-term memory — the file the next session reads first.

CHANGELOG.md

A dated entry describing what was done, what issues came up, how they were resolved. The CHANGELOG is the receipts. CLAUDE.md says “this is how things work now”; CHANGELOG says “this is when and why it changed.”

SERVER_SETUP.md

If anything about server config changed: packages, settings, vhost rules. This is the rebuild-from-scratch reference. If the server burned down tomorrow, this file plus the backup script is what brings it back.

FUTURE_PLANS.md

Check off completed items. Add new items discovered during the session. Re-prioritize if needed. This is how today’s session feeds tomorrow’s roadmap without me having to remember.

Other docs

Reference docs, troubleshooting summaries, backup folders, anything else the session touched.

The five self-audit questions Claude runs against itself before saving

  • Did we change any server configuration? → SERVER_SETUP.md
  • Did we discover a new gotcha or issue? → Known Issues section, or a new reference doc
  • Did we complete a planned feature? → Check off in FUTURE_PLANS.md
  • Did we modify theme files? → Backup to backups/ folder
  • Is there anything unclear that should be documented? → Ask the user

The three questions it asks me out loud

  • “Are there any details from this session I should capture that I might have missed?”
  • “Did anything we discussed change your plans or priorities?”
  • “Should any of this be added to the parent HomeLab CLAUDE.md as well?”

That third question is the one that matters most. Most sessions are scoped to a single project, but the lessons aren’t always project-specific. Sometimes a debugging story I hit in the website project is going to bite me again in the pinball project. The question forces me to consider whether this lesson is parochial or universal — and if it’s universal, it gets promoted to the parent HomeLab CLAUDE.md, where every project sees it.

The power of :save isn’t the automation. Claude could update those files without prompting me. The power is the built-in pause. I just spent two hours deep in a problem. I’m tired. I want to close the laptop. The :save workflow stops me, runs me through the questions, and forces me to step back and ask “what did we actually learn?” That’s where guardrails come from. Not from planning sessions. From the slightly-annoying ritual at the end of every productive day.

Skip :save once and the lesson is gone. The next session re-encounters the same problem with no warning. That’s why it’s not optional.

:save is not a script.

There’s no save.py. No slash-command file. No compiled program, no API call, no automation framework. I checked. It’s lines 145–189 of Projects/TheDeLay_Website/CLAUDE.md — a section labeled ## The :save Command containing five required actions, five self-audit questions, and three user-facing questions. When I type :save, Claude reads the instructions written in plain English and follows them. That’s the whole mechanism.

The architecture, in four lines

  • The “command” is a document.
  • The “framework” is structured English.
  • There is no engine. The AI is the engine.
  • The CLAUDE.md is the program.

Which means I’ve been telling you the same story this whole article without saying it out loud: the entire thing — confidence labels, multi-item plan audits, sanitization reviews, ecosystem checklists, scar tissue post-mortems, push-back-once, ask-don’t-assume, never-commit-to-main — is plain text in version-controlled markdown files. Re-read on cue. Modified by appending more text when things break. That’s not a metaphor. That’s the literal architecture.


$ cat ~/CLAUDE.md | wc -l

The payoff

Your CLAUDE.md isn’t documentation. It’s part of the system — the rule layer, the memory of every prior failure, encoded so the AI doesn’t have to relearn it.

The AI doesn’t get smarter. You get more careful, and you write down what you learned, and you put it where the next session will read it before doing anything.

This connects to the rest of the AI series I’ve written. “How to Give Claude a Memory” is the entry point — the 15-minute version that gets you started. “Why I Treat My AI Context Like Infrastructure” is the conceptual framing. “I Taught Claude How to Coach Me” is what skill files look like when you push the same idea further. This article is the long view — six months of accumulated load on the same foundation.

If you’ve been using AI seriously and your context files are still tidy, you haven’t been using it long enough yet. Or you have, and you’re going to discover the failure modes the hard way the first time something quietly drifts and nobody catches it.

Start your CLAUDE.md. It’ll be 15 lines on day one. Check back in six months. I guarantee it’ll read like an incident response playbook.

That’s not a sign you’re doing something wrong.


That’s the sign it’s working.
