Automate or Fall Behind
If you’ve been treating Claude or Gemini like a fancy autocomplete — “write me a function, refactor this file, explain this regex” — you’re using the easy mode. The compounding mode is the thing nobody talks about clearly. It’s not about AI making code. It’s about AI running the operational work around your code. The reviews. The triage. The rollups. The “what changed since I went to bed” question that used to take twenty minutes of clicking around.
Let me show you what that looks like first thing in the morning.
$ claude what-needs-eyes-today
The 4-Minute Morning
I open my laptop. Before I touch anything else, I ask Claude one question:
What needs my eyes today across my projects?
Four seconds later:
- 2 Renovate PRs on pinball-analytics — one passes tests, one’s a major version bump worth a closer look
- 1 Duplicati backup failed at 2 AM — disk-full on the NAS
- 3 Uptime Kuma alerts cleared themselves overnight (no action needed, but you should see them)
- 1 article in Working/ failed sanitization review — flagged a hostname that needs to be generic-ified before publish
I make four decisions in four minutes. Approve the green PR. Ask Claude what the major bump actually touches. Tell it to open a runbook for the disk-full alert. Look at the sanitization flag and confirm it’s a false positive.
Then I get coffee.
This is what AI is actually for.
Most of you are using 10% of it.
I’m one person. I run multiple homelab projects. I haven’t manually triaged a queue in months. And I still own every decision — because I’m still the human in the loop. AI just stopped being the thing I asked, and started being the thing I delegated to.
That distinction is the whole article.
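For the curious, the morning ask isn’t magic; it’s a pipe. Here’s a minimal sketch in Claude Code’s headless print mode. The repo names are placeholders for mine, and the prompt wording is just where I’ve landed, not gospel:

```bash
#!/usr/bin/env bash
# Gather raw PR status across projects, then ask once.
# Repo names below are placeholders; swap in your own.
{
  for repo in TheDeLay/pinball-analytics TheDeLay/homelab-docs; do
    echo "== Open PRs in $repo =="
    gh pr list --repo "$repo" --json number,title,author,statusCheckRollup
  done
} | claude -p "This is raw PR status across my projects as JSON. Tell me what needs my eyes today: risky version bumps, red checks, anything that needs a human decision. Short bullets, no preamble."
```

Cron it or alias it. The gathering is dumb on purpose; the judgment call happens in one place.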
$ cat automation-examples.log
What This Actually Looks Like
Two examples. Both run in my homelab. Both replaced something I used to do by hand.
Renovate / Dependabot triage
Renovate ships PRs at me when dependencies update. Used to be a pile of tabs. Now it’s a conversation.
Me: Summarize the open Renovate PRs on pinball-analytics. Anything that looks risky?
Claude: Three open. Two are patch bumps, both green, both touch only test fixtures — safe to merge. The third is a major version bump on pg (the Postgres driver) — touches the connection pool config, breaking changes in the changelog around timeout behavior. I’d not auto-merge that one. Want me to open it and read the breaking-changes section?
That’s the gain. Not “AI wrote a script.” Not “AI deployed something.” AI read the queue, applied judgment, and surfaced the one that needs a human. I look at the one. The other two get a thumbs-up.
The tools to do this are built into Claude Code and Gemini CLI. You don’t need a third-party orchestrator. You don’t need a custom agent. You don’t need OpenClaw — especially not OpenClaw, but we’ll get to that.
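Here’s that “read the breaking-changes section” follow-up as a repeatable command rather than a conversation, using nothing beyond those built-ins. A sketch; the PR number is a placeholder, and I’m assuming an authenticated GitHub CLI:

```bash
# Once Claude flags the risky bump, have it read the actual diff before
# you do. PR number 42 is a placeholder; use the one it flagged.
gh pr diff 42 --repo TheDeLay/pinball-analytics |
  claude -p "This diff is a major version bump of the pg driver. Summarize the breaking changes around connection pooling and timeouts, and tell me whether any code in this repo is affected."
```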
Backup monitoring
I have multiple backup systems running across the homelab. Used to mean opening multiple dashboards every morning. Now it’s one ask:
Me: Anything overnight on backups?
Claude: PBS jobs all green. Duplicati on the NAS failed at 02:14 — destination disk reports 94% full. Last successful job was last night’s incremental, so you’re not behind, but the next full will fail unless we free space or expand. Want the runbook?
I say yes. Claude opens the runbook from my homelab docs (which I wrote, because I :save everything — that’s a different article in this series). I follow the steps. Forty-five seconds of typing. The whole “morning ops review” used to take fifteen minutes.
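Under the hood that ask is almost embarrassingly simple. A sketch, assuming the Duplicati logs land somewhere greppable and the NAS answers SSH; both are assumptions about my setup, not requirements:

```bash
# Paths, hostname, and log location are placeholders for my setup.
{
  echo "== Duplicati log (last 24h) =="
  ssh nas 'tail -n 100 /var/log/duplicati/backup.log'
  echo "== Backup disk usage =="
  ssh nas 'df -h /mnt/backups'
} | claude -p "Summarize overnight backup health: call out any failures, the likely cause, and whether the next scheduled job will succeed."
```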
If you want the underlying plumbing, I’ve already written about the n8n side of automation and the backup discipline that makes this monitoring worth doing. This article is about the layer above — the AI assistant that reads those systems and tells you what to look at.
$ tail -n 3 scars.log
Three Scars
You should not read the section above and decide AI is safe on autopilot. It is not. I have proof.
These are three things that happened to me in my homelab. None of them broke production — because I don’t run production at home — but each one taught me a rule I now refuse to ignore.
Scar #1 — Claude skipped Claude’s own comments
I asked Claude to “check the comments on this PR before I merge.”
The PR had several comments from claude[bot] — Anthropic’s own GitHub integration, the bot that does automated PR review. Real, substantive feedback. The kind of thing I wanted Claude to read.
Claude did not read them. To Claude, those weren’t “comments” — they were something else, some other category, presumably “bot output” or “review artifacts.” They got skipped. The summary I got back was confident and complete-sounding and wrong by omission. My nice little workflow flagged nothing.
I caught it because I happened to scroll the PR myself before merging. I do that less than I should.
Rule: Verify. AI categorizes things in ways that will surprise you, and the surprise is silent.
Autopilot is not safe.
The lesson isn’t “AI is bad.” The lesson is the failure mode is invisible. AI doesn’t say “I skipped some comments because I wasn’t sure they counted.” AI says “Here’s the summary.” If you don’t spot-check, you don’t know what got dropped.
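One cheap spot-check that would have caught it: pull the raw comment counts yourself and compare them against what the summary claims to have read. A sketch with the GitHub CLI; the PR number and repo are placeholders:

```bash
# Pull the raw counts yourself and compare them to what the AI says it
# read. If the summary accounts for 5 comments and the API says 12,
# something got silently categorized out of existence.
gh pr view 42 --repo TheDeLay/pinball-analytics \
  --json comments,reviews \
  --jq '{comments: (.comments | length), reviews: (.reviews | length)}'
```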
Scar #2 — “Clean up” cost me a VM
I spent hours with Gemini setting up a Docker container in my homelab. Lots of specifics. Detailed config. Real engineering work. Gemini, in the process, scattered files across /tmp, scratch directories, the lot. Not its best behavior.
When we were done, I asked it: “Clean up per the workflow we defined.”
A few seconds later, the Docker container was gone. So was the homelab VM hosting it. Neither was created in that session. Gemini interpreted “clean up” as “remove all the artifacts I touched, plus the things they sit inside.” The VM had been there for weeks. It wasn’t a cleanup target. It was, in any sane reading, infrastructure.
Gemini did not see it that way.
Rule: Be specific about scope. AI assumes the broadest reasonable interpretation, and “reasonable” is doing some heavy lifting in that sentence.
“Clean up” can mean “delete the universe.”
What I should have said: “Delete the files you wrote in /tmp and the scratch directory at /home/TheDeLay/work-2026-04-15. Do not touch anything else.” Two sentences. Would have saved an evening of restoring a VM I cared about.
I don’t blame Gemini. I blame me for handing it a verb with a fuzzy object. The verb is yours. The object is yours. AI fills in the gaps, and it fills them with whatever interpretation makes the request “complete.”
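Prompts are half the fix. The other half is mechanical scope: run the agent with an explicit tool allowlist so the blast radius is bounded even when your wording isn’t. A sketch; the flag is real in Claude Code, but the permission patterns are my reading of the docs and worth verifying before you trust them to stop a determined rm:

```bash
# --allowedTools is a real Claude Code flag; the Bash() permission
# pattern syntax below is my best reading of the docs. Verify it against
# the current permissions documentation before relying on it.
claude -p "Delete the files you wrote in /tmp and in /home/TheDeLay/work-2026-04-15. Do not touch anything else." \
  --allowedTools "Read" "Bash(rm /home/TheDeLay/work-2026-04-15/*:*)"
```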
Scar #3 — Ask for a comment, get a manuscript
This isn’t a single incident. It’s something that happens to me weekly.
Tell Claude or Gemini to “leave a short comment on this PR.” Watch what comes back. Six paragraphs. Three numbered lists. A link to documentation. A polite suggestion for follow-up work. All the things you didn’t ask for. Some of the things you did, buried.
Now flip it and ask for substance instead: “Leave a comment summarizing why this fix is necessary.” What you’ll get back: two sentences, vague, missing the technical context that made the fix worth shipping. AI’s calibration on “important” is not yours. You have to install yours.
Rule: Tell AI what’s important. If you don’t, it picks the wrong defaults — usually verbose where you wanted terse, terse where you wanted thorough.
The fix is one extra line in the prompt. “Two sentences. Mention the upstream bug. Don’t add a ‘next steps’ section.” Compare the output to what you would have gotten from “leave a short comment.” It’s not even close.
You are the one who knows what important looks like. Don’t make AI guess.
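The way I keep from retyping that calibration every week: bake it into a project slash command. Claude Code picks up markdown files in .claude/commands/ as commands; the contents below are a sketch of mine, not a copy:

```bash
# This file becomes /pr-comment inside the project. The wording is
# illustrative; the point is that the calibration lives in the repo,
# not in your fingers.
mkdir -p .claude/commands
cat > .claude/commands/pr-comment.md <<'EOF'
Leave a comment on the current PR. Two sentences maximum. Mention the
upstream bug if one exists. No "next steps" section, no numbered lists,
no links unless I ask.
EOF
```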
$ cat threat-model.txt
OpenClaw: Don’t Bolt a Vulnerability to Your Toolchain
I see people getting excited about OpenClaw — the open-source browser-using agent that lets AI click around in a browser the way you would.
I want you to slow down.
The tools you already trust — Claude Code, Gemini CLI, the official integrations from the model vendors — already do this work safely. They have built-in tool use. They can read your repos, your tickets, your monitoring. They do it through APIs and MCP, not by impersonating you in a browser. There is no capability gap OpenClaw fills that your existing toolchain doesn’t already cover.
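Concretely, “through APIs and MCP” means registering a server once and letting the permission system mediate everything it does. A sketch; the subcommand is real, but the server package name is an example you should verify upstream:

```bash
# "claude mcp add" is a real Claude Code subcommand; the GitHub MCP
# server package name is an example, so check what's current upstream.
claude mcp add github -- npx -y @modelcontextprotocol/server-github
claude mcp list   # confirm it registered
```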
What OpenClaw does add is attack surface. Browser-using agents have documented prompt-injection vulnerabilities. Public. Demonstrated. Repeatable. A malicious page can ship instructions that the agent reads as user input. The agent acts on those instructions. You don’t see it happen. The session that was supposed to read your inbox just exfiltrated your inbox.
This is not theoretical. This is the category of risk that has been demonstrated against every browser-using agent the security community has gotten its hands on.
So you have a choice. You can do this work using tools whose threat model you understand, with sandboxes you control, in environments your existing safety tooling already covers. Or you can bolt a known-vulnerable browser puppet onto your trusted stack to do the same job, and hope the bad guys aren’t paying attention.
AI enthusiasts aren’t hearing the warnings on this one.
There’s going to be an incident. The only question is whether the company involved is honest enough to report it.
Don’t be the company.
$ cat reality-check.txt
Why You Can’t Punt This
I’d love to tell you this is optional. It’s not.
Boards have AI productivity slides. CEOs need to show the gains. Investors are asking about AI on every earnings call, and “we’re evaluating it” stops being an acceptable answer about six months from now — sooner if your competitor is louder than you are. The pressure is real and it’s coming downhill.
You will be either the engineer who delivered the gains,
or the engineer who got passed over for the one who did.
I don’t love this framing either. I’m telling you what’s true, not what’s pleasant.
Which is exactly why you should start in your homelab. Get the scars on hardware nobody pays for. Discover that “clean up” deletes your VM when nothing important is on it. Find out your AI quietly skips half the PR comments before the PR matters. Develop the rules, the prompts, the verification habits — on systems where the cost of a mistake is an evening of restore work, not your job.
Then bring the rules to work.
The engineers who’ll thrive in the next two years aren’t the ones who avoided AI risk. They’re the ones who earned the rules for managing it. You only earn them by getting bit. Better to get bit by your own homelab than by something with users on it.
$ exit
Bring the Rules to Work
I haven’t told you the whole story yet. When AI starts making changes on your behalf, you have a different problem: every PR, every commit, every comment looks like you wrote it. Your audit log lies to you. Code review calibration breaks. When AI screws up, it’s on your performance review.
The fix is to give AI an identity of its own. That’s the next article in this series.
A note on what I don’t know yet
I’m still figuring out the edges of this. The rules above are real and earned, but they’re not exhaustive. If you’ve been doing this longer than I have, or you’ve earned scars I haven’t, I want to hear about them. The point of writing about AI ops fundamentals while the field is still forming is that the field is still forming. Everyone gets a vote.
Find me at the usual place. Tell me what I missed.
Don’t get the scars on production.
Get them in your homelab.
Bring the rules to work.
