Context Engineering: How I Taught an AI to Coach Pinball

GoofyGoober walks up to the kiosk at Midwest Aleworks. He’s been ranked #1 in this month’s pinball league for two weeks straight, but the gap is shrinking. He opens the hamburger menu, taps Player Strategy Report, and stares at his name on the screen.

The page knows things. It knows he’s classified as “The Grinder” because he’s played 18 of the 19 machines in rotation. It knows he’s in defend_lead mode because he’s holding first place with six days left in the tournament. It knows that on the Rush machine, he’s currently sitting at rank 7 with 52 million — and that last month he hit 623 million on that same machine. It tells him, in plain English, that pushing past 92 million on Rush would net him +3 points and that the cluster of scores at ranks 5–7 is tight enough that one hot ball could move him several spots.

Nobody wrote that advice. No coach studied the tape. An n8n workflow woke up at 6 AM, asked an LLM, and cached the answer before anyone at the bar opened their eyes.

In the last article, I described the pipeline that gets pinball scores into a database without anyone touching a keyboard. This article is about what happens after that — the layer that turns a row of numbers into something a player actually wants to read. And the architectural lesson buried inside it, which is that good AI output is almost never about the prompt.


$ ls /features

What’s On the Page

GoofyGoober's Player Strategy Report page on the kiosk display, showing player stats, unplayed machines, the AI Strategy Advisor narrative, and machine history
The full Player Strategy Report for GoofyGoober. The AI Strategy Advisor section is the focus of this article.

The Player Strategy Report is a single page that lives off the same kiosk app I described in Building a Kiosk That Never Calls You at 2 AM. Hamburger menu, one tap, you’re in. The page shows:

  • An AI-generated coaching narrative at the top — the part that names specific machines and specific score targets.
  • Unplayed machines, ranked by ROI — what to play next if you want to climb fastest.
  • Score trends with spark-lines per machine, so you can see whether you’re improving, stalling, or backsliding.
  • Rivals — who’s directly above and below you, with head-to-head stats.
  • Consistency rating — a 1-to-5-star score derived from the coefficient of variation across your attempts on each machine.
  • A What-If calculator and a Path-to-Podium projection for players who want to play the math themselves.
The full feature tour

The AI narrative gets the headline treatment, but the rest of the report is doing real work too. Here’s the full inventory.

The What-If Calculator is the feature that turns the report from a passive read into an interactive tool. Pick any machine, type in a hypothetical score, and the page recomputes your rank on that machine, your point total, and your tournament position in real time. It auto-fills the next plausible rank improvement when you open it (so you don’t start from a blank field), and it has quick-action buttons for “match the leader,” “+10%,” and “+25%” so you can sanity-check ambitious goals without doing the math yourself. The projected change shows up in green if it moves you up the leaderboard, red if it doesn’t.
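The core of the recompute is simple enough to sketch. This is my own illustration, not the production code: given a machine's leaderboard sorted high to low, a hypothetical score slots in at the first rank it beats, and the quick-action buttons are just multipliers on the current score.

```python
def what_if_rank(leaderboard, hypothetical):
    """Return the rank (1-indexed) a hypothetical score would earn on a
    machine whose scores are sorted high-to-low. Ties go to the incumbent,
    since they posted the score first."""
    rank = 1
    for existing in leaderboard:
        if hypothetical > existing:
            break
        rank += 1
    return rank

def quick_actions(current_score, leader_score):
    """The three one-tap presets: match the leader, +10%, +25%."""
    return {
        "match_leader": leader_score,
        "+10%": int(current_score * 1.10),
        "+25%": int(current_score * 1.25),
    }
```

From there, the projected point change is just the scoring formula applied to the old and new ranks.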

Path to Podium is the same idea aimed at a different question: what’s the smallest possible point total that would put me in the top 3? It computes the point gap between you and 3rd place, then suggests the cheapest combination of rank improvements across your machines to close it. It’s not a guarantee — the leaderboard moves while you’re playing — but it’s a good “where do I focus tonight” answer.
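One plausible sketch of that search, greedy rather than provably optimal, and assuming each target carries the points_if_target and recommendation_score fields from the ai_context object:

```python
def path_to_podium(point_gap, targets):
    """Greedy pass: take the highest-recommendation targets until the
    point gap to 3rd place is covered. A sketch of the idea, not the
    production algorithm."""
    plan, covered = [], 0
    for t in sorted(targets, key=lambda t: t["recommendation_score"], reverse=True):
        if covered >= point_gap:
            break
        plan.append(t)
        covered += t["points_if_target"]
    return plan
```

The returned plan is the "where do I focus tonight" list; because the leaderboard keeps moving, it's a projection, not a promise.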

Rivalries show the players directly above and below you on the overall leaderboard, with head-to-head stats: how many machines you both play, how often you beat them, and the average margin. Click any rival’s name and a modal opens with the full machine-by-machine breakdown — every machine you both play, who’s ahead, by how much. This is where the report gets competitive in a way that a static leaderboard never will.

The Consistency Rating is a 1-to-5-star score per machine, derived from the coefficient of variation across your attempts. Five stars means your scores cluster tight; one star means you’re swinging wildly. It’s not a quality score — it’s a predictability score. A 1-star high-variance machine is exactly where a hot ball can change your rank dramatically. A 5-star low-variance machine is where you need genuine improvement, not luck.
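The rating is just the coefficient of variation bucketed into stars. A sketch with illustrative cut points (the production thresholds may differ):

```python
import statistics

def consistency_stars(scores):
    """Map the coefficient of variation of a player's scores on one
    machine to a 1-5 star predictability rating. The cut points here
    are illustrative guesses, not the production values."""
    if len(scores) < 2:
        return 3  # not enough data to judge; assume middle of the road
    cv = statistics.stdev(scores) / statistics.mean(scores)
    # Lower variation -> more stars (more predictable).
    if cv < 0.10:
        return 5
    if cv < 0.20:
        return 4
    if cv < 0.40:
        return 3
    if cv < 0.60:
        return 2
    return 1
```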

The Nemesis system is the petty, glorious capstone. It identifies the one player who beats you most often by margins close enough to feel personal — not the league champion who crushes everybody, but the specific opponent who keeps edging you out by 2 million points on Stranger Things. The report calls them out by name. There is, as far as I know, no in-app way to opt out of being someone’s nemesis, which feels right.

All of this runs on the same scores the ingestion pipeline collects. No additional data sources, no telemetry, no extra workflows. The What-If math, the consistency rating, the nemesis identification — all of it is SQL plus a bit of computation, served from the same Flask endpoints that power the kiosk, then handed to a language model with enough context to be useful.

That last part is the whole article.


$ cat scoring_system.txt

How Points Actually Work

Before any of this means anything, you need to understand how points work in this league.

Every machine has its own leaderboard. First place on a machine = 100 points. Second = 90. Third = 85. After that, points descend by one per rank: rank 4 = 84, rank 5 = 83, all the way down to zero at rank 88. A player’s tournament score is the sum of their best result on each machine they’ve played.

The interesting consequence: moving from rank 6 to rank 3 on one machine = +3 points. Same as moving from rank 6 to rank 5 on three machines. Not all improvements are equal. Some climbs are massively more efficient than others, and the right move depends on where the cluster of competitors sits, not just where you sit. If you’ve ever agonized over a fantasy football waiver wire pickup, you already know this math.

This is why pre-computed ROI matters — and why the AI becomes useful only after we hand it the math, not before.
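The scoring rule compresses to a few lines, and the equal-ROI example above falls straight out of it:

```python
def points_for_rank(rank):
    """League scoring: 1st = 100, 2nd = 90, then max(0, 88 - rank)."""
    if rank == 1:
        return 100
    if rank == 2:
        return 90
    return max(0, 88 - rank)

# One climb from rank 6 to rank 3 on a single machine...
gain_one_machine = points_for_rank(3) - points_for_rank(6)            # +3
# ...is worth the same as a one-rank climb on three machines.
gain_three_machines = 3 * (points_for_rank(5) - points_for_rank(6))   # +3
```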


$ grep -i "context" prompt-engineering.md

Context Engineering > Prompt Engineering

Everyone wants to talk about prompt engineering. The actual leverage is in what you feed the prompt.

Here’s the brittle version of this feature. You take the player’s name and rank, hand it to an LLM, and ask: “You’re a pinball coach. Give advice to GoofyGoober.” What you get back is the kind of generic motivational fluff that gets ignored within five seconds. “Keep practicing! Focus on consistency! Try to climb the leaderboard!” Useless.

Here’s the version that actually works. The API endpoint that powers the report doesn’t just return scores. When you call it with ?include_ai_context=true, it returns an ai_context object — a structured JSON blob of pre-computed fields that summarize, in machine-readable form, exactly what kind of player this is, what mode they’re in, what specific machines they should target, and what topics the AI should avoid mentioning because they’re already shown elsewhere on the page.

That’s it. That’s the trick. The LLM is doing the easy part — turning structured facts into English. The hard part — the interesting part — is figuring out what facts to compute in the first place.

I wrote about a similar pattern in I Taught Claude How to Coach Me (where the structured context is a personal coaching framework) and again in How to Give Claude a Memory (where the structured context is a persistent memory store). Different domains, same principle: the AI is downstream of the data architecture. Get the data architecture right and the model looks brilliant. Get it wrong and even the best model produces oatmeal.

The full ai_context object

Here is the real structure of the ai_context object returned by GET /api/player/<name>/strategy?include_ai_context=true. Comments in // are mine, not in the actual JSON:

{
  // Scoring formula. Hardcoded reference, not data, but the LLM needs it.
  "scoring_system": {
    "formula": "1st=100pts, 2nd=90pts, 3rd+=max(0, 88-rank)",
    "key_insight": "Moving from rank 6 to rank 4 on a machine = +2 points"
  },

  // Tournament timing context. Drives urgency_level which shapes tone.
  "tournament_context": {
    "name": "MAW April Leaderboard",
    "days_remaining": 6,
    "urgency_level": "high"           // low / moderate / high / critical
  },

  // Strategic mode + a one-line guidance string the LLM is told to honor.
  "player_mode": "defend_lead",
  "player_mode_guidance": "Monitor rivals' vulnerable positions - protect machines where #2 is close",

  // Archetype object: type, description, strategic_implication.
  "player_archetype": {
    "type": "The Grinder",
    "description": "Plays all machines (18/19), steady improvement",
    "strategic_implication": "Broad coverage strategy is working - keep optimizing"
  },

  // The big one. Top 5 high-ROI targets, sorted by recommendation_score.
  // Each entry has ~17 fields. This is what the LLM cites in the narrative.
  "high_value_targets": [
    {
      "machine_name": "Rush",
      "player_score": 52695040,
      "top_20_average": 60306629,
      "percent_below_average": 12.6,
      "current_rank_on_machine": 7,
      "beatable_positions": [4, 5, 6],
      "scores_to_beat": [92867250, 71234100, 58901000],
      "target_rank": 4,
      "points_if_target": 3,
      "volatility": "high",
      "volatility_tip": "High variance machine - a hot ball could jump you several spots",
      "cluster_alert": "Positions 5-7 are clustered within 5% of each other",
      "recommendation_score": 0.9,
      "trend": "consistent",
      "previous_best_score": 623900060,
      "previous_best_tournament": "MAW March",
      "has_previous_higher": true       // <-- drives the "you've done this before" line
    }
    // ... up to 4 more targets
  ],

  // Top 3 machines to skip. Reason text gets quoted by the LLM.
  "low_value_machines": [
    { "machine_name": "Dungeons & Dragons", "reason": "Already rank 1, score 18% above average. Only 0 point to gain.", "skip_reason": "diminishing_returns" }
  ],

  // Point math the LLM should NOT recompute.
  "point_gap_analysis": {
    "points_to_next_rank": null,         // null because they're #1
    "achievable_points_from_targets": 26,
    "realistic_rank_improvement": 0
  },

  // The negative-instruction list. Arguably the most important field.
  "avoid_these_topics": [
    "unplayed machines (shown at top of report)",
    "rival details (shown in rivals section)",
    "generic advice like 'practice more' or 'be consistent'"
  ]
}

Every field is pre-computed by the API. The LLM does not calculate point gains, does not check who beat whom, does not judge whether a 12.6% gap is “below average enough to mention.” All of that is settled before the prompt is even built. The model’s job is one thing: write English from this JSON.


$ tree subsystems/

Four Subsystems Build the Context

Diagram of the context enrichment pipeline: raw pinball scores in PostgreSQL flow through four enrichment subsystems (Player Archetypes, Player Mode, High-Value Targets, Tournament Urgency) into a structured ai_context object, which is fed to an LLM that produces the personalized coaching narrative.
Four subsystems compute the strategic intelligence; the LLM converts it into English.

The ai_context object is built by four small subsystems, each of which turns raw scores into a single piece of strategic intelligence.

Player Archetypes classifies the player based on machine diversity. Played 80%+ of the available machines? You’re “The Grinder” — broad coverage, steady improvement. Played 30% or fewer? You’re “The Specialist” — deep on a few favorites. Anywhere in the middle, you’re “The Balanced Player.” Three categories. About forty lines of code. The simplicity is the point — the AI doesn’t need a personality test, it needs a one-word handle.
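The classifier really is that small. A sketch using the 80% and 30% cut points above:

```python
def classify_archetype(machines_played, machines_total):
    """Diversity-based archetype: broad coverage, deep specialist,
    or somewhere in between."""
    coverage = machines_played / machines_total
    if coverage >= 0.8:
        return "The Grinder"
    if coverage <= 0.3:
        return "The Specialist"
    return "The Balanced Player"
```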

Player Mode is more situational. It looks at your current rank, the days remaining in the tournament, and your rank history, then picks a mode: defend_lead (you’re #1, protect what you have), crunch_time (≤5 days left, take only quick wins), growth_focus (you’re in the bottom half, every climb matters), optimization (default mid-pack, chase the best ROI), or new_player (no history, play everything). Each mode comes with a one-line strategic guidance string. The mode is what determines whether the narrative reads “go for the upset” or “play it safe and protect your top scores.”
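A sketch of the mode selection. The five modes are the ones above; the precedence order is my assumption, since a player can qualify for more than one at once:

```python
def pick_mode(rank, total_players, days_left, has_history):
    """Pick a strategic mode from rank, tournament timing, and history.
    Check order is illustrative, not the production logic."""
    if not has_history:
        return "new_player"      # no data: play everything
    if rank == 1:
        return "defend_lead"     # protect what you have
    if days_left <= 5:
        return "crunch_time"     # quick wins only
    if rank > total_players / 2:
        return "growth_focus"    # bottom half: every climb matters
    return "optimization"        # mid-pack default: chase best ROI
```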

High-Value Targets is the ROI engine. For every machine the player has touched, it computes a recommendation score from 0.0 to 1.0 based on five signals: how far below the top-20 average their score sits, how many points they’d gain at a realistic target rank, whether there’s a cluster of opponents within 5% (easier to leapfrog multiple players at once), how volatile the machine’s scores are (high coefficient of variation means a hot ball can jump you several spots), and whether their recent trend is improving. Anything scoring 0.4 or higher gets flagged as a high-value target. The top three go into the prompt.

How the recommendation score works

Each machine the player has touched gets a recommendation_score from 0.0 to 1.0. Anything ≥ 0.4 is flagged as a high-value target and is eligible to appear in the prompt. The score is built from five additive components:

Component               Bonus  Trigger
Below-average score     +0.3   percent_below_average > 10 (room to improve)
Significant point gain  +0.3   points_if_target ≥ 2 (worth the effort)
Cluster alert           +0.2   ≥2 adjacent ranks within 5% (leapfrog opportunity)
High volatility         +0.1   coefficient of variation > 40% (upset potential)
Improving trend         +0.1   recent scores trending up

Worked example: GoofyGoober on Rush

GoofyGoober is currently rank 7 on Rush with 52,695,040. The top-20 average on that machine is 60,306,629. Let’s walk the score:

  • Below-average score? 12.6% below the top-20 average. That’s > 10. +0.3.
  • Significant point gain? Beatable position is rank 4 (with the right score). Going from rank 7 to rank 4 = +3 points. That’s ≥ 2. +0.3.
  • Cluster alert? Yes — positions 5, 6, and 7 are all within 5% of each other. One above-average ball could jump him past two players at once. +0.2.
  • High volatility? Rush is a CV-through-the-roof machine — wide swings between hot and cold games. +0.1.
  • Improving trend? Trend is “consistent,” not “improving.” +0.0.

Total: 0.9. Easily over the 0.4 threshold, and high enough to land in the top 3 targets that get cited by name in the LLM prompt.

The opposite case is just as informative. A machine where GoofyGoober is already rank 1 and 18% above average gets shunted into low_value_machines with the reason "Already rank 1, score 18% above average. Only 0 point to gain." That string flows straight into the LLM’s “skip these” list, which is why the narrative correctly tells him not to spend reps on Dungeons & Dragons.

The whole calculation is about thirty lines of Python. The point isn’t that the formula is clever — it’s that the formula is fixed. The LLM never sees the raw inputs. It sees the verdict.
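For concreteness, here is the additive calculation as a sketch: the same five signals and bonuses, with field names matching the ai_context entries shown earlier.

```python
def recommendation_score(m):
    """Additive ROI score, 0.0-1.0, from pre-computed machine stats.
    Field names match the ai_context high_value_targets entries."""
    score = 0.0
    if m["percent_below_average"] > 10:
        score += 0.3   # room to improve
    if m["points_if_target"] >= 2:
        score += 0.3   # worth the effort
    if m.get("cluster_alert"):
        score += 0.2   # leapfrog opportunity
    if m["volatility"] == "high":
        score += 0.1   # upset potential
    if m["trend"] == "improving":
        score += 0.1
    return round(score, 1)
```

Run the Rush numbers from the worked example through it and you get the same 0.9.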

Tournament Urgency is the simplest of the four — it converts days-remaining into an urgency level: low, moderate, high, critical. This shapes the AI’s tone. With 12 days left, the narrative reads like a season-long strategy session. With 2 days left, it reads like a fire drill.
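It really is a one-function mapping. The four levels are the ones above; the exact day cut points here are illustrative (6 days remaining mapping to “high” matches the GoofyGoober example):

```python
def urgency_level(days_remaining):
    """Map days remaining in the tournament to an urgency level.
    Thresholds are illustrative, not the production values."""
    if days_remaining <= 2:
        return "critical"
    if days_remaining <= 7:
        return "high"
    if days_remaining <= 14:
        return "moderate"
    return "low"
```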

Why pre-compute the math instead of letting the model figure it out? Because LLMs are great at language and terrible at arithmetic. If you hand a model raw scores and tell it “find the best targets,” it will hallucinate point calculations with confident, professional-sounding wrongness. Pre-compute the conclusions, feed it the conclusions, let it write English. That’s the division of labor.

The output of all four subsystems gets folded into a single ai_context object. That object is the prompt’s payload.


$ systemctl status narrative-generator

The Daily Pipeline

Diagram of the n8n narrative-generation workflow that runs daily at 6 AM. It fetches the top 20 players, then loops through each one: fetch their strategy data with AI context, build a prompt, call the LLM, save the narrative to the database, wait 2 seconds, continue. After 20 iterations, 20 fresh narratives sit cached in the player_strategy_narratives table; total run time is about 2–3 minutes, and the LLM cost is effectively free on the free tier.
Seven nodes, one loop, ~2–3 minutes a day. Twenty fresh narratives cached before the bar opens.

The whole thing runs in n8n. One workflow, seven nodes, fires every morning at 6 AM Central:

Daily 6 AM Trigger

Cron schedule, fires once per day.

Get Top 20 Players

HTTP GET to /api/players/report-ready, returns the list of players to process.

Loop, once per player

Fetch strategy data with ?include_ai_context=true → build prompt (Code node) → call the LLM (~3–5s) → HTTP POST the narrative to /api/player/:name/narrative (upserts on (player_id, event_code)) → wait 2 seconds and continue.

The whole 20-player run finishes in about 2 to 3 minutes. By 6:03 AM every day, every top-20 player has a fresh, personalized narrative cached in the player_strategy_narratives table. When they walk up to the kiosk later that night and open their report, the page just reads the cached row — no LLM call at view time, no waiting, no failure surface in front of the user.

Cost: effectively zero. The free tier on the LLM API I’m using covers far more daily requests than this workflow makes. Even on the paid tier the math comes out to pennies a month for 20 narratives a day at a few hundred input tokens each.

The Build Prompt node, in full

The Build Prompt node is a Code node, not a hardcoded string. It pulls the ai_context object out of the previous HTTP node’s response and constructs a templated prompt at runtime. The template has eight sections, each mapped to specific ai_context fields:

You are a data-driven pinball strategy coach for Midwest Ale Works tournaments.

SCORING SYSTEM (memorize this):
- 1st place on machine = 100 points
- 2nd place = 90 points
- 3rd place = 85 points
- 4th+ = 88 - rank (so 4th=84, 5th=83, 6th=82, etc.)
- Rank 88+ = 0 points
- Total score = sum across all machines played

PLAYER PROFILE:
Name: {player_name}
Rank: #{current_rank} with {total_points} points
Games played: {games_played}/{total_machines}
Archetype: {archetype.type} — {archetype.description}
Strategic implication: {archetype.strategic_implication}

PLAYER MODE: {player_mode}
Guidance: {player_mode_guidance}

TOURNAMENT CONTEXT:
{tournament.name}
Days remaining: {tournament.days_remaining}
Urgency: {tournament.urgency_level}

CRITICAL — DO NOT MENTION THESE (the report already shows them):
- Unplayed machines — SHOWN AT TOP OF REPORT
- Rival head-to-head records — SHOWN IN RIVALS SECTION
- Generic advice ("practice more", "be consistent", "play more games")

HIGH-VALUE TARGETS (data-backed recommendations):
{for each of top 3 targets:}
  - {machine_name}:
    Your score: {score} | Top 20 avg: {avg} ({percent_below}% below avg)
    Your rank: {rank} | Beatable: ranks {positions} at scores {scores}
    Points if rank {target_rank}: +{points_if_target} points
    Volatility: {volatility} - {volatility_tip}
    PREVIOUS BEST: {previous_best_score} in {previous_best_tournament}
      (you've proven you can do better!)   <-- only if has_previous_higher
    CLUSTER ALERT: {cluster_alert}         <-- only if cluster_alert is set

LOW-VALUE MACHINES (skip these):
{for each of top 2:}
  - {machine_name}: {skip_reason}

POINT PATH:
- Points to catch next rival: {points_to_next_rank}
- Achievable from high-value targets: {achievable_points_from_targets}
- Realistic rank improvement: {realistic_rank_improvement} positions

OUTPUT FORMAT (follow exactly):
1. Opening paragraph (2-3 sentences): address player by name, acknowledge
   their rank/archetype, set the strategic context.
2. Machine recommendations as BULLET POINTS — game name FIRST, bolded.
   If has_previous_higher, mention it as motivation.
3. Closing summary (1-2 sentences): total potential point gain.

WRITE 150-400 WORDS FOLLOWING THIS EXACT FORMAT.

TONE: Sports analyst breaking down film — specific, numbers-driven,
actionable. Do NOT use emojis. Return ONLY the strategy message text,
no JSON or formatting.

A few things worth pulling out.

The “DO NOT MENTION” section is positive instruction by negation. The page already shows unplayed machines and rival head-to-head stats. Repeating them in the narrative would be redundant and would dilute the signal. The list is short and surgical: three items, all framed as “the user already saw this elsewhere.”

The output format is prescriptive on purpose. Opening paragraph, bullets with game name first, closing summary. A 150–400 word range. Without these constraints, LLMs default to either too-short (“Hit Rush harder! Good luck!”) or too-long (“Here are seventeen tips for improving your pinball play…”). The format lock keeps every player’s narrative comparable.

The “sports analyst breaking down film” tone instruction is doing more work than it looks like. That single line is the difference between a chipper coaching voice and a Tuesday-morning film-room voice. It’s why the narrative reads like ESPN and not like a wellness app.

The conditional inclusions matter. The PREVIOUS BEST line and the CLUSTER ALERT line both only render if the corresponding ai_context flag is true. The model never sees a placeholder, never sees an empty field, never has to decide whether to mention a previous tournament — it gets a clean bullet only when one is warranted. That’s how the “you crushed this with 623M last month” line ends up sounding so natural in the output. It’s not natural language understanding. It’s a conditional template.
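The real node is JavaScript inside n8n, but the conditional-rendering logic is simple enough to sketch in Python: the optional lines render only when the corresponding flag is set, so the model never sees a placeholder.

```python
def target_bullet(t):
    """Render one high-value-target bullet for the prompt. PREVIOUS BEST
    and CLUSTER ALERT lines appear only when the ai_context flags say
    they should. A sketch of the logic, not the production node."""
    lines = [
        f"- {t['machine_name']}:",
        f"  Points if rank {t['target_rank']}: +{t['points_if_target']} points",
    ]
    if t.get("has_previous_higher"):
        lines.append(
            f"  PREVIOUS BEST: {t['previous_best_score']:,} "
            f"in {t['previous_best_tournament']} (you've proven you can do better!)"
        )
    if t.get("cluster_alert"):
        lines.append(f"  CLUSTER ALERT: {t['cluster_alert']}")
    return "\n".join(lines)
```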

Everything the LLM produces is downstream of this prompt. Everything in the prompt is downstream of ai_context. Everything in ai_context is downstream of the four subsystems. That’s the whole stack.

A few decisions worth calling out:

  • Top 20 only, not the full league. Anyone past rank 20 isn’t realistically reading a daily strategy report — they’re playing for fun, not optimization. Cost and rate-limiting pressure drop the moment you set a sensible cutoff.
  • Daily refresh, not on-demand. Stale advice is worse than no advice, but generating fresh narratives at view time would be slow and would add a failure mode (LLM down = no report). 6 AM cache write means the 8 PM kiosk read is instant and reliable.
  • 24-hour TTL on cached narratives. Anything past that gets regenerated by the next morning’s run.
  • Code-node prompt construction, not hardcoded text. The prompt is built programmatically from ai_context so we can change the strategy logic in the API without touching the workflow.

$ diff before.txt after.txt

Before and After

Here’s what gets fed to the LLM, condensed (the real payload is JSON, but this is the shape of what the model sees):

Player: goofygoober
Rank: #1 of 27 · Points: 1,662 · Games: 18/19
Archetype: The Grinder
Mode: defend_lead — Monitor rivals' vulnerable positions
Tournament: 6 days remaining · Urgency: high

High-value targets:
  - Rush     · rank 7  · score 52,695,040  · top-20 avg 60,306,629 (12.6% below)
             · previous best: 623,900,060 (last tournament) — has_previous_higher: true
             · cluster_alert: ranks 5-7 within 5% · volatility: high
             · target rank 4 = +3 points
  - Alien    · rank 4  · 18,909,460 vs avg 34,927,669 (45.9% below)
             · previous best: 152,149,250 · target rank 1 = +16 points
  - Pulp Fiction · rank 5 · 1,499,180 vs avg 3,984,000 (62.4% below)
             · previous best: 7,021,090 · target rank 2 = +7 points

Avoid these topics: unplayed_machines, rival_details (already shown on report)

Output format: opening paragraph, bullet points (game name first, then advice),
closing summary. Tone: sports analyst breaking down film. 150-400 words.

And here’s what the LLM produces from that, verbatim from the production database:

Close-up of the AI Strategy Advisor section as rendered on the kiosk: bullets per machine with specific score targets, point gains, and previous-tournament references
The same narrative as it appears on the kiosk — rendered for the player, not the developer.

Alright, goofygoober, with only seven days left in the MAW April Leaderboard, we’re in crunch time. Your #1 position and Grinder archetype strategy have served you well across 18 machines. Now, it’s about extending that lead by targeting specific machines where significant point gains are within reach.

  • Rush — Your current Rank 7 (81 points) with 52,695,040 is well below your proven potential. A strong game pushing past 92,867,250 for Rank 4 would net you +3 crucial points (84 points). Remember, you crushed this machine with 623,900,060 in MAW March, proving you can hit that mark again. The cluster at ranks 5-7 means a hot ball here could make a big difference.
  • Alien — Currently at Rank 4 (84 points) with 18,909,460, this machine represents your biggest immediate upside. Aim for Rank 1 with a score of 95,255,720, which would deliver a massive +16 points (100 points). Your previous best of 152,149,250 in MAW March shows this is entirely achievable.
  • Pulp Fiction — Your Rank 5 (83 points) score of 1,499,180 is another prime target. Improving to Rank 2 with 12,812,981 would add +7 points (90 points). You’ve posted 7,021,090 on Pulp Fiction previously, indicating you have the skill to climb these ranks.

By focusing on these high-value targets, you have an achievable gain of 26 points, further solidifying your #1 position. Let’s close out the tournament strong and finish on top.

A few things worth noticing.

It mentions his 623M score on Rush from last tournament. That’s not the LLM remembering — language models do not “remember” anything between calls. That’s the has_previous_higher flag and the previous_best_score field, which the API computes by joining against the historical scores table and injects into the prompt. The model is reporting it back, not recalling it.

It mentions the cluster at ranks 5–7. That’s the cluster-detection step, also pre-computed.

It does not mention the unplayed machines, the rival head-to-head records, or the consistency ratings. Those are already shown elsewhere on the page. The avoid_these_topics field told the AI to skip them.

The “avoid_these_topics” field is the most underrated part of this design. Telling the AI what not to say is as important as telling it what to say. The report already shows unplayed machines and rival details — repeating them in the narrative would waste the reader’s attention and make the AI feel redundant. A tiny field, a big quality lift.

The narrative reads like a knowledgeable friend who’s been studying your play. It is, in fact, a structured-data pipeline plus a stateless function call.


$ uptime

The Payoff

Six days into this tournament, every top-20 player has a fresh narrative every morning before the bar opens. None of them know about recommendation scores or cluster detection or coefficient of variation. They just know that every morning, their report has new advice, and the advice references specific machines with specific score targets and specific previous results. That’s the whole product.

This is the third article in what’s becoming a recurring theme on this blog: AI doing useful things with real data. Not chatbot demos. Not toy projects. Real users, real numbers, real value. The Claude Coach skill was the first. Claude memory was the second. This is the third. The pattern is the same in all three: invest the engineering effort in the data layer, and the AI layer becomes almost embarrassingly thin.

It’s also a continuation of the hands-off philosophy from the ingestion pipeline article. Scores ingest themselves. Narratives generate themselves. Dennis doesn’t touch any of it. I don’t either, most days. The infrastructure runs, the players play, and at 6 AM every morning a small workflow does the equivalent of a sports analyst’s pre-game prep, twenty times in two minutes, for the cost of approximately nothing.

If you want to see it in action, walk up to the kiosk at Midwest Aleworks — or visit pinball.thedelay.com, open the hamburger menu, and tap Player Strategy Report. Pick any name from the dropdown. Read what an LLM had to say about them this morning, before anybody woke up.
