Claude Code Insights — Refresh

182 sessions across 5 project trees | 2026-03-14 to 2026-04-13 (30 days)

At a Glance

What's new since last report: The biggest shift is subagent delegation — 85 invocations in 30 days, where the previous window had effectively zero. Explore (48), Plan (21), and the project-specific bot-dev agent (6) now do the heavy lifting on research, design, and Telegram-bot edits while you steer at a higher altitude. Today you closed the window with a Gmail-to-vault email pipeline built end-to-end via two bot-dev sub-agent dispatches. Impressive Things You Did →

What's working: Your memory system is now load-bearing — ~/.claude/projects/-root/memory/ holds MEMORY.md plus 20+ project_*.md / feedback_*.md / tool_*.md notes that survive across sessions. You've codified six recurring corrections as feedback_*.md files (test persistence, duplication debt, adaptive-thinking config, Gemini model policy, API-vs-Max billing, Claude Code model policy) so each lesson is captured exactly once. New Usage Patterns →

What's hindering you: Long sessions that time out or hit cache cycling. The largest 2026-04-13 session is 1.4 MB (1,379 KB) of JSONL on its own, and three sessions in the window exceed 5 MB. Tool errors (251 in window) are dominated by misformatted edits and command failures. The original "wrong VPS" friction has eased thanks to ssh aliases (website, yaskawa, cloos-bot, qc-bot), but a new flavor has emerged: duplicated code drifting in parallel across multiple bot codebases (vault_index_file in 4 places). Where Things Go Wrong →

Quick wins to try: You already use subagents and skills heavily. The two you've barely touched are scheduled/triggered automation (cron jobs are still hand-rolled bash scripts, not Claude triggers) and worktrees for parallel branches on the bigger refactors like the pending vault_indexing.py extraction. Features to Try →

182 Sessions | ~58h Active Time | 691 User Msgs | 85 Subagent invocations | 30 Days | 124 MB Session Data

What You Work On

The project mix is more concentrated than the prior window. Telegram bot work dominates, with the email-ingest build (today) and the Cloos/Teqram bot WhatsApp pipelines being the heaviest single threads. The Yaskawa cobot has moved from setup to active monitoring. New: weld_viz 3D simulator iterations show up as a recurring side-quest.

Telegram Bot Core (main VPS) ~16 sessions

Continued evolution of the Voice/Claude Telegram bot at /root/develop/telegram_connection. Highlights: BGE-M3 embedding migration (384→1024-dim), worklog-aware vault chunking, hybrid memory search with company-name keyword boost, Media Pipeline v2 (Gemini 3 Flash photo/video/audio/PDF), self-heal v2, adaptive-thinking config fix. Today's session built the new email_ingest.py Gmail-to-vault pipeline as a self-contained module.

QC Vault Bot & Inspection Reports ~7 sessions

QC bot moved from main VPS to Cloos VPS, gained NDT report generation (.md + .pdf via fpdf2 + .xlsx via openpyxl with paired left/right photo evidence matching inspector style), per-photo ChromaDB indexing with defect/method metadata, Dutch TTS via gTTS, and active-project tracking with inline keyboards. DAVIT CRANES 25380 (Seasight, 224 photos) and germannbridge are the live test projects.

Cloos & Teqram WhatsApp Knowledge Bots ~6 sessions

Two more Telegram bots co-deployed on the Cloos VPS. Cloos bot ingested 4,326 WhatsApp messages from the welding-team channels into 1,763 ChromaDB docs. Teqram bot followed with 982 docs from the grinding service desk channel using Gemini Embedding 2 (768-dim). Video and document ingestion added mid-March. Three bots, one VPS, three separate venvs / configs / ChromaDBs — a multi-tenant pattern you've now stabilized.

Yaskawa Cobot (ROS 2 / MoveIt) ~4 sessions

Yaskawa VPS (16GB, 150GB disk) running ROS 2 Jazzy + MoveIt 2 + micro-ROS Agent for the HC10DTP cobot + YRC1000 controller. Phase 1 monitoring dashboard work and FPX multi-layer welding (31 passes) prep. Also re-ingestion experiments using Gemini 3 Flash on WhatsApp channels, paused at 3/8 done.

Website / Floorplan / Weld Viz ~5 sessions

Floorplan app for Iemants now production at tasks.arnereabel.com/floorplan/ behind nginx + Cloudflare tunnel. Weld Viz 3D welding physics simulator iterated from v2 to v18 (THREE.js + bloom + GLSL shaders) — 18 saved checkpoints on disk. Cloos Maintenance App debugged via the debugger subagent.

Infrastructure & Health Monitoring ~4 sessions

Health-check coverage expanded to all 5 VPSes. /root/health-check.sh runs every 5 min; daily /root/security-summary.sh at 08:00 UTC; on-demand /status command in the Telegram bot. UFW + fail2ban active. Hermes Gateway formally retired (disabled 2026-04-03 on QC, 2026-04-05 on main).

Top Tools Used (in window)

Bash: 1,886
Read: 521
Edit: 517
Grep: 143
Write: 135
WebFetch, TaskUpdate, Agent (Task): counts not captured

Subagent Mix (85 invocations) NEW

Explore: 48
Plan: 21
bot-dev: 6
general-purpose, debugger, vps-ops, claude-code-guide: per-agent counts not captured

Files Touched by Language

Python: 433
HTML: 326
Markdown: 257
Shell, CSS, JSON: counts not captured

Token Volume (in window)

Cache reads: 1.0B
Cache creation: 23.7M
Output: 2.1M
Input (uncached): 66.8K

99.9% of input is cache hits — your sessions reuse context efficiently.
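The cache-hit percentage depends on which token classes count as "input". The sketch below reruns the arithmetic from the Token Volume figures under two candidate definitions. Which one the report used is an assumption, and the helper name share is illustrative.

```python
# Token counts from the Token Volume panel (rounded as shown there).
cache_reads = 1.0e9
cache_creation = 23.7e6
uncached_input = 66.8e3


def share(part: float, *others: float) -> float:
    """Return part as a percentage of part plus the other terms."""
    return 100.0 * part / (part + sum(others))


# Definition 1: cache reads versus uncached input only.
reads_vs_uncached = share(cache_reads, uncached_input)

# Definition 2: cache reads versus all input, including cache writes.
reads_vs_all_input = share(cache_reads, cache_creation, uncached_input)

print(f"{reads_vs_uncached:.2f}% / {reads_vs_all_input:.2f}%")
```

Under the first definition the rate is above 99.9%. Counting cache creation as input brings it to roughly 97.7%, which is still a very high reuse rate.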

How You Use Claude Code

Compared to three weeks ago, your operating mode has shifted up a level of abstraction. The previous report described you as a hands-on builder who threw Claude into the deep end with ambitious multi-step goals and supervised closely. That's still true at the top — but underneath, you now delegate to subagents. 85 invocations in 30 days, with Explore (48) replacing the manual "Claude, find X across the filesystem" prompts you used to write yourself, and Plan (21) catching design mistakes before code gets written. The project-specific bot-dev subagent took two of today's three biggest sub-tasks: building the email pipeline scaffolding, then adding the image gate + live test once you'd approved the design.

The other big shift: memory is now infrastructure, not an experiment. ~/.claude/projects/-root/memory/ holds 20+ files indexed from MEMORY.md. Six of them are explicit feedback_*.md notes — corrections you made to Claude's behavior, written down so you don't have to make them again. feedback_persist_tests.md exists because three test suites were deleted "after they passed"; that lesson now lives in your soul. feedback_duplication_debt.md exists because vault_index_file drifted across four files and someone (you) noticed. The bot also writes back to memory autonomously via the SessionStop hook.

Mechanically: 1,886 Bash invocations still dominate, but the Bash:Edit ratio (3.6:1) is well below the prior window's 4.8:1, meaning you now edit more per Bash call. Today's email pipeline build is a 1.4 MB JSONL session; three sessions exceed 5 MB, and the largest in the window is 16 MB. You operate primarily in two windows: morning Brussels time (peak 9–11 UTC) and evening (peak 19–22 UTC). Median user response time is 72.5s: you're not babysitting, you're checking in.

Key pattern shift: You moved from "Claude as a tool I drive" to "Claude as a small team I direct" — Explore for research, Plan for design, bot-dev for implementation, with you holding the spec and rejecting bad turns.

User Response Time Distribution

2-10s, 10-30s: counts not captured
30s-1m: 128
1-2m: 141
2-5m: 121
5-15m: count not captured
>15m: 40

Median: 72.5s • Mean: 445.0s • The fat tail (40 responses > 15 min) is real-world interleaving — you let Claude run while you fly to other windows.

Multi-Clauding (Parallel Sessions)

22 Overlap Events | Sessions Involved (count not captured) | 44% Of Sessions

Multi-clauding more than quadrupled vs. the prior window (5 → 22 overlap events). 44% of your sessions now overlap with at least one other, a sign you've gotten comfortable spawning a side-session for unrelated lookups while a long build runs.
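For reference, here is one way such overlap numbers could be derived from session start and end times. This is a sketch under assumptions: it defines an overlap event as any pair of sessions whose time ranges intersect, which may differ from the report's own definition, and the demo sessions are invented.

```python
from datetime import datetime
from itertools import combinations


def overlap_stats(sessions):
    """sessions: list of (session_id, start, end) tuples with datetime bounds.

    Returns (overlap_events, sessions_involved), where an overlap event is a
    pair of sessions whose time ranges intersect.
    """
    events = 0
    involved = set()
    for (id_a, start_a, end_a), (id_b, start_b, end_b) in combinations(sessions, 2):
        # Strict comparison: sessions that merely touch end-to-start do not count.
        if start_a < end_b and start_b < end_a:
            events += 1
            involved.update((id_a, id_b))
    return events, len(involved)


# Hypothetical data: a long build with one side-session opened in the middle.
demo = [
    ("build", datetime(2026, 4, 13, 9, 0), datetime(2026, 4, 13, 11, 0)),
    ("lookup", datetime(2026, 4, 13, 9, 30), datetime(2026, 4, 13, 9, 45)),
    ("evening", datetime(2026, 4, 13, 19, 0), datetime(2026, 4, 13, 20, 0)),
]
events, involved = overlap_stats(demo)
print(events, involved, f"{100 * involved / len(demo):.0f}% of sessions")
```

The pairwise loop is quadratic, which is fine for a few hundred sessions.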

User Messages by Time of Day: chart not captured; peaks at 9–11 UTC and 19–22 UTC.

Session Size Distribution (KB)

Buckets: < 50 KB | 50–250 KB | 250 KB–1 MB | 1–5 MB | > 5 MB (3 sessions)

39 sessions in -root alone. Largest: 16 MB (April 4, mixed multi-VPS work).

Impressive Things You Did

182 sessions across 30 days. A handful are worth calling out for what they reveal about how you're operating now — especially today's email pipeline, which is the cleanest example yet of the new "spec → subagent → review → ship" workflow.

Today (2026-04-13) — Gmail-to-Vault Email Ingest, end-to-end in one session FLAGSHIP

Built email_ingest.py (~900 lines, self-contained — does not import bot.py) plus tests/test_email_ingest.py with 62 passing checks. Triggered on a claude-ingest[:company] subject keyword (you tried Gmail labels first, the UX failed first-contact testing, you switched mid-build). Includes an Outlook-forward parser that walks the deepest nested original sender, <mailto:..> cleanup, a 20 KB image gate (5 KB at first, bumped after a WindEurope banner snuck through), company override (cloos|yaskawa|teqram|iemants|general), body-SHA1 content dedup with a 4-doc backfill of legacy ChromaDB entries, and a DST-aware hourly cron that gates by TZ=Europe/Brussels date +%H at 08/12/18 BT. Failure tracking: 3 failed cycles → claude-ingest-failed label + Telegram alert.
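The Brussels-time gate can also be expressed in Python. This is a minimal sketch, assuming an hourly cron entry calls the script and the script itself decides whether to do any work. The names should_run and RUN_HOURS are illustrative and not taken from email_ingest.py.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

BRUSSELS = ZoneInfo("Europe/Brussels")
RUN_HOURS = {8, 12, 18}  # local hours named in the report


def should_run(now_utc: datetime) -> bool:
    """Return True when the Brussels wall-clock hour is one of RUN_HOURS."""
    return now_utc.astimezone(BRUSSELS).hour in RUN_HOURS


# 06:00 UTC is 08:00 in Brussels during summer time (CEST, UTC+2) ...
summer = datetime(2026, 4, 13, 6, 0, tzinfo=timezone.utc)
# ... but 07:00 in Brussels during winter time (CET, UTC+1).
winter = datetime(2026, 1, 13, 6, 0, tzinfo=timezone.utc)
print(should_run(summer), should_run(winter))
```

Gating inside the script keeps the crontab entry timezone-agnostic, so nothing has to be edited when the clocks change.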

What's notable isn't the feature — it's how you built it: two delegations to the bot-dev sub-agent (build → test → image-gate iteration), iterative course-corrections grounded in real-world testing rather than spec-perfection (label → subject, 5 KB → 20 KB, three follow-on body-cleanup fixes), and rollback discipline (vault + ChromaDB + Gmail labels snapshotted then restored before re-processing). The intentional duplication of the Gemini analyzer is documented for future unification in project_email_ingest_unification.md — debt logged, not hidden.

BGE-M3 embedding migration with worklog-aware chunking

Migrated the main bot's vault collection from MiniLM-L6-v2 (384-dim) to BGE-M3 (1024-dim) on April 3. Beyond the dimension change, you added worklog-aware chunking that splits on ## DD/MM/YYYY + ### Company headers and stores date / company / week_num as metadata, and a hybrid memory search with a +0.15 keyword boost when company names match in tags or content. Practical wins, not academic: vault_search now returns the right Yaskawa entry when you query "ring connections 25380" instead of the closest semantic neighbor.
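The chunking rule described above can be sketched as follows. This is not the bot's actual code: the function name, the choice of ISO week numbers for week_num, and the sample worklog text are all assumptions for illustration.

```python
import re
from datetime import datetime

DATE_HEADER = re.compile(r"^## (\d{2}/\d{2}/\d{4})\s*$")
COMPANY_HEADER = re.compile(r"^### (.+?)\s*$")


def chunk_worklog(text: str) -> list[dict]:
    """Split a worklog into one chunk per (date, company) section."""
    chunks, date, company, lines = [], None, None, []

    def flush():
        # Emit the buffered lines only if they sit under both headers.
        body = "\n".join(lines).strip()
        if date and company and body:
            day = datetime.strptime(date, "%d/%m/%Y")
            chunks.append({
                "text": body,
                "date": date,
                "company": company,
                "week_num": day.isocalendar()[1],  # ISO week; an assumption
            })
        lines.clear()

    for line in text.splitlines():
        if m := DATE_HEADER.match(line):
            flush()
            date, company = m.group(1), None
        elif m := COMPANY_HEADER.match(line):
            flush()
            company = m.group(1)
        else:
            lines.append(line)
    flush()
    return chunks


# Hypothetical worklog excerpt.
sample = """## 13/04/2026
### Yaskawa
Checked ring connections.
### Cloos
Torch calibration.
## 14/04/2026
### Yaskawa
Dashboard review.
"""
chunks = chunk_worklog(sample)
for chunk in chunks:
    print(chunk["date"], chunk["company"], chunk["week_num"])
```

Text that appears before the first date header, or between a date header and its first company header, is dropped in this sketch. A real indexer might attach it to a default company instead.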

Three bots, one VPS — multi-tenant Cloos box

Cloos VPS (4GB RAM) now hosts three independent Telegram bots: Cloos welding (1,763 ChromaDB docs from 8 WhatsApp channels), Teqram grinding (982 docs, Gemini Embedding 2 768-dim, with 14 ingested documents, among them 4 PDFs, 2 .DOC procedures, 1 .nin tool config, and 1 .ply 3D scan), and the QC Vault bot (~440 photo + worklog docs, NDT report generation in .md/.pdf/.xlsx). Three separate venvs, three configs, three ChromaDBs: a multi-tenant pattern you stabilized without containerizing.

QC Vault NDT reports matching inspector style

The qc_ndt_report tool now generates three artifacts in one call: Markdown summary, fpdf2 PDF with Liberation Sans + embedded photos, and an openpyxl XLSX from a real inspection-report template (in vault/excel_example/). The XLSX cross-references issues in section 7 to evidence in section 10 with paired left/right photos and inspector-style descriptions — close enough that gmail_send with attachments ships all three to the client without manual edits. Active on DAVIT CRANES 25380 (Seasight, 224 photos) and 27390 germannbridge.

Five-VPS health-check fabric

/root/health-check.sh runs every 5 minutes against all 5 VPSes (Main, QC, Website, Yaskawa, Cloos), auto-restarts failed services, and alerts via Telegram with a 30-min cooldown to avoid alert spam. Daily security summary at 08:00 UTC. Backup-age checks gated to once-daily at 09:00 UTC (30h threshold) so they don't false-fire on the 24h cron skew. UFW + fail2ban enabled. The original report's "wrong VPS" friction class has effectively gone away thanks to ssh aliases.
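The 30-minute cooldown is the piece that keeps a flapping service from flooding Telegram. The real check is a bash script, so the class below is only a Python sketch of the same idea; AlertGate and the service keys are hypothetical names.

```python
import time

COOLDOWN_SECONDS = 30 * 60  # 30-minute cooldown from the report


class AlertGate:
    """Suppress repeat alerts for the same key inside the cooldown window."""

    def __init__(self, cooldown: float = COOLDOWN_SECONDS, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._last_sent: dict[str, float] = {}

    def should_alert(self, key: str) -> bool:
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False  # still cooling down for this key
        self._last_sent[key] = now
        return True


# Demo with a fake clock; keys are hypothetical "vps:service" labels.
fake_now = [0.0]
gate = AlertGate(clock=lambda: fake_now[0])
first = gate.should_alert("cloos:bot.service")
fake_now[0] = 600.0  # ten minutes later
repeat = gate.should_alert("cloos:bot.service")
print(first, repeat)
```

A script launched from cron every 5 minutes starts as a fresh process each time, so a real version would need to persist the last-sent timestamps, for example in a small state file.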

Where Things Go Wrong

The friction classes have shifted. The old "wrong VPS" and "non-existent commands" issues have largely resolved (ssh aliases + a much fuller CLAUDE.md). New failure modes are about scale — duplicated code, oversized sessions, and lessons that need codifying because they keep being relearned.

Duplication debt across bot codebases NEW

You now run five bot processes across three VPSes that share architectural patterns but not code. vault_index_file exists in 4 separate copies; _prune_stale_* in 2. Today's email pipeline intentionally added a sixth Gemini-analyzer copy with documented unification trigger (project_email_ingest_unification.md) — debt logged, not hidden. The risk: a fix to one copy doesn't propagate.

- feedback_duplication_debt.md warns: "don't perpetuate while 'fixing'; unify names or extract first."
- The pending vault_indexing.py extraction (~1.5–2h, playbook in project_next_vault_indexing_extraction.md) collapses 4 copies of vault_index_file + 2 copies of _prune_stale_*.
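Before extracting, it helps to know which copies have actually diverged. The sketch below groups copies of a named function by a hash of their parsed structure. The function function_copies and the three demo files are hypothetical, not the real bot sources.

```python
import ast
import hashlib
from collections import defaultdict


def function_copies(sources: dict[str, str], names: set[str]) -> dict[str, dict[str, list[str]]]:
    """Map each watched function name to {body_hash: [paths]}.

    sources maps a file path to its Python source text.
    """
    found = defaultdict(lambda: defaultdict(list))
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name in names:
                # ast.dump ignores comments and formatting, so only real
                # code differences change the hash.
                digest = hashlib.sha1(ast.dump(node).encode()).hexdigest()[:8]
                found[node.name][digest].append(path)
    return {name: dict(variants) for name, variants in found.items()}


# Hypothetical copies: two identical apart from a comment, one that drifted.
demo_sources = {
    "bot_a.py": "def vault_index_file(p):\n    return p.strip()\n",
    "bot_b.py": "def vault_index_file(p):\n    # same logic\n    return p.strip()\n",
    "bot_c.py": "def vault_index_file(p):\n    return p.strip().lower()\n",
}
report = function_copies(demo_sources, {"vault_index_file"})
for digest, paths in report["vault_index_file"].items():
    print(digest, paths)
```

More than one hash under a name means the copies have drifted and need a manual diff before they can be collapsed into one module.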

Lessons that keep being relearned

Six of your memory files are feedback_*.md: explicit corrections to recurring Claude behavior. Each one is a lesson learned the hard way that cost real time before being codified. Notable: feedback_persist_tests.md (3 test suites deleted "after they passed"; the codified rule is now that tests go in tests/, always) and feedback_adaptive_thinking_config.md (the field is output_config.effort, not budget_tokens; verified against Anthropic docs 2026-04-05).

- feedback_claude_code_model.md: bots shelling out to claude CLI must use Opus 4.6 (Max plan, no API cost); Sonnet downgrade only applies to direct API calls.
- feedback_api_vs_max_billing.md: the main bot uses direct API (billed per token), NOT Max. Max only covers the claude CLI subprocess path.
- feedback_gemini_models.md: vision/content uses Gemini 3 Flash, never 2.5 Flash. Embeddings: Gemini Embedding 2 (768-dim).

Oversized sessions strain the cache

3 sessions in window exceed 5 MB of JSONL; the largest is 16 MB. Today's email pipeline session alone hit 1.4 MB. Long sessions show up as bigger response-time tails (40 responses > 15 min) and the occasional Anthropic overloaded_error. The 99.9% cache-hit rate softens the cost, but past ~5 MB you're paying real latency every turn.

- The prompt for this very report explicitly warned about overloaded_error from a prior attempt, likely a session-size symptom.
- 1.0B cache-read tokens in 30 days vs 23.7M cache-creation tokens: context is being recycled aggressively but not always at the right granularity.

Tool errors are now mostly Edit/Bash mismatches, not environment confusion

251 tool-result errors in the window. Composition has shifted: failed Edits (string-match failures because the file changed mid-session), Bash command_failed (exit codes from probe commands like systemctl status on a non-running service), and the occasional Gemini API 503. Almost no "wrong VPS" or "command not found" errors — those got designed out by the ssh aliases and CLAUDE.md inventory.

- Edit failures concentrate in long sessions where you've made manual edits between Claude turns. Read-then-Edit on every change would catch it, but you mostly accept the retry.
- The prior report's pain points (OAuth re-auth flow, sed escaping) no longer appear: the OAuth token flow lives in /root/.config/google/ as a one-time setup, and sed-over-SSH was replaced by writing files locally then scping.

Existing CC Features to Try

Suggested CLAUDE.md Additions

Just copy this into Claude Code to add it to your CLAUDE.md.

## Subagent Routing
- Default to Explore for any 'find / search the codebase' task before reading files yourself.
- Default to Plan before writing any feature larger than a single function.
- Use bot-dev for any change inside /root/develop/telegram_connection or the QC/Cloos/Teqram bot dirs.
- Use debugger for live-app investigation behind a URL or service log.
- vps-ops for any cross-VPS health/restart/disk action.

You already use these — codifying the routing will stop "should I delegate?" deliberation mid-session.

## Refactor Triggers
- When adding the Nth copy of an existing function (N >= 2), create a project_*_unification.md memory file before merging the change. Include: file paths of each copy, what diverged, and the unification trigger (e.g. 'when a 4th copy is needed').
- Pending: vault_indexing.py extraction — playbook in project_next_vault_indexing_extraction.md.

Today's email pipeline already follows this rule informally — write it down so it survives.

## Session Discipline
- If a session JSONL exceeds ~3 MB or 90 minutes, save a checkpoint memory and start a fresh session for the next milestone.
- Use /clear when switching projects mid-session, not when context is full — context-full switches are too late.

3 sessions in window exceeded 5 MB; cache hits don't fully save you past that size.
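A size check like the one in the Session Discipline rule is easy to automate. The sketch below lists JSONL files above a threshold. It assumes session transcripts are stored as *.jsonl files somewhere under ~/.claude/projects, and oversized_sessions is an illustrative name.

```python
from pathlib import Path

LIMIT_BYTES = 3 * 1024 * 1024  # the ~3 MB threshold suggested above


def oversized_sessions(root: Path, limit: int = LIMIT_BYTES) -> list[tuple[Path, int]]:
    """Return (path, size_in_bytes) for every *.jsonl under root above limit, largest first."""
    hits = [(p, p.stat().st_size) for p in root.rglob("*.jsonl") if p.is_file()]
    return sorted((h for h in hits if h[1] > limit), key=lambda h: h[1], reverse=True)
```

A call such as oversized_sessions(Path.home() / ".claude" / "projects") could run from the existing health check and flag sessions that are due for a checkpoint and a fresh start.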

You already use subagents, hooks, skills, and headless mode heavily. Here's what's still missing from your repertoire.

Triggers (scheduled remote agents)

Cron-like Claude Code agents that run on a schedule, no shell wrapper

Why for you: Your cron jobs (vault sync, ChromaDB backup, weekly summary, health check) are bash scripts that call other tools. Triggers let you schedule a Claude Code agent directly — useful for the "weekly worklog summary" job which currently shells out to a one-off Sonnet API call. Cleaner failure handling, output goes to your Telegram via the same hook chain.

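# Single quotes stop the shell from expanding anything when the trigger is created;
# the agent works out the ISO week number itself on each run.
claude trigger create --name weekly-summary \
  --cron "0 22 * * 0" \
  --prompt 'Run `date +%V` to get the current ISO week number WW. Read worklog_week_WW.md from /root/obsidian-vault, write a 5-bullet summary to Notes/summary_week_WW.md, and ping Telegram user 8594455361 with a one-line digest.'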

Worktrees for parallel branches

Spin up isolated working dirs for risky refactors without polluting the main checkout

Why for you: The pending vault_indexing.py extraction touches 4 files in /root/develop/telegram_connection. Doing it in a worktree lets you keep the main bot running on the current code while you build the extraction in isolation, then swap atomically. Same for the email pipeline → bot.py unification when that day comes.

cd /root/develop/telegram_connection
# -b creates the extract/vault-indexing branch; drop it if the branch already exists
git worktree add -b extract/vault-indexing ../telegram_connection-extract-vault-indexing
cd ../telegram_connection-extract-vault-indexing
claude  # run the extraction here without touching the live bot's checkout

Output Styles

Configure Claude's response shape per project (terse, structured, code-only)

Why for you: Your CLAUDE.md soul says "Arne prefers short answers" but Claude still occasionally over-explains. An output style enforces it at the harness level — useful for the bot codebases where you want diff-only responses on small fixes.

mkdir -p ~/.claude/output-styles && cat > ~/.claude/output-styles/terse.md <<'EOF'
You are working with Arne. Default to:
- One-line confirmations on small edits ("Done: bumped image gate to 20480.")
- No re-stating the request back
- No "Let me..." preambles
- Bullet lists for multi-step results, never paragraphs
EOF

MCP servers for the integrations you already wrote yourself

Expose Gmail / Calendar / Drive / Sheets / vault as MCP tools instead of bot-side tool functions

Why for you: Your Telegram bot has 30 tools — many of them (Gmail search/send, Calendar list/create, Drive upload, Sheets read/write, vault save/search) duplicate things that exist as MCP servers. Wrapping them once as MCP would give Claude Code direct access in your terminal sessions, not just inside the bot. The Gmail/Calendar MCPs already exist (they're in your deferred-tool list).

# Connect the Anthropic Gmail MCP (already deferred-available in your env):
claude mcp list                  # see what's already wired
claude mcp add claude_ai_Gmail   # one-time auth — then `gmail_send` works in any CC session

New Usage Patterns

Patterns that emerged in this window, ordered by impact.

Spec → bot-dev subagent → review → ship NEW

For Telegram-bot work, dispatch the implementation to the bot-dev subagent with a tight spec, then review and iterate.

Today's email pipeline used this twice in succession: build → review → "approve, also add image gate, then live test." The subagent owns the multi-file changes; you own the spec and the merge call. This is qualitatively different from the prior window's "drive every step" pattern. 6 bot-dev dispatches in 30 days.

Paste into Claude Code:

Use the bot-dev subagent. Build a new self-contained module at /root/develop/telegram_connection/<module>.py that does X / Y / Z. Constraints: do NOT import bot.py, write tests in tests/test_<module>.py before declaring done, log any intentional duplication of existing helpers in a project_*_unification.md memory file. Report back with a summary and file paths only.

Codify recurring corrections as feedback_*.md memory files NEW

When you correct Claude on the same thing twice, write a feedback_*.md note in ~/.claude/projects/-root/memory/.

Six exist now (test persistence, duplication debt, Gemini models, adaptive thinking, API-vs-Max billing, Claude Code model policy). The SessionStart hook injects them into context, so the correction sticks across sessions instead of being relearned. This is the single highest-leverage pattern in your repertoire — every entry pays back ~1h of future debugging.

Paste into Claude Code:

I just had to correct you twice on <topic>. Write a feedback_<topic>.md memory file in /root/.claude/projects/-root/memory/ with: 1) the wrong behavior, 2) the correct behavior with citation, 3) why this matters in our codebase. Then save it to ChromaDB with tags=feedback, --priority 5.
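For the injection side, a SessionStart hook can be a small script that prints the feedback notes. The sketch below assumes that whatever a SessionStart hook writes to stdout is added to the session context, and that the notes are plain Markdown files matching feedback_*.md. The function name is illustrative and this is not the hook you actually run.

```python
from pathlib import Path


def build_feedback_context(memory_dir: Path) -> str:
    """Concatenate every feedback_*.md file into one block of text."""
    parts = []
    for path in sorted(memory_dir.glob("feedback_*.md")):
        body = path.read_text(encoding="utf-8").strip()
        parts.append(f"## {path.stem}\n{body}")
    return "\n\n".join(parts)
```

The hook entry point would then be a single line: print(build_feedback_context(Path.home() / ".claude/projects/-root/memory")). Sorting the paths keeps the injected text stable from one session to the next, which also helps prompt caching.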

Snapshot before re-processing, dedup over delete NEW

Before any pipeline that mutates ChromaDB / vault / Gmail labels in bulk, snapshot the targets and dedup on content hash rather than deleting first.

Today's email pipeline snapshotted the vault, ChromaDB collection, and Gmail label state before re-processing, then used body-SHA1 (whitespace + case normalized) to dedup against the existing 4 docs. No data loss possible. This is now the implicit playbook for any ingestion change.

Paste into Claude Code:

Before we re-run the <pipeline>: snapshot the current state of <vault path>, dump the ChromaDB <collection> to a JSON, and save the current Gmail label list. Save all three to /tmp/snapshot-$(date +%s)/. Then run the pipeline with content-hash dedup against existing docs; don't delete first.
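The content-hash dedup at the heart of this pattern is small enough to show in full. This is a sketch of the normalization the report describes (whitespace and case normalized, then SHA-1). The function names and the in-memory set of seen hashes are assumptions; the pipeline itself checks against existing ChromaDB docs.

```python
import hashlib
import re


def body_sha1(body: str) -> str:
    """Hash an email body after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def is_duplicate(body: str, seen_hashes: set[str]) -> bool:
    """Return True if the body was seen before; otherwise record it."""
    digest = body_sha1(body)
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False


seen: set[str] = set()
print(is_duplicate("Hello  World\n", seen))  # first sighting
print(is_duplicate("hello world", seen))     # same content after normalization
```

Because re-runs skip anything whose hash is already stored, the pipeline can be replayed over the same mailbox without deleting existing docs first.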

SSH aliases as the substrate, not commands

All cross-VPS work is now ssh website / ssh yaskawa / ssh cloos-bot / ssh qc-bot — no IP addresses in prompts.

The prior report flagged "wrong VPS" as the #1 friction class. That class is gone. Aliases in ~/.ssh/config + a CLAUDE.md inventory mean Claude knows which box runs what. Side-effect: prompts are shorter and less error-prone. "ssh yaskawa to check the rosbot logs" beats "ssh <user>@<ip-address>".

Iterative course-correction over upfront spec perfection

Build, test against reality, change the spec, build the next layer.

Today's email pipeline shipped Gmail labels first; you tested it, the UX failed, you switched to subject keywords mid-build. The 5 KB image gate let a WindEurope banner through; you bumped it to 20 KB. None of this was on the original spec, none of it slowed you down. This isn't a new pattern but it's now the dominant one — and pairs naturally with the bot-dev subagent's tight build-test-report loop.
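The image gate is a single size comparison, and that is why tuning it took one edit. A minimal sketch, assuming the gate compares raw attachment bytes against the threshold. keep_image is an illustrative name and the 8,000-byte banner is invented.

```python
MIN_IMAGE_BYTES = 20 * 1024  # 20 KB gate; the first version used 5 KB


def keep_image(payload: bytes, min_bytes: int = MIN_IMAGE_BYTES) -> bool:
    """Keep attachments at or above the size gate; drop small banners and logos."""
    return len(payload) >= min_bytes


# A hypothetical 8,000-byte banner passes a 5 KB gate but not the 20 KB one.
banner = b"\x00" * 8000
print(keep_image(banner, min_bytes=5 * 1024), keep_image(banner))
```

Size is only a proxy for "decorative image", so the threshold stays a tunable default rather than a hard-coded constant.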

On the Horizon

You're past the "build the infrastructure" phase. Three concrete projects on the active list that the next 30 days could ship.

vault_indexing.py extraction (the first proper refactor)

vault_index_file lives in 4 files; _prune_stale_* in 2. You have a written playbook (project_next_vault_indexing_extraction.md), tests in place, and a ~1.5–2h estimate. This is the cleanest "Plan → Explore → bot-dev in a worktree" candidate in the repo. Doing it well sets the pattern for the email-pipeline → bot.py unification later.

Getting started: Open a worktree (git worktree add ../telegram_connection-extract-vault-indexing), dispatch Plan to draft the module API from the existing 4 callsites, then dispatch bot-dev to extract. Leave the playbook open in the chat as the success criterion.

Yaskawa cobot Phase 2 — full ROS 2 implementation

Phase 1 (monitoring + MoveIt) is in place on the 16 GB Yaskawa VPS. Phase 2 is the actual FPX multi-layer welding control — 31 passes, real cobot motion, micro-ROS Agent talking to the YRC1000. The CLAUDE.md flag is there ("Phase 2: full ROS 2 implementation"). This is the project that will most exercise the debugger subagent (real hardware, real failure modes, real logs).

Getting started: Dispatch the debugger subagent on the existing micro-ROS Agent logs to baseline the current behavior before adding the FPX motion layer. Keep a project_yaskawa_phase2.md running playbook so the inevitable hardware-related re-runs don't lose state.

Email pipeline → bot.py unification (when the time comes)

Today you intentionally duplicated the Gemini analyzer in email_ingest.py with a documented unification trigger in project_email_ingest_unification.md. When the next ingest variant lands (Slack? webhooks?), that trigger fires and you have a clean 3-way merge to do. The same worktree + Plan + bot-dev pattern as vault_indexing.py applies.

Getting started: Don't pre-merge — wait for the third caller. The cost of a 3-way merge is usually less than the cost of designing the wrong abstraction off two examples. Your existing feedback_duplication_debt.md says exactly this.

"Let's try Gmail labels." — "Gmail labels are clunky to apply on mobile." — "OK, subject-line trigger then." (mid-build, today)

The email pipeline's trigger mechanism flipped from labels to a claude-ingest[:company] subject keyword after one round of real-world testing on Arne's phone. The 5 KB image gate became 20 KB after a WindEurope banner snuck through. Three more body-cleanup fixes followed once forwarded Outlook chains showed their actual mess. Spec perfection is a myth; iterative course-correction is the actual workflow. The whole pipeline shipped, tests green, in one session.
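The subject-line trigger can be parsed with one regular expression. This sketch assumes the keyword may appear anywhere in the subject, is matched case-insensitively, and that an unrecognized company is treated as absent. None of those details are confirmed by the report, and parse_trigger is an illustrative name.

```python
import re

COMPANIES = {"cloos", "yaskawa", "teqram", "iemants", "general"}
TRIGGER = re.compile(r"\bclaude-ingest(?::(\w+))?", re.IGNORECASE)


def parse_trigger(subject: str):
    """Return (matched, company); company is None when absent or not recognized."""
    m = TRIGGER.search(subject)
    if not m:
        return False, None
    company = (m.group(1) or "").lower()
    return True, (company if company in COMPANIES else None)


print(parse_trigger("Fwd: torch quote claude-ingest:Cloos"))
print(parse_trigger("Lunch on Friday?"))
```

Typing a keyword into a subject line works from any mail client, which is the property that made it a better fit on mobile than labels.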