Claude Code Insights — Refresh

182 sessions across 5 project trees | 2026-03-14 to 2026-04-13 (30 days)

Refresh of report.html. Same shape, new window. The original covered 60 sessions over three weeks ending 2026-03-14. This one picks up the day after and runs to today. New patterns since the original are tagged NEW.
At a Glance
What's new since last report: The biggest shift is subagent delegation — 85 invocations in 30 days, where the previous window had effectively zero. Explore (48), Plan (21), and the project-specific bot-dev agent (6) now do the heavy lifting on research, design, and Telegram-bot edits while you steer at a higher altitude. Today you closed the window with a Gmail-to-vault email pipeline built end-to-end via two bot-dev sub-agent dispatches. Impressive Things You Did →
What's working: Your memory system is now load-bearing — ~/.claude/projects/-root/memory/ holds MEMORY.md plus 20+ project_*.md / feedback_*.md / tool_*.md notes that survive across sessions. You've codified six recurring corrections as feedback_*.md files (test persistence, duplication debt, adaptive-thinking config, Gemini model policy, API-vs-Max billing, Claude Code model policy) so each lesson is captured exactly once. New Usage Patterns →
What's hindering you: Long sessions that time out or hit cache cycling. Today's largest session (2026-04-13) is 1.4 MB (1,379 KB) of JSONL, and three sessions in the window exceed 5 MB. Tool errors are dominated by misformatted edits and command failures (251 in window). The original "wrong VPS" friction has eased thanks to ssh aliases (website, yaskawa, cloos-bot, qc-bot), but a new flavor has emerged: duplicated code drifting in parallel across multiple bot codebases (vault_index_file in 4 places). Where Things Go Wrong →
Quick wins to try: You already use subagents and skills heavily. The two you've barely touched are scheduled/triggered automation (cron jobs are still hand-rolled bash scripts, not Claude triggers) and worktrees for parallel branches on the bigger refactors like the pending vault_indexing.py extraction. Features to Try →
Sessions: 182
Active Time: ~58h
User Msgs: 691
Subagents: 85
Days: 30
Session Data: 124 MB

What You Work On

The project mix is more concentrated than the prior window. Telegram bot work dominates, with the email-ingest build (today) and the Cloos/Teqram bot WhatsApp pipelines being the heaviest single threads. The Yaskawa cobot has moved from setup to active monitoring. New: weld_viz 3D simulator iterations show up as a recurring side-quest.

Telegram Bot Core (main VPS) ~16 sessions
Continued evolution of the Voice/Claude Telegram bot at /root/develop/telegram_connection. Highlights: BGE-M3 embedding migration (384→1024-dim), worklog-aware vault chunking, hybrid memory search with company-name keyword boost, Media Pipeline v2 (Gemini 3 Flash photo/video/audio/PDF), self-heal v2, adaptive-thinking config fix. Today's session built the new email_ingest.py Gmail-to-vault pipeline as a self-contained module.
QC Vault Bot & Inspection Reports ~7 sessions
QC bot moved from main VPS to Cloos VPS, gained NDT report generation (.md + .pdf via fpdf2 + .xlsx via openpyxl with paired left/right photo evidence matching inspector style), per-photo ChromaDB indexing with defect/method metadata, Dutch TTS via gTTS, and active-project tracking with inline keyboards. DAVIT CRANES 25380 (Seasight, 224 photos) and germannbridge are the live test projects.
Cloos & Teqram WhatsApp Knowledge Bots ~6 sessions
Two more Telegram bots co-deployed on the Cloos VPS. Cloos bot ingested 4,326 WhatsApp messages from the welding-team channels into 1,763 ChromaDB docs. Teqram bot followed with 982 docs from the grinding service desk channel using Gemini Embedding 2 (768-dim). Video and document ingestion added mid-March. Three bots, one VPS, three separate venvs / configs / ChromaDBs — a multi-tenant pattern you've now stabilized.
Yaskawa Cobot (ROS 2 / MoveIt) ~4 sessions
Yaskawa VPS (16GB, 150GB disk) running ROS 2 Jazzy + MoveIt 2 + micro-ROS Agent for the HC10DTP cobot + YRC1000 controller. Phase 1 monitoring dashboard work and FPX multi-layer welding (31 passes) prep. Also re-ingestion experiments using Gemini 3 Flash on WhatsApp channels, paused at 3/8 done.
Website / Floorplan / Weld Viz ~5 sessions
Floorplan app for Iemants now production at tasks.arnereabel.com/floorplan/ behind nginx + Cloudflare tunnel. Weld Viz 3D welding physics simulator iterated from v2 to v18 (THREE.js + bloom + GLSL shaders) — 18 saved checkpoints on disk. Cloos Maintenance App debugged via the debugger subagent.
Infrastructure & Health Monitoring ~4 sessions
Health-check coverage expanded to all 5 VPSes. /root/health-check.sh runs every 5 min; daily /root/security-summary.sh at 08:00 UTC; on-demand /status command in the Telegram bot. UFW + fail2ban active. Hermes Gateway formally retired (disabled 2026-04-03 on QC, 2026-04-05 on main).
Top Tools Used (in window)
Bash: 1,886
Read: 521
Edit: 517
Grep: 143
Write: 135
WebFetch: 76
TaskUpdate: 41
Agent (Task): 36
Subagent Mix (85 invocations) NEW
Explore: 48
Plan: 21
general-purpose: 6
bot-dev: 6
debugger: 2
vps-ops: 1
claude-code-guide: 1
Files Touched by Language
Python: 433
HTML: 326
Markdown: 257
Shell: 68
CSS: 24
JSON: 17
Token Volume (in window)
Cache reads: 1.0B
Cache creation: 23.7M
Output: 2.1M
Input (uncached): 66.8K
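More than 99.9% of non-creation input is served from cache (1.0B cache reads vs. 66.8K uncached tokens); counting the 23.7M cache-creation tokens as misses, the rate is still about 98%. Your sessions reuse context efficiently.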
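The cache-hit figure depends on what counts as a miss. The sketch below recomputes it from the token volumes listed above under two definitions; it assumes "1.0B" means roughly 1,000,000,000 tokens (the report rounds it), so the results are approximate.

```python
# Token counts as reported above. "1.0B" is rounded, so treat results as approximate.
CACHE_READS = 1_000_000_000
CACHE_CREATION = 23_700_000
UNCACHED_INPUT = 66_800


def hit_rate(reads: int, misses: int) -> float:
    """Fraction of input tokens served from cache, given the tokens that were not."""
    return reads / (reads + misses)


# Definition A: only uncached input counts as a miss.
rate_excluding_creation = hit_rate(CACHE_READS, UNCACHED_INPUT)

# Definition B: cache-creation tokens also count as misses.
rate_including_creation = hit_rate(CACHE_READS, UNCACHED_INPUT + CACHE_CREATION)

print(f"excluding cache creation: {rate_excluding_creation:.2%}")
print(f"including cache creation: {rate_including_creation:.2%}")
```

Under definition A the rate is above 99.9%; under definition B it lands between 97% and 98%.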

How You Use Claude Code

Compared to the prior report, your operating mode has shifted up a level of abstraction. That report described you as a hands-on builder who threw Claude into the deep end with ambitious multi-step goals and supervised closely. That's still true at the top, but underneath you now delegate to subagents: 85 invocations in 30 days, with Explore (48) replacing the manual "Claude, find X across the filesystem" prompts you used to write yourself, and Plan (21) catching design mistakes before code gets written. The project-specific bot-dev subagent took two of today's three biggest sub-tasks: building the email pipeline scaffolding, then adding the image gate + live test once you'd approved the design.

The other big shift: memory is now infrastructure, not an experiment. ~/.claude/projects/-root/memory/ holds 20+ files indexed from MEMORY.md. Six of them are explicit feedback_*.md notes — corrections you made to Claude's behavior, written down so you don't have to make them again. feedback_persist_tests.md exists because three test suites were deleted "after they passed"; that lesson now lives in your soul. feedback_duplication_debt.md exists because vault_index_file drifted across four files and someone (you) noticed. The bot also writes back to memory autonomously via the SessionStop hook.

Mechanically: 1,886 Bash invocations still dominate, but the Bash:Edit ratio (1,886:517, about 3.6:1) is lower than the prior window's 4.8:1, meaning edits now make up a larger share of tool calls. Today's email pipeline build is 1.4 MB of JSONL; three sessions exceed 5 MB, and the largest is 16 MB. You operate primarily in two windows: morning Brussels time (peak hour 9–11 UTC) and evening (peak 19–22 UTC). Median user response time is 72.5s: you're not babysitting, you're checking in.

Key pattern shift: You moved from "Claude as a tool I drive" to "Claude as a small team I direct" — Explore for research, Plan for design, bot-dev for implementation, with you holding the spec and rejecting bad turns.
User Response Time Distribution
2-10s: 38
10-30s: 98
30s-1m: 128
1-2m: 141
2-5m: 121
5-15m: 56
>15m: 40
Median: 72.5s • Mean: 445.0s • The fat tail (40 responses > 15 min) is real-world interleaving — you let Claude run while you fly to other windows.
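As a sanity check on the reported median, the bucket counts above can be walked cumulatively to find which bucket holds the middle response. This is a minimal sketch using only the counts in the distribution; the function name is illustrative.

```python
# Bucket counts copied from the distribution above, in ascending order.
BUCKETS = [
    ("2-10s", 38),
    ("10-30s", 98),
    ("30s-1m", 128),
    ("1-2m", 141),
    ("2-5m", 121),
    ("5-15m", 56),
    (">15m", 40),
]


def median_bucket(buckets):
    """Return the label of the bucket that contains the median observation."""
    total = sum(count for _, count in buckets)
    midpoint = (total + 1) / 2  # 1-indexed rank of the median
    running = 0
    for label, count in buckets:
        running += count
        if running >= midpoint:
            return label
    raise ValueError("empty histogram")


print(median_bucket(BUCKETS))
```

The median falls in the 1-2m bucket, which is consistent with the reported 72.5s.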
Multi-Clauding (Parallel Sessions)
Overlap Events: 22
Sessions Involved: 17
Of Sessions: 44%

Multi-clauding more than quadrupled vs. the prior window (5 → 22 overlap events). 44% of your sessions now overlap with at least one other, a sign you've gotten comfortable spawning a side-session for unrelated lookups while a long build runs.
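The report does not define how an "overlap event" is counted. One plausible reading, assumed here, is a pair of sessions whose time intervals intersect. The sketch below counts such pairs and the distinct sessions involved; the input shape (id, start, end) is hypothetical.

```python
from datetime import datetime
from itertools import combinations


def count_overlaps(sessions):
    """Count overlapping session pairs and the distinct sessions involved.

    sessions: iterable of (session_id, start, end) tuples with start < end.
    Two sessions overlap when each one starts before the other ends.
    """
    events = 0
    involved = set()
    for (id_a, start_a, end_a), (id_b, start_b, end_b) in combinations(sessions, 2):
        if start_a < end_b and start_b < end_a:
            events += 1
            involved.update((id_a, id_b))
    return events, len(involved)
```

Pairwise comparison is quadratic in the number of sessions, which is fine at a few hundred sessions per window.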

User Messages by Time of Day
Session Size Distribution (KB)
< 50 KB: 9
50–250 KB: 15
250 KB–1 MB: 8
1–5 MB: 4
> 5 MB: 3
39 sessions in -root alone. Largest: 16 MB (April 4, mixed multi-VPS work).

Impressive Things You Did

182 sessions across 30 days. A handful are worth calling out for what they reveal about how you're operating now — especially today's email pipeline, which is the cleanest example yet of the new "spec → subagent → review → ship" workflow.

Today (2026-04-13) — Gmail-to-Vault Email Ingest, end-to-end in one session FLAGSHIP
Built email_ingest.py (~900 lines, self-contained — does not import bot.py) plus tests/test_email_ingest.py with 62 passing checks. Triggered on a claude-ingest[:company] subject keyword (you tried Gmail labels first, the UX failed first-contact testing, you switched mid-build). Includes an Outlook-forward parser that walks the deepest nested original sender, <mailto:..> cleanup, a 20 KB image gate (5 KB at first, bumped after a WindEurope banner snuck through), company override (cloos|yaskawa|teqram|iemants|general), body-SHA1 content dedup with a 4-doc backfill of legacy ChromaDB entries, and a DST-aware hourly cron that gates by TZ=Europe/Brussels date +%H at 08/12/18 BT. Failure tracking: 3 failed cycles → claude-ingest-failed label + Telegram alert.
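The trigger, image gate, and schedule gate described above can be sketched as three small functions. This is not the code in email_ingest.py: the function names, the regex, the handling of an unknown company, and the inclusive size comparison are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from typing import Optional, Tuple
from zoneinfo import ZoneInfo

COMPANIES = {"cloos", "yaskawa", "teqram", "iemants", "general"}
IMAGE_GATE_BYTES = 20 * 1024   # 20 KB gate from the text
RUN_HOURS = {8, 12, 18}        # Brussels wall-clock hours from the text
BRUSSELS = ZoneInfo("Europe/Brussels")

# "claude-ingest", optionally followed by ":company", anywhere in the subject.
_TRIGGER = re.compile(r"\bclaude-ingest(?::([a-z]+))?", re.IGNORECASE)


def parse_trigger(subject: str) -> Tuple[bool, Optional[str]]:
    """Return (triggered, company). Company is None when absent or unknown (assumed)."""
    match = _TRIGGER.search(subject)
    if not match:
        return False, None
    company = (match.group(1) or "").lower()
    return True, company if company in COMPANIES else None


def passes_image_gate(image_bytes: bytes) -> bool:
    """Keep images at or above the gate; smaller ones are treated as banners or logos."""
    return len(image_bytes) >= IMAGE_GATE_BYTES


def is_run_hour(now: datetime) -> bool:
    """True when a timezone-aware `now` falls in a run hour, Brussels local time."""
    return now.astimezone(BRUSSELS).hour in RUN_HOURS
```

Because the hour check converts to Europe/Brussels before comparing, an hourly cron in any server timezone fires the pipeline at the same local hours in both summer and winter time.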

What's notable isn't the feature — it's how you built it: two delegations to the bot-dev sub-agent (build → test → image-gate iteration), iterative course-corrections grounded in real-world testing rather than spec-perfection (label → subject, 5 KB → 20 KB, three follow-on body-cleanup fixes), and rollback discipline (vault + ChromaDB + Gmail labels snapshotted then restored before re-processing). The intentional duplication of the Gemini analyzer is documented for future unification in project_email_ingest_unification.md — debt logged, not hidden.
BGE-M3 embedding migration with worklog-aware chunking
Migrated the main bot's vault collection from MiniLM-L6-v2 (384-dim) to BGE-M3 (1024-dim) on April 3. Beyond the dimension change, you added worklog-aware chunking that splits on ## DD/MM/YYYY + ### Company headers and stores date / company / week_num as metadata, and a hybrid memory search with a +0.15 keyword boost when company names match in tags or content. Practical wins, not academic: vault_search now returns the right Yaskawa entry when you query "ring connections 25380" instead of the closest semantic neighbor.
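A minimal sketch of the two ideas above: splitting a worklog on its date and company headers, and adding a flat boost when a company name in the query also appears in a document. It is not the bot's implementation. The function names, the metadata date format, and the choice to drop text before the first date header are assumptions.

```python
import re
from datetime import datetime

DATE_HEADER = re.compile(r"^## (\d{2}/\d{2}/\d{4})\s*$")
COMPANY_HEADER = re.compile(r"^### (.+?)\s*$")


def chunk_worklog(text):
    """Split a worklog into chunks keyed by '## DD/MM/YYYY' and '### Company' headers."""
    chunks = []
    date = None
    company = None
    lines = []

    def flush():
        body = "\n".join(lines).strip()
        if body and date is not None:  # text before the first date header is dropped
            chunks.append({
                "text": body,
                "date": date.strftime("%Y-%m-%d"),
                "company": company,
                "week_num": date.isocalendar()[1],  # ISO week number
            })
        lines.clear()

    for line in text.splitlines():
        date_match = DATE_HEADER.match(line)
        company_match = COMPANY_HEADER.match(line)
        if date_match:
            flush()
            date = datetime.strptime(date_match.group(1), "%d/%m/%Y").date()
            company = None
        elif company_match:
            flush()
            company = company_match.group(1)
        else:
            lines.append(line)
    flush()
    return chunks


def boosted_score(similarity, query, doc_text, doc_tags, companies, boost=0.15):
    """Add a flat boost when a company named in the query appears in the doc's tags or text."""
    query_lower = query.lower()
    haystack = (doc_text + " " + " ".join(doc_tags)).lower()
    for name in companies:
        if name in query_lower and name in haystack:
            return similarity + boost
    return similarity
```

The boost is applied at most once per document, so a query naming two companies does not double-count.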
Three bots, one VPS — multi-tenant Cloos box
Cloos VPS (4GB RAM) now hosts three independent Telegram bots: Cloos welding (1,763 ChromaDB docs from 8 WhatsApp channels), Teqram grinding (982 docs, Gemini Embedding 2 768-dim, includes 14 ingested documents, among them 4 PDFs, 2 .DOC procedures, 1 .nin tool config, 1 .ply 3D scan), and the QC Vault bot (~440 photo + worklog docs, NDT report generation in .md/.pdf/.xlsx). Three separate venvs, three configs, three ChromaDBs: a multi-tenant pattern you stabilized without containerizing.
QC Vault NDT reports matching inspector style
The qc_ndt_report tool now generates three artifacts in one call: Markdown summary, fpdf2 PDF with Liberation Sans + embedded photos, and an openpyxl XLSX from a real inspection-report template (in vault/excel_example/). The XLSX cross-references issues in section 7 to evidence in section 10 with paired left/right photos and inspector-style descriptions — close enough that gmail_send with attachments ships all three to the client without manual edits. Active on DAVIT CRANES 25380 (Seasight, 224 photos) and 27390 germannbridge.
Five-VPS health-check fabric
/root/health-check.sh runs every 5 minutes against all 5 VPSes (Main, QC, Website, Yaskawa, Cloos), auto-restarts failed services, and alerts via Telegram with a 30-min cooldown to avoid alert spam. Daily security summary at 08:00 UTC. Backup-age checks gated to once-daily at 09:00 UTC (30h threshold) so they don't false-fire on the 24h cron skew. UFW + fail2ban enabled. The original report's "wrong VPS" friction class has effectively gone away thanks to ssh aliases.
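The real health check is a bash script. The Python sketch below only illustrates the two timing rules named above, the 30-minute alert cooldown and the 30-hour backup-age threshold. The function names and the in-memory dict of last-alert times are assumptions.

```python
from datetime import datetime, timedelta

ALERT_COOLDOWN = timedelta(minutes=30)
BACKUP_MAX_AGE = timedelta(hours=30)


def should_alert(service, now, last_alerts, cooldown=ALERT_COOLDOWN):
    """Return True and record the alert unless one was sent within the cooldown.

    last_alerts: dict mapping service name -> datetime of the last alert sent.
    """
    last = last_alerts.get(service)
    if last is not None and now - last < cooldown:
        return False
    last_alerts[service] = now
    return True


def backup_is_stale(last_backup, now, max_age=BACKUP_MAX_AGE):
    """True when the newest backup is older than the threshold."""
    return now - last_backup > max_age
```

A 30-hour threshold on a daily backup leaves six hours of slack, so a backup that finishes slightly late does not trigger an alert.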

Where Things Go Wrong

The friction classes have shifted. The old "wrong VPS" and "non-existent commands" issues have largely resolved (ssh aliases + a much fuller CLAUDE.md). New failure modes are about scale — duplicated code, oversized sessions, and lessons that need codifying because they keep being relearned.

Duplication debt across bot codebases NEW
You now run five bot processes across three VPSes that share architectural patterns but not code. vault_index_file exists in 4 separate copies; _prune_stale_* in 2. Today's email pipeline intentionally added a sixth Gemini-analyzer copy with documented unification trigger (project_email_ingest_unification.md) — debt logged, not hidden. The risk: a fix to one copy doesn't propagate.
  • feedback_duplication_debt.md warns: "don't perpetuate while 'fixing'; unify names or extract first."
  • The pending vault_indexing.py extraction (~1.5–2h, playbook in project_next_vault_indexing_extraction.md) collapses 4 copies of vault_index_file + 2 copies of _prune_stale_*.
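Before extracting, it can help to list which function names are defined in more than one file. The sketch below does this with the standard ast module. It compares names only, not bodies, and the helper name and the path-to-source input shape are assumptions.

```python
import ast
from collections import defaultdict


def find_duplicate_functions(sources):
    """Map each function name defined in more than one file to the files defining it.

    sources: dict of file path -> Python source text.
    Methods and nested functions are included, since ast.walk visits every node.
    """
    seen = defaultdict(set)
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                seen[node.name].add(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

To run it on a checkout, build the dict with something like `{str(p): p.read_text() for p in Path(repo).rglob("*.py")}`. A name match is only a hint, so diff the bodies before merging.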
Lessons that keep being relearned
Six of your memory files are feedback_*.md: explicit corrections to recurring Claude behavior. Each one is a lesson learned the hard way that cost real time before being codified. Two notable ones: feedback_persist_tests.md (3 test suites were deleted "after they passed"; the codified rule is now that tests go in tests/, always) and feedback_adaptive_thinking_config.md (the field is output_config.effort, not budget_tokens; verified against Anthropic docs 2026-04-05).
  • feedback_claude_code_model.md: bots shelling out to claude CLI must use Opus 4.6 (Max plan, no API cost); Sonnet downgrade only applies to direct API calls.
  • feedback_api_vs_max_billing.md: the main bot uses direct API (billed per token), NOT Max — Max only covers claude CLI subprocess path.
  • feedback_gemini_models.md: vision/content uses Gemini 3 Flash, never 2.5 Flash. Embeddings: Gemini Embedding 2 (768-dim).
Oversized sessions strain the cache
3 sessions in window exceed 5 MB of JSONL; the largest is 16 MB. Today's email pipeline session alone hit 1.4 MB. Long sessions show up as bigger response-time tails (40 responses > 15 min) and the occasional Anthropic overloaded_error. The 99.9% cache-hit rate softens the cost, but past ~5 MB you're paying real latency every turn.
  • The prompt for this very report explicitly warned about overloaded_error from a prior attempt — likely a session-size symptom.
  • 1.0B cache-read tokens in 30 days vs 23.7M cache-creation tokens — context is being recycled aggressively but not always at the right granularity.
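To spot sessions that have crossed the 5 MB mark, a short script can scan a directory tree for large .jsonl files. This sketch assumes session transcripts are stored as .jsonl files under the directory you pass in (for example ~/.claude/projects); the function name is illustrative.

```python
from pathlib import Path


def large_sessions(root, threshold_bytes=5 * 1024 * 1024):
    """Return (path, size_bytes) for .jsonl files under root above the threshold, largest first."""
    sized = [(p, p.stat().st_size) for p in Path(root).rglob("*.jsonl") if p.is_file()]
    over = [item for item in sized if item[1] > threshold_bytes]
    return sorted(over, key=lambda item: item[1], reverse=True)
```

Example use: `for path, size in large_sessions(Path.home() / ".claude" / "projects"): print(f"{size / 1e6:.1f} MB  {path}")`.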
Tool errors are now mostly Edit/Bash mismatches, not environment confusion
251 tool-result errors in the window. Composition has shifted: failed Edits (string-match failures because the file changed mid-session), Bash command_failed (exit codes from probe commands like systemctl status on a non-running service), and the occasional Gemini API 503. Almost no "wrong VPS" or "command not found" errors — those got designed out by the ssh aliases and CLAUDE.md inventory.
  • Edit failures concentrate in long sessions where you've made manual edits between Claude turns — Read-then-Edit on every change would catch it but you mostly accept the retry.
  • The prior report's pain points (OAuth re-auth flow, sed escaping) no longer appear — the OAuth token flow lives in /root/.config/google/ as a one-time setup; sed-over-SSH was replaced by writing files locally then scping.

Existing CC Features to Try

Suggested CLAUDE.md Additions

Just copy this into Claude Code to add it to your CLAUDE.md.

You already use these — codifying the routing will stop "should I delegate?" deliberation mid-session.
Today's email pipeline already follows this rule informally — write it down so it survives.
3 sessions in window exceeded 5 MB; cache hits don't fully save you past that size.

You already use subagents, hooks, skills, and headless mode heavily. Here's what's still missing from your repertoire.

Triggers (scheduled remote agents)
Cron-like Claude Code agents that run on a schedule, no shell wrapper
Why for you: Your cron jobs (vault sync, ChromaDB backup, weekly summary, health check) are bash scripts that call other tools. Triggers let you schedule a Claude Code agent directly — useful for the "weekly worklog summary" job which currently shells out to a one-off Sonnet API call. Cleaner failure handling, output goes to your Telegram via the same hook chain.
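# Note: inside double quotes the shell expands $(date +%V) when this command is run,
# so the week number is fixed at creation time rather than evaluated on each scheduled run.
claude trigger create --name weekly-summary \
  --cron "0 22 * * 0" \
  --prompt "Read worklog_week_$(date +%V).md from /root/obsidian-vault, write a 5-bullet summary to Notes/summary_week_$(date +%V).md, and ping Telegram user 8594455361 with a one-line digest."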
Worktrees for parallel branches
Spin up isolated working dirs for risky refactors without polluting the main checkout
Why for you: The pending vault_indexing.py extraction touches 4 files in /root/develop/telegram_connection. Doing it in a worktree lets you keep the main bot running on the current code while you build the extraction in isolation, then swap atomically. Same for the email pipeline → bot.py unification when that day comes.
cd /root/develop/telegram_connection
# -b creates the extract/vault-indexing branch; drop -b if the branch already exists
git worktree add -b extract/vault-indexing ../telegram_connection-extract-vault-indexing
cd ../telegram_connection-extract-vault-indexing
claude   # run the extraction here without touching the live bot's checkout
Output Styles
Configure Claude's response shape per project (terse, structured, code-only)
Why for you: Your CLAUDE.md soul says "Arne prefers short answers" but Claude still occasionally over-explains. An output style enforces it at the harness level — useful for the bot codebases where you want diff-only responses on small fixes.
mkdir -p ~/.claude/output-styles && cat > ~/.claude/output-styles/terse.md <<'EOF'
You are working with Arne. Default to:
- One-line confirmations on small edits ("Done: bumped image gate to 20480.")
- No re-stating the request back
- No "Let me..." preambles
- Bullet lists for multi-step results, never paragraphs
EOF
MCP servers for the integrations you already wrote yourself
Expose Gmail / Calendar / Drive / Sheets / vault as MCP tools instead of bot-side tool functions
Why for you: Your Telegram bot has 30 tools — many of them (Gmail search/send, Calendar list/create, Drive upload, Sheets read/write, vault save/search) duplicate things that exist as MCP servers. Wrapping them once as MCP would give Claude Code direct access in your terminal sessions, not just inside the bot. The Gmail/Calendar MCPs already exist (they're in your deferred-tool list).
# Connect the Anthropic Gmail MCP (already deferred-available in your env):
claude mcp list                  # see what's already wired
claude mcp add claude_ai_Gmail   # one-time auth; then `gmail_send` works in any CC session

New Usage Patterns

Patterns that emerged in this window, ordered by impact.

Spec → bot-dev subagent → review → ship NEW
For Telegram-bot work, dispatch the implementation to the bot-dev subagent with a tight spec, then review and iterate.
Today's email pipeline used this twice in succession: build → review → "approve, also add image gate, then live test." The subagent owns the multi-file changes; you own the spec and the merge call. This is qualitatively different from the prior window's "drive every step" pattern. 6 bot-dev dispatches in 30 days.
Paste into Claude Code:
Use the bot-dev subagent. Build a new self-contained module at /root/develop/telegram_connection/<module>.py that does X / Y / Z. Constraints: do NOT import bot.py, write tests in tests/test_<module>.py before declaring done, log any intentional duplication of existing helpers in a project_*_unification.md memory file. Report back with a summary and file paths only.
Codify recurring corrections as feedback_*.md memory files NEW
When you correct Claude on the same thing twice, write a feedback_*.md note in ~/.claude/projects/-root/memory/.
Six exist now (test persistence, duplication debt, Gemini models, adaptive thinking, API-vs-Max billing, Claude Code model policy). The SessionStart hook injects them into context, so the correction sticks across sessions instead of being relearned. This is the single highest-leverage pattern in your repertoire — every entry pays back ~1h of future debugging.
Paste into Claude Code:
I just had to correct you twice on <topic>. Write a feedback_<topic>.md memory file in /root/.claude/projects/-root/memory/ with: 1) the wrong behavior, 2) the correct behavior with citation, 3) why this matters in our codebase. Then save it to ChromaDB with tags=feedback, --priority 5.
Snapshot before re-processing, dedup over delete NEW
Before any pipeline that mutates ChromaDB / vault / Gmail labels in bulk, snapshot the targets and dedup on content hash rather than deleting first.
Today's email pipeline snapshotted the vault, ChromaDB collection, and Gmail label state before re-processing, then used body-SHA1 (whitespace + case normalized) to dedup against the existing 4 docs. No data loss possible. This is now the implicit playbook for any ingestion change.
Paste into Claude Code:
Before we re-run the <pipeline>: snapshot the current state of <target>, dump the ChromaDB to a JSON, and save the current Gmail label list. Save all three to /tmp/snapshot-$(date +%s)/. Then run the pipeline with content-hash dedup against existing docs; don't delete first.
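The content-hash dedup can be sketched as follows. The text says the body SHA-1 is whitespace and case normalized; the exact normalization used here (collapse runs of whitespace, strip, lowercase) and the function names are assumptions.

```python
import hashlib
import re


def body_fingerprint(body: str) -> str:
    """SHA-1 of the body after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def filter_new(bodies, existing_hashes):
    """Return bodies not already seen, adding their fingerprints to existing_hashes."""
    fresh = []
    for body in bodies:
        digest = body_fingerprint(body)
        if digest not in existing_hashes:
            existing_hashes.add(digest)
            fresh.append(body)
    return fresh
```

Seeding `existing_hashes` with fingerprints of the documents already in the collection is what makes a re-run safe: re-processed emails hash to a known value and are skipped, so nothing has to be deleted first.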
SSH aliases as the substrate, not commands
All cross-VPS work is now ssh website / ssh yaskawa / ssh cloos-bot / ssh qc-bot — no IP addresses in prompts.
The prior report flagged "wrong VPS" as the #1 friction class. That class is gone. Aliases in ~/.ssh/config + a CLAUDE.md inventory mean Claude knows which box runs what. Side-effect: prompts are shorter and less error-prone. "ssh yaskawa to check the rosbot logs" beats "ssh root@<ip-address>".
Iterative course-correction over upfront spec perfection
Build, test against reality, change the spec, build the next layer.
Today's email pipeline shipped Gmail labels first; you tested it, the UX failed, you switched to subject keywords mid-build. The 5 KB image gate let a WindEurope banner through; you bumped it to 20 KB. None of this was on the original spec, none of it slowed you down. This isn't a new pattern but it's now the dominant one — and pairs naturally with the bot-dev subagent's tight build-test-report loop.

On the Horizon

You're past the "build the infrastructure" phase. Three concrete projects on the active list that the next 30 days could ship.

vault_indexing.py extraction (the first proper refactor)
vault_index_file lives in 4 files; _prune_stale_* in 2. You have a written playbook (project_next_vault_indexing_extraction.md), tests in place, and a ~1.5–2h estimate. This is the cleanest "Plan → Explore → bot-dev in a worktree" candidate in the repo. Doing it well sets the pattern for the email-pipeline → bot.py unification later.
Getting started: Open a worktree (git worktree add ../telegram_connection-extract-vault-indexing), dispatch Plan to draft the module API from the existing 4 callsites, then dispatch bot-dev to extract. Keep the playbook open in the chat as the success criterion.
Yaskawa cobot Phase 2 — full ROS 2 implementation
Phase 1 (monitoring + MoveIt) is in place on the 16 GB Yaskawa VPS. Phase 2 is the actual FPX multi-layer welding control — 31 passes, real cobot motion, micro-ROS Agent talking to the YRC1000. The CLAUDE.md flag is there ("Phase 2: full ROS 2 implementation"). This is the project that will most exercise the debugger subagent (real hardware, real failure modes, real logs).
Getting started: Dispatch the debugger subagent on the existing micro-ROS Agent logs to baseline the current behavior before adding the FPX motion layer. Keep a project_yaskawa_phase2.md running playbook so the inevitable hardware-related re-runs don't lose state.
Email pipeline → bot.py unification (when the time comes)
Today you intentionally duplicated the Gemini analyzer in email_ingest.py with a documented unification trigger in project_email_ingest_unification.md. When the next ingest variant lands (Slack? webhooks?), that trigger fires and you have a clean 3-way merge to do. The same worktree + Plan + bot-dev pattern as vault_indexing.py applies.
Getting started: Don't pre-merge — wait for the third caller. The cost of a 3-way merge is usually less than the cost of designing the wrong abstraction off two examples. Your existing feedback_duplication_debt.md says exactly this.
"Let's try Gmail labels." — "Gmail labels are clunky to apply on mobile." — "OK, subject-line trigger then." (mid-build, today)
The email pipeline's trigger mechanism flipped from labels to a claude-ingest[:company] subject keyword after one round of real-world testing on Arne's phone. The 5 KB image gate became 20 KB after a WindEurope banner snuck through. Three more body-cleanup fixes followed once forwarded Outlook chains showed their actual mess. Spec perfection is a myth; iterative course-correction is the actual workflow. The whole pipeline shipped, tests green, in one session.