
Earworm
Tools
What worked
Earworm is the first Go project in the portfolio — Claude Code handled the language switch cleanly, producing a 17.6 MB CGo-free single binary with GoReleaser cross-compiling to darwin/linux amd64/arm64. The plan engine (draft → ready → apply lifecycle with SHA-256 cross-filesystem verification and audit logs) emerged in ~60 commits over 7 days post-v1.0 and grew the code volume ~66%. Most importantly: within 48 hours of shipping a feature I was using it on real data — 315+ entries in library cleanup CSVs, 18 Warhammer 40k folder flattens, Malazan + Beast Arises multi-book splits.
What broke
The audible-cli subprocess boundary is a clean license firewall, but its output format isn't formally versioned, so the wrapper has to be defensive; Claude's first cut was too trusting. Audible's rate-limit thresholds are undocumented, which meant conservative 5-30s delays that may be overly cautious, with no way to tune them without risking a ban. Daemon mode cannot auto-apply plans post-v1.1 (a deliberate safety trade-off for destructive ops), which reduces its unattended usefulness.
Roles
I defined the plan-engine philosophy — additive before destructive, dry-run as default, double-confirmation on destructive steps, CSV as the offline composition format. Claude Code wrote the state machine, the fileops package (fsync + SHA-256 verify), and the scanner/split detectors. The decision to use pure Go SQLite (modernc.org) to avoid CGo was mine; the scanner heuristics for multi-book detection were co-designed.
Earworm (Audiobook Library Manager)
Overview
Earworm is a Go-based CLI tool that started as a reliable, fault-tolerant Audible downloader (replacing Libation) and evolved through Apr 7-12 into a full library operations engine: deep scanning for structural issues, plan-based remediation with metadata carriage, idempotent file operations (flatten, split, cleanup), and auditable plan review/approval.
Core purpose (v1.0): Reliably download and organize Audible audiobooks into a local library with zero manual intervention.
Core purpose (v1.2+): Manage the full lifecycle of a self-hosted audiobook library — download, organize, detect structural issues, propose plans, review them, and apply fixes with audit trails and trash-safe deletes.
Target users: Audiobook collectors running self-hosted media servers (Audiobookshelf) who need reliable batch downloading and ongoing library hygiene for libraries that grew chaotically over years.
Key Features
Download & organize (v1.0 core):
- SQLite-backed library state tracking with persistent download state machine (enables crash recovery and resume)
- Audible library sync via audible-cli subprocess wrapper with typed interface and output parsing
- Fault-tolerant batch downloads with rate limiting, exponential backoff, per-book timeout, and crash recovery
- AAXC decryption via FFmpeg integration (AAXC -> M4B conversion with voucher parsing)
- Automatic organization into Libation-compatible folder structure (`Author/Title [ASIN]/`)
- Cross-filesystem file moves with copy-verify-delete pattern (local to NAS)
- Audiobookshelf integration — REST API client for library scan triggers after organization
- Goodreads CSV export for reading list sync
- Daemon/polling mode for unattended server operation
- Embedded Python venv bootstrapping for audible-cli dependency management
- Real-time progress with tqdm output parsing and terminal UI (charmbracelet/lipgloss)
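The `Author/Title [ASIN]/` layout listed above can be illustrated with a small path builder and its inverse. This is a minimal sketch, not Earworm's actual organize or scanner code: the names `BuildBookDir`, `ExtractASIN`, and `sanitize` are hypothetical, the sanitization rules are invented for illustration, and the 10-character uppercase alphanumeric ASIN pattern is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// asinSuffix matches a trailing "[ASIN]" tag on a folder name.
// Assumption: ASINs are 10 uppercase alphanumeric characters.
var asinSuffix = regexp.MustCompile(`\[([A-Z0-9]{10})\]$`)

// sanitize replaces characters that are awkward in folder names.
// The rule set here is hypothetical.
func sanitize(s string) string {
	r := strings.NewReplacer("/", "-", "\\", "-", ":", " -")
	return strings.TrimSpace(r.Replace(s))
}

// BuildBookDir returns "<root>/<Author>/<Title> [<ASIN>]".
func BuildBookDir(root, author, title, asin string) string {
	folder := fmt.Sprintf("%s [%s]", sanitize(title), asin)
	return filepath.Join(root, sanitize(author), folder)
}

// ExtractASIN recovers the ASIN from a book folder path, if one is present.
func ExtractASIN(dir string) (string, bool) {
	m := asinSuffix.FindStringSubmatch(filepath.Base(dir))
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// "B0EXAMPLE1" is a made-up placeholder ASIN.
	dir := BuildBookDir("/library", "Example Author", "Example Title", "B0EXAMPLE1")
	asin, ok := ExtractASIN(dir)
	fmt.Println(dir, asin, ok)
}
```

Keeping the ASIN in the folder name means a scanner can re-associate folders with database rows without reading any audio metadata.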
Library operations (phases 9-18.1, Apr 7-12):
- Deep scan — `earworm scan --deep` / `earworm scan issues` detects structural problems (nested folders, multi-book containers, missing metadata, naming inconsistencies) across an existing library
- Plan engine — `earworm plan {list,review,approve,apply,import}` provides a full draft → ready → apply lifecycle with dry-run as default, double confirmation on destructive steps, and audit logs (SHA-256 cross-filesystem verification, fsync, structured logging)
- Flatten operations — Collapse nested `slug_audio/01-Cover/02-Product/Mp3s/` structures into flat, Audiobookshelf-readable folders
- Split operations — `earworm split {detect,plan}` identifies multi-book folders (entire series bundled as one "book") and proposes per-book splits with metadata inference
- Trash-safe cleanup — `earworm cleanup --plan-id` deletes only via reviewed plans with double confirmation
- Metadata carriage in plans — Title, author, series, series_index travel through the CSV → plan → apply pipeline and override DB fallbacks in `write_metadata`, enabling offline plan composition via spreadsheet
- CSV column aliases — Plan import tolerates hand-edited CSVs with `source_path` / `current_path` / `src` / `source` variants
- Blacklist — `earworm skip` suppresses known-bad books from download retries
- Claude Code skill integration (Phase 14) — Exposes scan/plan/apply as a skill for agent-driven library remediation
- MP3 support — Phases 11-14 added `.mp3` handling alongside M4A/M4B for mixed-format libraries
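The draft → ready → apply lifecycle described above, with dry-run as the default and a second confirmation for destructive steps, might look roughly like the following. It is a sketch under stated assumptions rather than the real plan engine: the type and function names (`Plan`, `Step`, `Approve`, `Apply`), the `applied` status, and the two boolean confirmation flags are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Status values for a plan; "applied" is an assumed terminal state.
type Status string

const (
	StatusDraft   Status = "draft"
	StatusReady   Status = "ready"
	StatusApplied Status = "applied"
)

// Step is one file operation in a plan (hypothetical shape).
type Step struct {
	Op          string // e.g. "move", "flatten", "delete"
	Source      string
	Destructive bool
}

type Plan struct {
	ID     int
	Status Status
	Steps  []Step
}

var (
	ErrNotReady     = errors.New("plan must be approved before apply")
	ErrNeedsConfirm = errors.New("destructive steps require a second confirmation")
)

// Approve promotes a draft plan to ready.
func (p *Plan) Approve() error {
	if p.Status != StatusDraft {
		return fmt.Errorf("cannot approve plan in status %q", p.Status)
	}
	p.Status = StatusReady
	return nil
}

// Apply previews the plan unless confirm is true. Destructive steps are
// rejected up front unless confirmDestructive is also true.
func (p *Plan) Apply(confirm, confirmDestructive bool, exec func(Step) error) ([]string, error) {
	if p.Status != StatusReady {
		return nil, ErrNotReady
	}
	var log []string
	if !confirm { // dry run: describe the steps, touch nothing
		for _, s := range p.Steps {
			log = append(log, "would "+s.Op+" "+s.Source)
		}
		return log, nil
	}
	for _, s := range p.Steps {
		if s.Destructive && !confirmDestructive {
			return nil, ErrNeedsConfirm
		}
	}
	for _, s := range p.Steps {
		if err := exec(s); err != nil {
			return log, fmt.Errorf("%s %s: %w", s.Op, s.Source, err)
		}
		log = append(log, "did "+s.Op+" "+s.Source)
	}
	p.Status = StatusApplied
	return log, nil
}

func main() {
	p := &Plan{ID: 1, Status: StatusDraft, Steps: []Step{{Op: "flatten", Source: "/library/example"}}}
	if err := p.Approve(); err != nil {
		fmt.Println(err)
		return
	}
	out, err := p.Apply(false, false, func(Step) error { return nil })
	fmt.Println(out, err)
}
```

Checking destructive steps before executing anything keeps a half-confirmed plan from running its additive steps and then stopping midway.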
Architecture
Tech Stack
| Layer | Technology |
|---|---|
| Language | Go 1.26.1 |
| CLI | spf13/Cobra v1.10.2 + spf13/Viper v1.21.0 |
| Database | modernc.org/sqlite v1.48.1 (pure Go, CGo-free) |
| Audio Metadata | dhowden/tag (M4A/MP4, pure Go) + ffprobe fallback |
| Decryption | FFmpeg (AAXC -> M4B) |
| Terminal UI | charmbracelet/lipgloss + bubbles |
| Testing | testify v1.11.1 |
| Build | GoReleaser v2.x (darwin/linux, amd64/arm64) |
| Logging | log/slog (structured) |
Structure
    cmd/earworm/
        main.go          # Entry point, version ldflags
    internal/
        cli/             # Cobra commands (auth, sync, download, organize, notify, daemon, scan, status, plan, split, cleanup, skip)
        config/          # Viper-based YAML + env + flag configuration
        db/              # SQLite layer with embedded migrations (001-007 — 007 adds metadata plan carriage)
        audible/         # audible-cli subprocess wrapper (auth, download, library parsing)
        download/        # Download pipeline (rate limiter, backoff, progress, staging, decryption)
        organize/        # File organization (path builder, cross-filesystem mover)
        scanner/         # Local filesystem scanner (ASIN extraction, metadata reading, deep scan, issue detection)
        metadata/        # Metadata extraction (dhowden/tag, ffprobe, folder parsing) + write_metadata
        daemon/          # Polling loop for unattended operation (gated by plan approval post-v1.1)
        audiobookshelf/  # Audiobookshelf HTTP client (library scan API)
        goodreads/       # Goodreads CSV export
        venv/            # Python venv management for audible-cli
        fileops/         # Idempotent file operations (flatten, move, cleanup) with fsync + SHA-256 verification
        planengine/      # Plan draft→ready→apply lifecycle, CSV import, audit logging, metadata carriage
        split/           # Multi-book folder detection and per-book split planning
Key Patterns
- Subprocess boundary — All Audible interaction isolated in `internal/audible/`; clean license boundary (MIT wrapper over Python tool)
- State machine per book — Download state tracked in SQLite; crash at any point resumes from last good state
- Staging directory — Downloads land in staging, verified, then moved to library; prevents partial files in final location
- Copy-verify-delete — Cross-filesystem moves (local -> NAS) verify file size before deleting staging copy
- Rate limiting — Conservative 5-30s between downloads; undocumented Audible thresholds mean caution is mandatory
Development History
v1.0 (Apr 1-5, 2026) — Download-and-organize core, 6 phases + 4 quick tasks:
| Phase | Plans | Focus |
|---|---|---|
| 1 | 3 | Foundation (Go project, SQLite, config management) |
| 2 | 3 | Local library scanning (filesystem scanner, ASIN extraction, metadata) |
| 3 | 3 | Audible integration (auth, library sync, subprocess wrapper) |
| 4 | 3 | Download pipeline (rate limiter, retry/backoff, progress, staging) |
| 5 | 3 | File organization (path builder, cross-filesystem moves, Libation compat) |
| 6 | 2 | Integrations & polish (Audiobookshelf, Goodreads, daemon, CLI polish) |
v1.1 – v1.3 (Apr 7-12, 2026) — Library operations expansion, phases 9-18.1:
| Phase(s) | Focus |
|---|---|
| 9-10 | Deep library scanner, issue detection, scan → structured output |
| 11 | Fileops package (flatten, idempotent moves) |
| 12 | Plan engine CLI (draft/ready/apply lifecycle) |
| 13 | CSV import with guarded cleanup |
| 14 | Multi-book splitting (detection + plan generation) + Claude Code skill integration |
| 15 | Data safety hardening (fsync, SHA-256 cross-fs verification, audit logging, pre-flight checks) |
| 16-17 | Plan draft→ready promotion, scan-to-plan bridge with JSON output, metadata wiring |
| 18-18.1 | CSV metadata field carriage (title/author/series/series_index) through plan pipeline |
Velocity: ~60 commits across 7 days post-v1.0; code volume grew ~66% from v1.0 baseline. Latest commit Apr 12, 2026.
Architectural Decisions
| Decision | Rationale |
|---|---|
| Go (not TypeScript) | Single-binary distribution, excellent CLI/subprocess ecosystem, cross-compilation |
| Pure Go SQLite (modernc.org) | Eliminates CGo, enables cross-compilation for single binary |
| SQLite on local filesystem only | Network filesystems (NAS) cause silent corruption and broken locking |
| Wrap audible-cli as subprocess | Clean license boundary; proven Audible auth; avoids GPL contamination |
| Staging directory for downloads | Prevents partial files in library; enables verification before move |
| Libation-compatible folder structure | Maximum compatibility with existing libraries and Audiobookshelf conventions |
| Conservative rate limiting | Audible throttle thresholds undocumented; 5-30s delays protect against bans |
| Embedded Python venv | Users don't manually install Python or audible-cli; single binary bootstraps everything |
Strengths
- Fault tolerance as core differentiator — Queue-level crash recovery + per-book timeout addresses Libation's primary weakness
- Zero-dependency single binary — Pure Go, CGo-free, 17.6 MB; GoReleaser produces cross-platform builds
- Clean subprocess isolation — All audible-cli interaction confined to one package; format changes don't cascade
- SQLite state machine — Persistent tracking enables resume-from-interruption at any point in the pipeline
- Cross-filesystem awareness — Handles local-to-NAS moves correctly with copy-verify-delete pattern
- Well-documented architecture — Research artifacts (ARCHITECTURE.md, FEATURES.md, PITFALLS.md) capture decisions before implementation
Weaknesses & Risks
- Audible rate limit thresholds undocumented — Conservative defaults may be overly cautious; tuning requires empirical testing
- audible-cli output not formally versioned — Subprocess wrapper must be defensive; format changes could break parsing silently
- Requires Python 3.9+ and FFmpeg — External runtime dependencies (though auto-bootstrapped)
- MP3 added; FLAC still unsupported — v1.3 extended split/flatten to `.mp3`; no format conversion
- Single-service focus — Audible only; no Libro.fm, Chirp, or other audiobook service support
- Limited Audiobookshelf integration — Only library scan trigger; no metadata field updates or two-way sync
- Plan review is a manual gate for daemon mode — Daemon cannot auto-apply plans post-v1.1; they stay as drafts until a human runs `plan approve <id>`. This is a deliberate safety trade-off, but it reduces unattended usefulness
Real-World Usage (Apr 7-13)
Earworm has moved from feature-complete CLI to actively managing a 500+ book NAS library. Evidence in /Users/albair/book/vibes/:
- `audiobook-cleanup.csv` / `audiobook-cleanup-earworm.csv` / `earworm-library-cleanup-proposal.md` — ~315 entries, including a full Discworld series sweep. Phase 1 of a two-phase cleanup strategy: inject metadata (series, series_index) additively before any destructive file op. Mirrors the "additive before destructive" philosophy that Phase 18.1 bakes into the plan engine.
- `audiobook-flatten.csv` — 18 Warhammer 40k folders stored as 3-level nested `slug_audio/01-Cover/02-Product/Mp3s/` that Audiobookshelf can't traverse. Targeted for `plan import` + `plan apply --confirm` via the fileops flatten operation.
- `audiobook-split.csv` / `audiobook-split-phase1.csv` / `audiobook-split-phase2.csv` — Malazan + Beast Arises bundles split from one multi-book folder into ~10 per-book folders with title/author/series metadata inferred at scan time.
This is the first project in the portfolio where the user is visibly using his own tool in anger on real data within days of shipping the feature. Every library hygiene complaint he couldn't fix by hand became a plan phase within 48 hours.
Connection to Other Projects
- No direct connections to the existing portfolio ecosystem (Roughneck, CNC, Etyde, GoVejle, Ollama)
- First Go project in the portfolio — validates Go as an alternative to the TypeScript default for CLI/systems work
- Audiobookshelf is the target media server; integration via REST API
- NAS deployment — Designed for headless/server use alongside other self-hosted media services
What Makes This Project Different
Earworm is the first project in the portfolio that:
- Uses Go — Breaking from the TypeScript default; chosen because the problem domain (CLI, subprocess management, cross-compilation) is a poor fit for Node.js
- Has no AI/LLM component — Pure systems/infrastructure work; no Ollama, no Roughneck
- Wraps a third-party CLI — The core value is reliability and orchestration around an existing tool, not building from scratch
- Targets a personal pain point with zero ambiguity — Libation doesn't work reliably; this does. No user validation needed because the user is the developer