Earworm

April 3, 2026

Tools

Claude Code, Go, Cobra, Viper, SQLite, FFmpeg, GoReleaser, testify, charmbracelet/lipgloss

What worked

Earworm is the first Go project in the portfolio — Claude Code handled the language switch cleanly, producing a 17.6 MB CGo-free single binary with GoReleaser cross-compiling to darwin/linux amd64/arm64. The plan engine (draft → ready → apply lifecycle with SHA-256 cross-filesystem verification and audit logs) emerged in ~60 commits over 7 days post-v1.0 and grew the code volume ~66%. Most importantly: within 48 hours of shipping a feature I was using it on real data — 315+ entries in library cleanup CSVs, 18 Warhammer 40k folder flattens, Malazan + Beast Arises multi-book splits.

What broke

The audible-cli subprocess boundary is a clean license firewall, but its output format isn't formally versioned, so the wrapper has to be defensive; Claude's first cut was too trusting. Audible's rate-limit thresholds are undocumented, which meant conservative 5-30s delays that may be overly cautious, and there is no way to tune them without risking a ban. Daemon mode cannot auto-apply plans post-v1.1 (a deliberate safety trade-off for destructive ops), which reduces its unattended usefulness.

Roles

I defined the plan-engine philosophy — additive before destructive, dry-run as default, double-confirmation on destructive steps, CSV as the offline composition format. Claude Code wrote the state machine, the fileops package (fsync + SHA-256 verify), and the scanner/split detectors. The decision to use pure Go SQLite (modernc.org) to avoid CGo was mine; the scanner heuristics for multi-book detection were co-designed.

Earworm (Audiobook Library Manager)

Overview

Earworm is a Go-based CLI tool that started as a reliable, fault-tolerant Audible downloader (replacing Libation) and evolved through Apr 7-12 into a full library operations engine: deep scanning for structural issues, plan-based remediation with metadata carriage, idempotent file operations (flatten, split, cleanup), and auditable plan review/approval.

Core purpose (v1.0): Reliably download and organize Audible audiobooks into a local library with zero manual intervention.

Core purpose (v1.2+): Manage the full lifecycle of a self-hosted audiobook library — download, organize, detect structural issues, propose plans, review them, and apply fixes with audit trails and trash-safe deletes.

Target users: Audiobook collectors running self-hosted media servers (Audiobookshelf) who need reliable batch downloading and ongoing library hygiene for libraries that grew chaotically over years.

Key Features

Download & organize (v1.0 core):

SQLite-backed library state tracking with persistent download state machine (enables crash recovery and resume)
Audible library sync via audible-cli subprocess wrapper with typed interface and output parsing
Fault-tolerant batch downloads with rate limiting, exponential backoff, per-book timeout, and crash recovery
AAXC decryption via FFmpeg integration (AAXC -> M4B conversion with voucher parsing)
Automatic organization into Libation-compatible folder structure (Author/Title [ASIN]/)
Cross-filesystem file moves with copy-verify-delete pattern (local to NAS)
Audiobookshelf integration — REST API client for library scan triggers after organization
Goodreads CSV export for reading list sync
Daemon/polling mode for unattended server operation
Embedded Python venv bootstrapping for audible-cli dependency management
Real-time progress with tqdm output parsing and terminal UI (charmbracelet/lipgloss)

Library operations (phases 9-18.1, Apr 7-12):

Deep scan — earworm scan --deep / earworm scan issues detects structural problems (nested folders, multi-book containers, missing metadata, naming inconsistencies) across an existing library
Plan engine — earworm plan {list,review,approve,apply,import} provides a full draft → ready → apply lifecycle with dry-run as default, double confirmation on destructive steps, and audit logs (SHA-256 cross-filesystem verification, fsync, structured logging)
Flatten operations — Collapse nested slug_audio/01-Cover/02-Product/Mp3s/ structures into flat, Audiobookshelf-readable folders
Split operations — earworm split {detect,plan} identifies multi-book folders (entire series bundled as one "book") and proposes per-book splits with metadata inference
Trash-safe cleanup — earworm cleanup --plan-id deletes only via reviewed plans with double confirmation
Metadata carriage in plans — Title, author, series, series_index travel through the CSV → plan → apply pipeline and override DB fallbacks in write_metadata, enabling offline plan composition via spreadsheet
CSV column aliases — Plan import tolerates hand-edited CSVs with source_path/current_path/src/source variants
Blacklist — earworm skip suppresses known-bad books from download retries
Claude Code skill integration (Phase 14) — Exposes scan/plan/apply as a skill for agent-driven library remediation
MP3 support — Phases 11-14 added .mp3 handling alongside M4A/M4B for mixed-format libraries
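
As a rough illustration of the CSV column aliases described above, the sketch below maps header variants onto one canonical column before rows are read. The four spellings come from the text; treating source_path as the canonical name, and the function names themselves, are assumptions rather than Earworm's actual code.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// sourceAliases maps accepted header spellings to one canonical column.
// Choosing "source_path" as the canonical name is an assumption.
var sourceAliases = map[string]string{
	"source_path":  "source_path",
	"current_path": "source_path",
	"src":          "source_path",
	"source":       "source_path",
}

// canonicalHeader trims and lower-cases each header cell, then applies aliases.
func canonicalHeader(header []string) []string {
	out := make([]string, len(header))
	for i, h := range header {
		key := strings.ToLower(strings.TrimSpace(h))
		if canon, ok := sourceAliases[key]; ok {
			key = canon
		}
		out[i] = key
	}
	return out
}

// readRows parses CSV text into one map per data row, keyed by canonical column.
func readRows(data string) ([]map[string]string, error) {
	// csv.Reader rejects rows whose field count differs from the header row.
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header := canonicalHeader(records[0])
	rows := make([]map[string]string, 0, len(records)-1)
	for _, rec := range records[1:] {
		row := make(map[string]string, len(header))
		for i, col := range header {
			row[col] = rec[i]
		}
		rows = append(rows, row)
	}
	return rows, nil
}

func main() {
	rows, err := readRows("Src,title\n/lib/a,Book A\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(rows[0]["source_path"], rows[0]["title"])
}
```

Normalizing at the header keeps the rest of the import path unaware of how a spreadsheet author happened to name the column.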

Architecture

Tech Stack

Layer	Technology
Language	Go 1.26.1
CLI	spf13/Cobra v1.10.2 + spf13/Viper v1.21.0
Database	modernc.org/sqlite v1.48.1 (pure Go, CGo-free)
Audio Metadata	dhowden/tag (M4A/MP4, pure Go) + ffprobe fallback
Decryption	FFmpeg (AAXC -> M4B)
Terminal UI	charmbracelet/lipgloss + bubbles
Testing	testify v1.11.1
Build	GoReleaser v2.x (darwin/linux, amd64/arm64)
Logging	log/slog (structured)

Structure

cmd/earworm/
  main.go                  # Entry point, version ldflags
internal/
  cli/                     # Cobra commands (auth, sync, download, organize, notify, daemon, scan, status, plan, split, cleanup, skip)
  config/                  # Viper-based YAML + env + flag configuration
  db/                      # SQLite layer with embedded migrations (001-007 — 007 adds metadata plan carriage)
  audible/                 # audible-cli subprocess wrapper (auth, download, library parsing)
  download/                # Download pipeline (rate limiter, backoff, progress, staging, decryption)
  organize/                # File organization (path builder, cross-filesystem mover)
  scanner/                 # Local filesystem scanner (ASIN extraction, metadata reading, deep scan, issue detection)
  metadata/                # Metadata extraction (dhowden/tag, ffprobe, folder parsing) + write_metadata
  daemon/                  # Polling loop for unattended operation (gated by plan approval post-v1.1)
  audiobookshelf/          # Audiobookshelf HTTP client (library scan API)
  goodreads/               # Goodreads CSV export
  venv/                    # Python venv management for audible-cli
  fileops/                 # Idempotent file operations (flatten, move, cleanup) with fsync + SHA-256 verification
  planengine/              # Plan draft→ready→apply lifecycle, CSV import, audit logging, metadata carriage
  split/                   # Multi-book folder detection and per-book split planning

Key Patterns

Subprocess boundary — All Audible interaction isolated in internal/audible/; clean license boundary (MIT wrapper over Python tool)
State machine per book — Download state tracked in SQLite; crash at any point resumes from last good state
Staging directory — Downloads land in staging, verified, then moved to library; prevents partial files in final location
Copy-verify-delete — Cross-filesystem moves (local -> NAS) verify file size before deleting staging copy
Rate limiting — Conservative 5-30s between downloads; undocumented Audible thresholds mean caution is mandatory

Development History

v1.0 (Apr 1-5, 2026) — Download-and-organize core, 6 phases + 4 quick tasks:

Phase	Plans	Focus
1	3	Foundation (Go project, SQLite, config management)
2	3	Local library scanning (filesystem scanner, ASIN extraction, metadata)
3	3	Audible integration (auth, library sync, subprocess wrapper)
4	3	Download pipeline (rate limiter, retry/backoff, progress, staging)
5	3	File organization (path builder, cross-filesystem moves, Libation compat)
6	2	Integrations & polish (Audiobookshelf, Goodreads, daemon, CLI polish)

v1.1 – v1.3 (Apr 7-12, 2026) — Library operations expansion, phases 9-18.1:

Phase(s)	Focus
9-10	Deep library scanner, issue detection, scan → structured output
11	Fileops package (flatten, idempotent moves)
12	Plan engine CLI (draft/ready/apply lifecycle)
13	CSV import with guarded cleanup
14	Multi-book splitting (detection + plan generation) + Claude Code skill integration
15	Data safety hardening (fsync, SHA-256 cross-fs verification, audit logging, pre-flight checks)
16-17	Plan draft→ready promotion, scan-to-plan bridge with JSON output, metadata wiring
18-18.1	CSV metadata field carriage (title/author/series/series_index) through plan pipeline

Velocity: ~60 commits across 7 days post-v1.0; code volume grew ~66% from v1.0 baseline. Latest commit Apr 12, 2026.

Architectural Decisions

Decision	Rationale
Go (not TypeScript)	Single-binary distribution, excellent CLI/subprocess ecosystem, cross-compilation
Pure Go SQLite (modernc.org)	Eliminates CGo, enables cross-compilation for single binary
SQLite on local filesystem only	Network filesystems (NAS) cause silent corruption and broken locking
Wrap audible-cli as subprocess	Clean license boundary; proven Audible auth; avoids GPL contamination
Staging directory for downloads	Prevents partial files in library; enables verification before move
Libation-compatible folder structure	Maximum compatibility with existing libraries and Audiobookshelf conventions
Conservative rate limiting	Audible throttle thresholds undocumented; 5-30s delays protect against bans
Embedded Python venv	Users don't manually install Python or audible-cli; single binary bootstraps everything

Strengths

Fault tolerance as core differentiator — Queue-level crash recovery + per-book timeout addresses Libation's primary weakness
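Self-contained single binary — Pure Go, CGo-free, 17.6 MB, with no linked native libraries; GoReleaser produces cross-platform builds. Python and FFmpeg remain runtime requirements (see Weaknesses & Risks)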
Clean subprocess isolation — All audible-cli interaction confined to one package; format changes don't cascade
SQLite state machine — Persistent tracking enables resume-from-interruption at any point in the pipeline
Cross-filesystem awareness — Handles local-to-NAS moves correctly with copy-verify-delete pattern
Well-documented architecture — Research artifacts (ARCHITECTURE.md, FEATURES.md, PITFALLS.md) capture decisions before implementation

Weaknesses & Risks

Audible rate limit thresholds undocumented — Conservative defaults may be overly cautious; tuning requires empirical testing
audible-cli output not formally versioned — Subprocess wrapper must be defensive; format changes could break parsing silently
Requires Python 3.9+ and FFmpeg — External runtime dependencies (though auto-bootstrapped)
MP3 added; FLAC still unsupported — v1.3 extended split/flatten to .mp3; no format conversion
Single-service focus — Audible only; no Libro.fm, Chirp, or other audiobook service support
Limited Audiobookshelf integration — Only library scan trigger; no metadata field updates or two-way sync
Plan review is a manual gate in daemon mode — The daemon cannot auto-apply plans post-v1.1; they stay as drafts until a human runs plan approve <id>. This is a deliberate safety trade-off, but it reduces unattended usefulness

Real-World Usage (Apr 7-13)

Earworm has moved from feature-complete CLI to actively managing a 500+ book NAS library. Evidence in /Users/albair/book/vibes/:

audiobook-cleanup.csv / audiobook-cleanup-earworm.csv / earworm-library-cleanup-proposal.md — ~315 entries, including a full Discworld series sweep. Phase 1 of a two-phase cleanup strategy: inject metadata (series, series_index) additively before any destructive file op. Mirrors the "additive before destructive" philosophy that Phase 18.1 bakes into the plan engine.
audiobook-flatten.csv — 18 Warhammer 40k folders stored as 3-level nested slug_audio/01-Cover/02-Product/Mp3s/ that Audiobookshelf can't traverse. Targeted for plan import + plan apply --confirm via the fileops flatten operation.
audiobook-split.csv / audiobook-split-phase1.csv / audiobook-split-phase2.csv — Malazan + Beast Arises bundles split from one multi-book folder into ~10 per-book folders with title/author/series metadata inferred at scan time.

This is the first project in the portfolio where the user is visibly using his own tool in anger on real data within days of shipping the feature. Every library hygiene complaint he couldn't fix by hand became a plan phase within 48 hours.

Connection to Other Projects

No direct connections to the existing portfolio ecosystem (Roughneck, CNC, Etyde, GoVejle, Ollama)
First Go project in the portfolio — validates Go as an alternative to the TypeScript default for CLI/systems work
Audiobookshelf is the target media server; integration via REST API
NAS deployment — Designed for headless/server use alongside other self-hosted media services

What Makes This Project Different

Earworm is the first project in the portfolio that:

Uses Go — Breaking from the TypeScript default; chosen because the problem domain (CLI, subprocess management, cross-compilation) is a poor fit for Node.js
Has no AI/LLM component — Pure systems/infrastructure work; no Ollama, no Roughneck
Wraps a third-party CLI — The core value is reliability and orchestration around an existing tool, not building from scratch
Targets a personal pain point with zero ambiguity — Libation doesn't work reliably; this does. No user validation needed because the user is the developer

andrewlb notes