
Wallflower
What worked
The polyglot boundary (Rust ↔ Python via gRPC) was the biggest risk at the start and became the cleanest seam — protobuf contracts meant each language stayed in its lane. Post-release dogfooding surfaced real friction: audio metering was too insensitive for multi-channel interfaces, long recordings had no zoom, and macOS permission dialogs were missing. Five point releases fixed all of these — iteration driven by actual use, not spec review.
What broke
The feature I most wanted to build (spatial similarity — browse jams by harmonic distance instead of date) got descoped when I chose to ship accessibility and code-signed distribution instead. Right call for shipping, but the conceptual differentiator is still on the backlog. The HTTP API is mostly stubs because everything migrated to Tauri IPC — pragmatic, but it means headless/daemon mode doesn't work yet. Python sidecar requires users to have Python 3.13 installed, which is real distribution friction.
Roles
I defined the core workflow (record → auto-detect → extract → DAW) and the constraint that all ML runs locally. Claude Code wrote the Rust crates, Python analyzers, gRPC bridge, and React UI. Architecture decisions were co-designed: Rust + Tauri over Electron for audio reliability, and demucs-mlx over PyTorch MPS after benchmarking showed a 2.6x speedup on Apple Silicon.
Wallflower (Jam & Sample Manager)
Overview
Wallflower is a local-first jam and sample manager for musicians who improvise. It solves a specific creative problem: transforming multi-hour jam sessions into usable samples without workflow interruption. Record a 2-hour session, let the app auto-detect key, tempo, sections, and separate instruments, then extract an 8-bar synth loop and drag it into your DAW. All with local AI, crash-safe recording, and passive background processing.
Core purpose: Stay in creative flow. Record, tag, browse, extract — without switching out of the musical headspace to manage files.
Target users: Electronic musicians and improvisers who accumulate hours of jam recordings and need to find and extract the good parts later.
What It Does
The full loop works: record a multi-hour jam → auto-detect tempo, key, and sections → separate into stems (drums, bass, vocals, other) → browse and extract samples → drag into your DAW. All processing runs locally on Apple Silicon via essentia and demucs-mlx. No cloud, no upload, no waiting.
Key capabilities:
- Crash-safe recording — incremental WAV writes survive power loss mid-session; priority scheduler pauses all background ML while recording
- On-device ML pipeline — tempo (TempoCNN), key/scale, section boundaries, loop detection, and neural source separation via a Python sidecar over gRPC
- Content-addressed imports — SHA-256 hashing means re-importing the same file is a no-op
- Stem mixer and export — solo/mute separated instruments, export individual stems
- Sample browser — filter and preview extracted samples across your whole library
- Code-signed macOS distribution — notarized DMG, auto-update checker, global record hotkey
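One way to picture the "incremental WAV writes" idea from the first capability: keep the file a valid WAV at every moment by patching the size fields in the header after each flushed chunk. The sketch below is in Python for brevity and is not Wallflower's actual Rust code; `IncrementalWavWriter`, `wav_header`, and the per-chunk patching cadence are all illustrative assumptions.

```python
import os
import struct


def wav_header(data_bytes: int, channels: int, sample_rate: int, sample_width: int) -> bytes:
    """Build a canonical 44-byte PCM WAV header describing `data_bytes` of audio."""
    block_align = channels * sample_width
    byte_rate = sample_rate * block_align
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_bytes, b"WAVE",
        b"fmt ", 16, 1, channels, sample_rate, byte_rate, block_align, sample_width * 8,
        b"data", data_bytes,
    )


class IncrementalWavWriter:
    """Hypothetical writer: the file on disk is a readable WAV after every append."""

    def __init__(self, path, channels=2, sample_rate=48_000, sample_width=2):
        self._params = (channels, sample_rate, sample_width)
        self._data_bytes = 0
        self._file = open(path, "wb")
        self._file.write(wav_header(0, *self._params))
        self._sync()

    def append(self, pcm: bytes) -> None:
        # Write the new audio at the end of the file...
        self._file.seek(0, os.SEEK_END)
        self._file.write(pcm)
        self._data_bytes += len(pcm)
        # ...then rewrite the header so its size fields match what is on disk.
        self._file.seek(0)
        self._file.write(wav_header(self._data_bytes, *self._params))
        self._sync()

    def _sync(self) -> None:
        # Push Python's buffer to the OS, then ask the OS to push it to the disk.
        self._file.flush()
        os.fsync(self._file.fileno())

    def close(self) -> None:
        self._file.close()
```

If the process dies between appends, everything written so far still opens as a normal WAV. A real recorder would likely patch the header on a timer rather than on every chunk, since `fsync` is expensive.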
Architecture
How It Fits Together
Three languages, three processes, one protobuf contract:
- Rust (Tauri + Axum) — audio I/O, crash-safe recording, import pipeline, SQLite storage
- Python (gRPC sidecar) — ML analysis via essentia, source separation via demucs-mlx
- TypeScript (Next.js + React) — UI, waveform visualization, state management
The polyglot split follows the domain: Rust handles anything where data loss is unacceptable (recording, imports), Python handles anything that needs the ML ecosystem, and TypeScript handles the interactive surface. The gRPC boundary with streaming progress means each process can evolve independently.
Iterations and What Changed
The initial build established the full record → analyze → separate → browse → extract loop. Using the app after shipping revealed more than building it did.
Dogfooding corrections (v0.2.1–v0.2.5):
- Audio metering was too insensitive for multi-channel interfaces — expanded from -60dB to -96dB floor after real recording sessions showed the meters barely moving
- Long recordings (60+ minutes) had no waveform zoom — added pinch/scroll zoom centered on cursor position after trying to find a specific section in a 90-minute jam
- macOS microphone permission dialogs were missing entirely: the app silently failed to record until Info.plist declared the microphone usage description (NSMicrophoneUsageDescription) that macOS requires before it will show the prompt
- CoreAudio callbacks were allocating on the heap every cycle — caused audio glitches under load, fixed by pre-allocating channel remap buffers
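To make the metering fix concrete, here is a small Python sketch of how the floor changes what a meter shows. It assumes a meter that is linear in dBFS between the floor and 0 dB; that mapping and the function names are illustrative assumptions, not the app's actual metering code.

```python
import math


def amplitude_to_dbfs(peak: float) -> float:
    """Convert a linear peak (1.0 = full scale) to dBFS."""
    if peak <= 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak)


def meter_position(peak: float, floor_db: float) -> float:
    """Map a peak to 0..1 of meter travel, clamping everything below floor_db to 0."""
    db = amplitude_to_dbfs(peak)
    if db <= floor_db:
        return 0.0
    # Linear in dB: floor_db maps to 0.0 and 0 dBFS maps to 1.0.
    return min(1.0, (db - floor_db) / -floor_db)


quiet_peak = 0.00025  # roughly -72 dBFS
old = meter_position(quiet_peak, floor_db=-60.0)  # below the old floor, meter stays at zero
new = meter_position(quiet_peak, floor_db=-96.0)  # about a quarter of the meter's travel
```

A signal around -72 dBFS is invisible with a -60 dB floor but lands at about a quarter of the meter's travel with a -96 dB floor, which is the kind of difference the fix describes.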
The descoping decision: The feature I most wanted to build — spatial similarity, browsing jams by harmonic distance instead of date — got cut in favor of accessibility (keyboard nav, high-contrast mode) and distribution (code signing, notarization). A code-signed binary with accessibility ships; a half-built similarity map doesn't. The conceptual differentiator is now backlog, and I'm at peace with that trade.
Key Design Decisions
- Rust + Tauri over Electron — Audio reliability and 96% smaller binary. Recording can't tolerate GC pauses or Chromium overhead.
- Python sidecar via gRPC, not in-process — ML ecosystem (essentia, demucs) is Python-native. Separate processes with protobuf contracts mean a crash in ML analysis can't kill a recording in progress.
- demucs-mlx over PyTorch MPS — 2.6x faster on Apple Silicon. Benchmarking settled this quickly.
- Recording preempts all background work — Creative flow is non-negotiable. Background ML can wait; a dropped audio buffer can't be recovered.
- Content-addressed imports — SHA-256 on import means re-importing is a no-op, not a duplicate. Important when the same SD card gets plugged in repeatedly.
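The content-addressed import decision can be sketched in a few lines. This is a Python illustration under simplifying assumptions: files are stored on disk under their hash, whereas the real import pipeline is Rust with SQLite storage, and `sha256_of` and `import_file` are hypothetical names.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-hour recordings never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def import_file(source: Path, library: Path) -> tuple[Path, bool]:
    """Copy `source` into `library` under its content hash.

    Returns (stored_path, was_new). Identical content is detected by hash,
    so a second import of the same audio does nothing.
    """
    library.mkdir(parents=True, exist_ok=True)
    target = library / f"{sha256_of(source)}{source.suffix.lower()}"
    if target.exists():
        return target, False
    shutil.copy2(source, target)
    return target, True
```

Because the stored name depends only on the bytes, plugging in the same SD card again, or importing a renamed copy of the same recording, resolves to the existing entry instead of creating a duplicate.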
Open Questions
- No user validation yet. Built for my own workflow. Whether the record → analyze → extract loop matches how other improvisers actually work is untested.
- Single platform. macOS only for now: Tauri and cpal are both cross-platform, but demucs-mlx depends on Apple's MLX framework (Apple Silicon), and the audio path has only been exercised on CoreAudio.
- Python distribution friction — Users need Python 3.13 installed for ML features. No bundled runtime in the DMG.
Ecosystem Role
Part of a music domain cluster: Etyde (practice/theory) → Evolver (instrument mastery) → Wallflower (recording/samples) → AbletonBuddy (production). Wallflower covers the stage between live performance and production — capturing the raw material.
Runs entirely on-device with no cloud or VPN dependency. Audio processing is real-time-adjacent and can't tolerate network latency, so the Roughneck/Ollama infrastructure used by other projects doesn't apply here.