Shotput

March 7, 2026

Tools

Claude Code, TypeScript, Playwright, MCP SDK, Zod, tsup, Vitest, Node.js

What worked

3 of 5 phases complete in ~33 minutes total (2.9 min/plan average). Claude Code built the MCP server with 6 coarse-grained, intent-based tools rather than thin Playwright wrappers, which reduces MCP round-trips in agent workflows. Lazy browser initialization (Chromium is not launched until the first screenshot) cut server startup from 1-3s to ~100ms. DOM inspection returns a curated summary under 5KB instead of 100KB+ of raw HTML, so Claude can reason about selectors without exhausting its context window. The natural-language element targeting flow (describe what to capture → inspect → get selector → capture) works well because the inspector was designed for agent consumption from the start.

What broke

SSRF hardening is incomplete and is the single most important thing to close before Shotput is wired to any untrusted input path — URL allow/deny-list handling is on the hardening backlog. No dark/light mode emulation yet (deferred to Phase 4). No device presets. No batch capture — one call per URL currently. Process-lifecycle tracking around browser.close() is fragile. Test coverage is incomplete — QUAL-01 deferred to Phase 5.

Roles

I set the MCP-first design: this is a tool built *for* agents, not a general Playwright wrapper, and the tool shape reflects that (coarse-grained intents, curated DOM summaries, graceful wait degradation). Claude Code wrote every line of Playwright and MCP SDK code. This is a particularly satisfying vibe because Shotput is the tool I use to capture screenshots for every other vibe in this portfolio, so I have been dogfooding it from the start. The fresh-BrowserContext-per-capture decision was mine, made to keep state from leaking between captures.

Shotput (MCP Screenshot Capture Tool)

Overview

Shotput is a headless browser screenshot capture tool built as an MCP (Model Context Protocol) server that integrates with Claude Code. It enables programmatic capture of publication-ready screenshots — full-page or element-specific — entirely locally with zero external service dependencies.

Target users: Claude Code users who need automated screenshot capture for documentation, developers building docs, and content creators capturing UI screenshots.

Key Features

Full-page and viewport screenshots in PNG/JPEG with quality control
Element-targeted screenshots via CSS selectors with configurable padding
Natural language element targeting — describe what to capture; Claude identifies the CSS selector
Page preparation — inject CSS/JavaScript, hide elements before capture
Authentication — manual login via visible browser OR programmatic cookie/token injection
Device emulation — custom viewport dimensions, scale factors (1x-3x retina)
Lazy content triggering — auto-scroll to load lazy-loaded images
Flexible wait strategies — networkidle, domcontentloaded, load, or custom delay
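One way to picture the "configurable padding" feature for element-targeted screenshots is as a small geometry step: grow the element's bounding box by the padding, then clamp it to the page. The sketch below is an illustration under that assumption, not Shotput's actual code; the names `Box`, `PageSize`, and `paddedClip` are hypothetical. The resulting rectangle has the x/y/width/height shape that Playwright's screenshot `clip` option accepts.

```typescript
// Hypothetical types: Shotput's real interfaces are not shown in the source.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PageSize {
  width: number;
  height: number;
}

// Expand an element's bounding box by `padding` CSS pixels on every side,
// then clamp the result so it never extends past the page edges.
function paddedClip(box: Box, padding: number, page: PageSize): Box {
  const left = Math.max(0, box.x - padding);
  const top = Math.max(0, box.y - padding);
  const right = Math.min(page.width, box.x + box.width + padding);
  const bottom = Math.min(page.height, box.y + box.height + padding);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// An element near the left edge: the left padding is clipped at x = 0.
const clip = paddedClip(
  { x: 10, y: 200, width: 300, height: 80 },
  16,
  { width: 1280, height: 2400 },
);
console.log(clip); // { x: 0, y: 184, width: 326, height: 112 }
```

Clamping matters for elements that sit flush against a page edge, where naive padding would produce negative coordinates or a rectangle larger than the page.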

Architecture

Tech Stack

Layer	Technology
Browser Automation	Playwright 1.58.2
MCP Server	@modelcontextprotocol/sdk 1.27.1 (stdio transport)
Language	TypeScript 5.x
Validation	Zod 3.25.0
Runtime	Node.js 22 LTS
Build	tsup 8.x
Testing	Vitest + Playwright Test

Structure

src/
  index.ts    # Entry point, creates MCP server
  server.ts   # MCP tool registration (6 tools)
  browser.ts  # Browser manager (singleton, lazy initialization)
  capture.ts  # Screenshot capture pipeline
  inspect.ts  # DOM inspection + accessibility tree extraction
  auth.ts     # Session manager for authenticated captures
  output.ts   # File naming and output path resolution
  scroll.ts   # Auto-scroll for lazy content
  types.ts    # Shared TypeScript interfaces

Key Design Decisions

6 coarse-grained intent-based tools (not thin API wrappers) — reduces MCP round-trips
Lazy browser initialization — Chromium not launched until first screenshot (~100ms startup vs 1-3s)
DOM summary not raw HTML — Curated data (<5KB vs 100KB+) for Claude to reason about selectors
Fresh BrowserContext per capture — No state leakage between captures
Graceful wait degradation — Try networkidle first, fall back to domcontentloaded + delay
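The lazy-initialization and fresh-context decisions above can be sketched together. This is a minimal illustration, not the contents of browser.ts: `LazyBrowserManager`, `withFreshContext`, and `shutdown` are hypothetical names, and the two interfaces only mirror the small slice of Playwright's API used here (`browser.newContext()`, `context.close()`, `browser.close()`). The launcher is injected so the example runs without Playwright installed.

```typescript
// Structural stand-ins for the parts of Playwright's Browser/BrowserContext
// that this sketch touches.
interface ContextLike {
  close(): Promise<void>;
}

interface BrowserLike {
  newContext(): Promise<ContextLike>;
  close(): Promise<void>;
}

// Hypothetical manager; the real browser.ts is not shown in the source.
class LazyBrowserManager {
  private browserPromise: Promise<BrowserLike> | null = null;
  private readonly launch: () => Promise<BrowserLike>;

  constructor(launch: () => Promise<BrowserLike>) {
    this.launch = launch; // nothing is launched at construction time
  }

  // Launch on first use only. Caching the promise (not the resolved browser)
  // means concurrent first callers share a single launch.
  private getBrowser(): Promise<BrowserLike> {
    if (this.browserPromise === null) {
      this.browserPromise = this.launch().catch((err) => {
        this.browserPromise = null; // allow a retry after a failed launch
        throw err;
      });
    }
    return this.browserPromise;
  }

  // Run one capture in its own context and always dispose of it afterwards,
  // so cookies and storage cannot leak into the next capture.
  async withFreshContext<T>(fn: (ctx: ContextLike) => Promise<T>): Promise<T> {
    const browser = await this.getBrowser();
    const context = await browser.newContext();
    try {
      return await fn(context);
    } finally {
      await context.close();
    }
  }

  // Close the browser if it was ever started; a no-op otherwise.
  async shutdown(): Promise<void> {
    const pending = this.browserPromise;
    this.browserPromise = null;
    if (pending !== null) {
      const browser = await pending.catch(() => null);
      await browser?.close();
    }
  }
}

// Demo with a fake launcher that counts how often it is called.
let launches = 0;
const manager = new LazyBrowserManager(async () => {
  launches += 1;
  return {
    newContext: async () => ({ close: async () => {} }),
    close: async () => {},
  };
});

console.log(launches); // 0: constructing the manager launches nothing
```

With Playwright installed, the injected launcher would presumably be something like `() => chromium.launch()`, since Playwright's `Browser` satisfies `BrowserLike` structurally.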

MCP Tools

shotput_capture — Full-page/viewport/element screenshot
shotput_inspect — DOM summary + accessibility tree for selector identification
shotput_set_cookies — Programmatic cookie injection
shotput_clear_sessions — Clear all stored sessions
shotput_login — Interactive login via visible browser
shotput_list_sessions — List stored sessions

Development History

3 of 5 phases complete (executed in ~33 minutes):

Phase	Status	Focus
1	Complete	Core capture engine (browser manager, pipeline, output)
2	Complete	Element targeting (CSS selectors, padding, DOM inspection)
3	Complete	Authentication (session manager, cookie injection, interactive login)
4	Pending	Skill layer + display polish (dark/light mode, device presets, batch)
5	Pending	Cross-client compatibility + quality (opencode, tests, docs)

Average velocity: 2.9 min/plan (7 plans in 33 minutes total).

Strengths

Context isolation — Fresh BrowserContext per capture; no state leakage
Graceful degradation — Timeout doesn't crash, missing elements don't hang
Explicit lifecycle management — Signal handlers + browser cleanup
DOM inspection design — Aria snapshot + curated summary vs raw HTML
Session security — No credential logging, fresh contexts, periodic StorageState capture
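The graceful-degradation behavior (try networkidle, fall back to domcontentloaded plus a delay) might look roughly like the following. This is a sketch under assumptions, not the capture pipeline itself: `navigateWithFallback`, `WaitOptions`, and the default timeout values are hypothetical, `PageLike` mirrors only the `goto(url, { waitUntil, timeout })` call from Playwright's `Page`, and the fallback assumes the timeout surfaces as an `Error` whose `name` is `"TimeoutError"`, as Playwright's timeout errors do.

```typescript
type WaitUntil = "load" | "domcontentloaded" | "networkidle";

// Structural stand-in for the one Playwright Page method used here.
interface PageLike {
  goto(url: string, options: { waitUntil: WaitUntil; timeout: number }): Promise<unknown>;
}

// Hypothetical option names and defaults, chosen for illustration only.
interface WaitOptions {
  idleTimeoutMs: number;
  fallbackTimeoutMs: number;
  settleDelayMs: number;
}

const DEFAULT_WAIT: WaitOptions = {
  idleTimeoutMs: 10_000,
  fallbackTimeoutMs: 30_000,
  settleDelayMs: 500,
};

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Returns the wait strategy that actually succeeded.
async function navigateWithFallback(
  page: PageLike,
  url: string,
  opts: WaitOptions = DEFAULT_WAIT,
): Promise<WaitUntil> {
  try {
    await page.goto(url, { waitUntil: "networkidle", timeout: opts.idleTimeoutMs });
    return "networkidle";
  } catch (err) {
    // Only a timeout triggers the fallback; other errors (bad URL, DNS
    // failure) are rethrown so they are not silently retried.
    if (!(err instanceof Error) || err.name !== "TimeoutError") throw err;
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: opts.fallbackTimeoutMs });
    await sleep(opts.settleDelayMs); // give late content a moment to settle
    return "domcontentloaded";
  }
}

// Demo: a fake page that never reaches network idle (e.g. long-polling).
const neverIdlePage: PageLike = {
  async goto(_url, options) {
    if (options.waitUntil === "networkidle") {
      const err = new Error("Timeout exceeded");
      err.name = "TimeoutError";
      throw err;
    }
    return null;
  },
};

navigateWithFallback(neverIdlePage, "https://example.com", { ...DEFAULT_WAIT, settleDelayMs: 10 })
  .then((used) => console.log(`navigated using: ${used}`)); // navigated using: domcontentloaded
```

Returning the strategy that succeeded lets the caller report to the agent that the capture ran in degraded mode rather than failing outright.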

Weaknesses & Risks

SSRF hardening incomplete — URL allow/deny-list handling is on the hardening backlog and must be closed before Shotput is wired to any untrusted input path
No dark/light mode emulation — Deferred to Phase 4
No device presets — Users must manually set viewport/scale
No batch capture — One call per URL currently
Process lifecycle around browser.close() is fragile — Cleanup edge cases need tightening
Test coverage incomplete — QUAL-01 (full test suite) deferred to Phase 5
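To make the SSRF item concrete, here is a rough sketch of what a first-pass URL deny-list check could look like. Nothing like this is confirmed to exist in Shotput; `checkCaptureUrl` and `isPrivateIPv4` are hypothetical names, and the policy (http/https only, no localhost, no private or link-local IPv4, no IPv6 literals) is an assumption chosen for illustration. It is also deliberately incomplete: it does not catch hostnames that resolve to private addresses via DNS, and it does not follow redirects.

```typescript
// Hypothetical helper: true for loopback, private, and link-local IPv4 hosts.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) || // link-local, including cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

type UrlCheck = { ok: true; url: URL } | { ok: false; reason: string };

// Hypothetical pre-navigation check, run before any URL reaches the browser.
function checkCaptureUrl(raw: string): UrlCheck {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid absolute URL" };
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: `scheme ${url.protocol} is not allowed` };
  }
  // IPv6 literals may be reported with surrounding brackets; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "").toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) {
    return { ok: false, reason: "localhost is not allowed" };
  }
  if (isPrivateIPv4(host)) {
    return { ok: false, reason: "private or loopback IPv4 address" };
  }
  if (host.includes(":")) {
    // Conservative assumption for this sketch: reject every IPv6 literal.
    return { ok: false, reason: "IPv6 literals are not allowed" };
  }
  return { ok: true, url };
}

for (const candidate of [
  "https://example.com/docs",
  "http://127.0.0.1:3000/admin",
  "http://169.254.169.254/latest/meta-data/",
  "file:///etc/passwd",
]) {
  const result = checkCaptureUrl(candidate);
  console.log(candidate, "->", result.ok ? "allowed" : `blocked (${result.reason})`);
}
```

Parsing with `URL` first, rather than pattern-matching the raw string, means the check sees the normalized scheme and host that the browser would see.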

Connection to Other Projects

PM Toolkit — Could use Shotput for automated export/screenshot generation
2024.garden — Screenshot documentation of the digital garden

andrewlb notes