patterm/SPIKE-REPORT.md
2026-05-14 13:37:20 +01:00

Milestone 1 Spike Report — libghostty-vt

Date: 2026-05-12
Pin: ghostty-org/ghostty@b0f8276658fbcc75318d2125d40146074a3fc505 (main, post-v1.3.1)
Spike binary: ./bin/spike (built against the static libghostty-vt.a)

Verdict

The libghostty-vt bet is validated. Proceed to milestone 2 (daemon/client singleton + single PTY).

Every target except htop (not installed locally, deferred) rendered correctly. The libghostty-vt-emulated grid matched the live host-terminal display cell-for-cell across plain stream output, interactive bash, alt-screen TUIs (vim), and the three agent CLIs that are the actual point of the project (claude, opencode, codex).

Target matrix

| # | Target | Screen | Verdict | Grid log | PTY bytes |
|---|--------|--------|---------|----------|-----------|
| 1 | sh -c 'echo hello; sleep 1' | primary | correct | spike-201533.grid.log | spike-201533.bytes |
| 2 | bash -i | primary | correct | spike-201606.grid.log | spike-201606.bytes |
| 3 | vim SPEC.md | alt → primary | correct | spike-201707.grid.log | spike-201707.bytes |
| 4 | htop | | ⊘ deferred | not run | not run |
| 5 | claude | primary | correct | spike-201848.grid.log | spike-201848.bytes |
| 6 | opencode | alt | correct | spike-202380.grid.log | spike-202380.bytes |
| 7 | codex | primary | correct | spike-202614.grid.log | spike-202614.bytes |

Notable observations per target

  • vim: alt-screen entry/exit tracked correctly; on :q returned to primary with empty grid and cursor at (0,0). The formatter's unwrap=true setting reassembled soft-wrapped paragraphs from SPEC.md into long single-line paragraphs — this is exactly the shape an orchestrator agent wants when reading sub-agent output.
  • claude and codex render on the primary screen, not alt-screen. They draw their own cursor (visible=false). Chrome heuristics in milestone 7 will need to be per-agent, not per-screen-buffer.
  • opencode uses alt-screen and a heavy bar-art logo that visually looks "broken" on first glance — it isn't. The grid dump confirms cell-perfect parsing.
  • bash: cooked PTY line discipline is doing its job. The lines echo test, test, [harry@…]$ exit, and exit all appear in the correct sequence in the final dump, with the cursor at (0,4).

API surface used

From include/ghostty/vt/:

  • ghostty_terminal_new / ghostty_terminal_free
  • ghostty_terminal_vt_write — feed PTY bytes
  • ghostty_terminal_resize(cols, rows, 0, 0) — pixel dims ignored for headless
  • ghostty_terminal_set for: USERDATA, WRITE_PTY, DEVICE_ATTRIBUTES, XTVERSION, ENQUIRY
  • ghostty_terminal_get for: CURSOR_X, CURSOR_Y, CURSOR_VISIBLE, ACTIVE_SCREEN
  • ghostty_formatter_terminal_new + ghostty_formatter_format_alloc + ghostty_formatter_free + ghostty_free (allocator-aware)
  • Format options: FORMAT_PLAIN, unwrap=true, trim=true

Everything we needed was present and stable enough to use behind a Go interface.
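
To make the unwrap=true and trim=true options concrete, here is a minimal Go sketch of what they do to a grid of rows. It is an illustration only, not libghostty-vt's formatter: the Row type, its Wrapped flag, and the PlainText function are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Row is a hypothetical stand-in for one grid row: its text plus a flag
// saying whether the row soft-wrapped onto the next one.
type Row struct {
	Text    string
	Wrapped bool
}

// PlainText joins rows into text. With unwrap set, a soft-wrapped row is
// concatenated with its continuation instead of ending in a newline. With
// trim set, trailing spaces and trailing blank lines are dropped.
func PlainText(rows []Row, unwrap, trim bool) string {
	var lines []string
	var cur strings.Builder
	for _, r := range rows {
		cur.WriteString(r.Text)
		if unwrap && r.Wrapped {
			continue // keep accumulating the same logical line
		}
		lines = append(lines, cur.String())
		cur.Reset()
	}
	if cur.Len() > 0 {
		lines = append(lines, cur.String())
	}
	if trim {
		for i := range lines {
			lines[i] = strings.TrimRight(lines[i], " ")
		}
		for len(lines) > 0 && lines[len(lines)-1] == "" {
			lines = lines[:len(lines)-1]
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	rows := []Row{
		{Text: "The quick brown ", Wrapped: true},
		{Text: "fox jumps.      ", Wrapped: false},
		{Text: "                ", Wrapped: false},
	}
	fmt.Printf("%q\n", PlainText(rows, true, true))
}
```

With both options on, the two physical rows of the wrapped sentence come back as one logical line and the blank padding row disappears, which is the shape observed for the SPEC.md paragraphs in the vim run.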

Bugs found and fixed during the spike

  1. v1.3.1 tag didn't yet expose terminal.h / formatter.h. Bumped the pin forward to a commit on main where the full API is published.
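  2. Re-entrant lock deadlock between Emulator.Write and the WRITE_PTY cgo callback: Write held e.mu, and the callback re-entered Go and tried to take e.mu again, which blocks forever because Go mutexes are not re-entrant. Switched the callback field to atomic.Pointer so the callback path reads it without taking e.mu.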
  3. Vim hung on startup. Missing DEVICE_ATTRIBUTES callback meant DA1 queries (CSI c) were silently ignored; vim waited forever. Now we respond with a constant VT220-class identity (conformance 62, no features). Added stub XTVERSION and ENQUIRY responders for the same reason.
  4. Raw-mode stderr produced ragged output, with subsequent lines starting mid-line. After term.MakeRaw, OPOST is off, so \n doesn't generate a CR. Changed all spike stderr writes to \r\n.
  5. Grid dumps to stderr visually corrupted alt-screen TUIs. The libghostty-vt grid was always correct; the host terminal display was getting smashed because the spike was writing 40+ lines onto a display owned by the TUI. Default sink is now spike-<pid>.grid.log; user tails it from another terminal.
  6. Idle-dump breadcrumbs were chatty — fired once per ~1 s of typing. Now only hotkey dumps emit a breadcrumb, and only when the child is on the primary screen.
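
The deadlock fix in item 2 can be sketched as follows. This is a simplified model with no cgo and no real libghostty-vt calls: the WriteFunc type, the SetWritePTY and onWritePTY names, and the fed counter are assumptions made for the example. Only Emulator.Write, e.mu, and the use of atomic.Pointer come from the spike.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// WriteFunc receives bytes the emulator wants sent back to the PTY.
type WriteFunc func(p []byte)

// Emulator is a simplified stand-in for the cgo wrapper.
type Emulator struct {
	mu      sync.Mutex                // guards the (simulated) terminal state
	writeCB atomic.Pointer[WriteFunc] // read lock-free from the callback path
	fed     int                       // bytes fed so far, protected by mu
}

// SetWritePTY installs the callback without touching e.mu.
func (e *Emulator) SetWritePTY(fn WriteFunc) { e.writeCB.Store(&fn) }

// Write feeds bytes while holding e.mu. In the real wrapper the C library
// may call back into Go from inside this call; that is simulated here.
func (e *Emulator) Write(p []byte) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.fed += len(p)
	e.onWritePTY([]byte("reply"))
}

// onWritePTY runs while Write still holds e.mu, so it must not lock e.mu:
// sync.Mutex is not re-entrant and a second Lock would block forever.
func (e *Emulator) onWritePTY(p []byte) {
	if fn := e.writeCB.Load(); fn != nil {
		(*fn)(p)
	}
}

func main() {
	var e Emulator
	e.SetWritePTY(func(p []byte) { fmt.Printf("to PTY: %q\n", p) })
	e.Write([]byte("query"))
}
```

The atomic pointer keeps the callback field safe to swap from another goroutine while leaving the callback path free of e.mu, so re-entry from inside Write cannot block.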

Resolved open questions from the plan

  • ghostty_terminal_resize exists. Takes pixel dimensions too; we pass 0, 0.
  • Build system is Zig (≥0.15.2). Downloaded the official tarball into .zig-cache/, and the Makefile's make deps target produced static and shared libs on the first try. Pacman's zig 0.16.0 was not tested and was not needed.
  • Default WRITE_PTY callback is sufficient for DECRQM responses, but DA1, XTVERSION, ENQ need their own handlers or vim and friends hang. Wired up constant responders for all three.
  • Active vs alternate screen tracking via GHOSTTY_TERMINAL_DATA_ACTIVE_SCREEN is reliable across all three agent CLIs.
  • ◻ Formatter cost per dump: each PlainText() call allocates and frees a buffer. Acceptable for the spike. Action for daemon era: cache one formatter handle per emulator instead of recreating on every dump.
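
For reference, the constant DA1 answer described above has a simple wire format: CSI ? followed by semicolon-separated parameters and a final c. The sketch below only illustrates that byte sequence for a VT220-class identity (conformance 62, no features). The da1Reply helper is a hypothetical name, and how the spike actually hands these values to libghostty-vt through the DEVICE_ATTRIBUTES callback is not shown here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// da1Reply formats a primary device attributes (DA1) reply of the form
// ESC [ ? <conformance> ; <feature> ; ... c
func da1Reply(conformance int, features ...int) string {
	parts := []string{strconv.Itoa(conformance)}
	for _, f := range features {
		parts = append(parts, strconv.Itoa(f))
	}
	return "\x1b[?" + strings.Join(parts, ";") + "c"
}

func main() {
	// Constant identity from the spike: conformance 62, no features.
	fmt.Printf("%q\n", da1Reply(62))
}
```

A program such as vim sends CSI c and blocks until a reply of this shape arrives on its input, which is why a missing responder looked like a startup hang.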

Risks / follow-ups for milestone 2

  1. Per-agent chrome trimming (SPEC.md §10): claude and codex render on primary screen, so we can't just "skip the alt-screen". Chrome-trim heuristics will need to identify banner/input-box regions by content, not by screen buffer. The .bytes recordings in this run are good fixtures for that work.
  2. Resize timing. The spike resizes both PTY and emulator on SIGWINCH. The daemon will have one emulator per pane, and the spec's "primary-client-wins" policy must be enforced before any UI work.
  3. Reading back the spike report after the run. The matrix script's stderr capture (2> >(tee -a "$REPORT" >&2)) interleaved per-case lines awkwardly (visible in spike-report-20260512T154705.txt). Cosmetic; doesn't affect evaluation.

Reproducing

# One-time
make deps                                            # zig builds libghostty-vt.a
make spike                                           # build ./bin/spike

# Single target
./bin/spike -- claude                                # then Ctrl-] to dump
tail -f spike-<pid>.grid.log                         # in another terminal

# Full matrix
./cmd/spike/testdata/run-matrix.sh

Decision

Proceed with libghostty-vt as the headless VT for milestone 2. Keep the cgo wrapper behind internal/vt.Emulator so the fallback (charmbracelet/x/vt, vt10x) stays swappable, but no longer treat it as a real risk for v1.
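
One possible shape for that seam, as a sketch rather than the actual internal/vt code: only the Emulator name and PlainText() appear in this report, so every other method, the fakeEmulator backend, and the snapshot helper are assumptions for illustration.

```go
package main

import "fmt"

// Emulator is a hypothetical version of the internal/vt.Emulator seam.
// Callers depend only on this interface, so the cgo-backed libghostty-vt
// implementation could be swapped for a pure-Go fallback.
type Emulator interface {
	Write(p []byte) (int, error)      // feed PTY bytes
	Resize(cols, rows int) error      // resize the grid
	PlainText() (string, error)       // formatted dump of the active screen
	Cursor() (x, y int, visible bool) // cursor position and visibility
	AltScreen() bool                  // true while the alternate screen is active
	Close() error                     // release backend resources
}

// fakeEmulator is a trivial in-memory backend used only to show that
// callers never need to know which implementation they hold.
type fakeEmulator struct {
	buf        []byte
	cols, rows int
}

func (f *fakeEmulator) Write(p []byte) (int, error) { f.buf = append(f.buf, p...); return len(p), nil }
func (f *fakeEmulator) Resize(cols, rows int) error { f.cols, f.rows = cols, rows; return nil }
func (f *fakeEmulator) PlainText() (string, error)  { return string(f.buf), nil }
func (f *fakeEmulator) Cursor() (int, int, bool)    { return len(f.buf), 0, true }
func (f *fakeEmulator) AltScreen() bool             { return false }
func (f *fakeEmulator) Close() error                { return nil }

// Compile-time check that the fake satisfies the interface.
var _ Emulator = (*fakeEmulator)(nil)

// snapshot works against any backend.
func snapshot(e Emulator) (string, error) {
	text, err := e.PlainText()
	if err != nil {
		return "", err
	}
	x, y, _ := e.Cursor()
	return fmt.Sprintf("%s @(%d,%d) alt=%v", text, x, y, e.AltScreen()), nil
}

func main() {
	var e Emulator = &fakeEmulator{}
	defer e.Close()
	e.Write([]byte("hello"))
	s, _ := snapshot(e)
	fmt.Println(s)
}
```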