mise-pin zig 0.15.2; rebuild libghostty-vt ReleaseFast — 27-32x pipeline speedup

Added .mise.toml pinning zig = "0.15.2" (the minimum the vendored
Ghostty commit requires) and taught the Makefile to resolve zig
through mise when available, falling back to PATH. Contributors run
`mise install` once and `make deps` just works.

Re-ran the pipeline benchmarks after rebuilding libghostty-vt with
ReleaseFast (same hardware, AMD Ryzen 7 7800X3D):

                                Debug         ReleaseFast    speedup
  Pipeline 8-colour @120fps     63 fps         2030 fps       32x
  Pipeline truecolor @120fps    34 fps          931 fps       27x
  Emulator-only truecolor       34 fps         2051 fps       60x

7-16x headroom over the 120 fps target across the two pipeline
workloads; the heaviest (truecolor full-screen redraws) sits at the
low end, just under 8x. Static library size 33 MiB -> 13 MiB.

TODO.md baseline numbers updated to reflect post-fix throughput;
the "Debug-mode lib" finding is folded into the result it produced
rather than left as an open item.
2026-05-15 13:54:48 +01:00
parent 2f109a84fa
commit bda799a3c6
4 changed files with 47 additions and 39 deletions

TODO.md

@@ -3,8 +3,8 @@ Findings from a codebase sweep — not user-reported, needs review before
action. Each item names the anchor and a sketched fix.
Baseline benchmark numbers (`go test -bench=. ./internal/app/`, AMD
-Ryzen 7 7800X3D, libghostty-vt **Debug-mode** — see the first item
-below):
+Ryzen 7 7800X3D, libghostty-vt **ReleaseFast** after the Makefile
+fix landed):
```
# Renderer alone
@@ -19,35 +19,18 @@ ASCIIVideo_Stream_8Color_120fps 260 µs/frame 3845 fps_ceiling 3.1% budge
ASCIIVideo_Stream_TrueColor_120fps 576 µs/frame 1735 fps_ceiling 6.9% budget
# Full pipeline (em.Write + renderer + io.Discard write)
-Pipeline_ASCIIVideo_8Color_120fps 15838 µs/frame 63 fps_ceiling 190% budget
-Pipeline_ASCIIVideo_TrueColor_120fps 29224 µs/frame 34 fps_ceiling 350% budget
+Pipeline_ASCIIVideo_8Color_120fps 493 µs/frame 2030 fps_ceiling 5.9% budget
+Pipeline_ASCIIVideo_TrueColor_120fps 1075 µs/frame 931 fps_ceiling 12.9% budget
# Emulator alone (libghostty-vt CSI/SGR parser)
-Emulator_Write_Stream_8Color_120fps 15930 µs/frame 63 fps_ceiling
-Emulator_Write_Stream_TrueColor_120fps 29241 µs/frame 34 fps_ceiling
+Emulator_Write_Stream_8Color_120fps 257 µs/frame 3890 fps_ceiling
+Emulator_Write_Stream_TrueColor_120fps 488 µs/frame 2051 fps_ceiling
```
-The renderer alone hits 1700-3800 fps with margin. The full pipeline
-caps at 34-63 fps. **The whole gap is libghostty-vt's em.Write — its
-parser is shipping in Debug mode, which is also a 33 MiB static
-library file (release builds are a fraction of that).**
-- [ ] **libghostty-vt was being built in Debug mode.** [HIGH — partially fixed]
-- `Makefile` used `zig build -Demit-lib-vt` with no
-`-Doptimize`. Zig's `standardOptimizeOption` defaults to
-`.Debug`, so the shipped static lib was unoptimised. Effect:
-the SGR/CSI parser eats 16-29 ms per 30-70 KiB full-screen
-frame, capping the entire patterm pipeline at 34-63 fps. The
-Makefile now defaults to `ReleaseFast` (override via
-`make deps GHOSTTY_VT_OPTIMIZE=Debug` if you ever need a
-debug build of the upstream lib for diagnosing a bug in it).
-- To apply: `make clean-deps && make deps`, then re-run
-`go test -bench=BenchmarkPipeline -benchmem ./internal/app/`
-and confirm the truecolor 120fps stream drops well under 100%
-budget. Update the numbers in this section after rebuilding.
-- Severity HIGH because it's the single biggest perf win on the
-table; the renderer optimisations below are second-order until
-this lands.
+Result of the fix below: 27-32× pipeline speedup, 60× emulator
+speedup. Pipeline hits 930-2030 fps end-to-end — 7-16× headroom
+over the 120 fps target on the heaviest workload (truecolor
+full-screen redraws).
- [ ] **viewport renderer allocates ~1 alloc per 4 input bytes on SGR/CSI-heavy chunks.** [MEDIUM]