mise-pin zig 0.15.2; rebuild libghostty-vt ReleaseFast — 27-32x pipeline speedup
Added .mise.toml pinning zig = "0.15.2" (the minimum the vendored
Ghostty commit requires) and taught the Makefile to resolve zig
through mise when available, falling back to PATH. Contributors run
`mise install` once and `make deps` just works.
Re-ran the pipeline benchmarks after rebuilding libghostty-vt with
ReleaseFast (same hardware, AMD Ryzen 7 7800X3D):
                              Debug    ReleaseFast   speedup
  Pipeline 8-colour @120fps   63 fps   2030 fps      32x
  Pipeline truecolor @120fps  34 fps    931 fps      27x
  Emulator-only truecolor     34 fps   2051 fps      60x
Headroom over 120 fps is 7-16x across the two pipeline workloads; the
low end is the heaviest one (truecolor full-screen redraws). Static
library size 33 MiB -> 13 MiB.
TODO.md baseline numbers updated to reflect post-fix throughput;
the "Debug-mode lib" finding is folded into the result it produced
rather than left as an open item.
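The speedup and headroom figures quoted above follow from the fps ceilings in the table. As a minimal sketch of that arithmetic (the helper names `speedup` and `headroom` are illustrative and do not exist in the repository):

```go
package main

import "fmt"

// speedup returns how many times faster the optimised build runs
// than the Debug build, given both fps ceilings.
func speedup(debugFPS, releaseFPS float64) float64 {
	return releaseFPS / debugFPS
}

// headroom returns an fps ceiling as a multiple of the target frame rate.
func headroom(fpsCeiling, targetFPS float64) float64 {
	return fpsCeiling / targetFPS
}

func main() {
	const target = 120.0 // fps target used by the benchmarks

	// Figures copied from the commit message table.
	rows := []struct {
		name           string
		debug, release float64
	}{
		{"Pipeline 8-colour", 63, 2030},
		{"Pipeline truecolor", 34, 931},
		{"Emulator-only truecolor", 34, 2051},
	}

	for _, r := range rows {
		fmt.Printf("%-24s speedup %.1fx, headroom %.1fx over %.0f fps\n",
			r.name, speedup(r.debug, r.release), headroom(r.release, target), target)
	}
}
```

The quoted 27x, 32x and 60x are these ratios rounded down to whole numbers, and the low end of the headroom range comes from the truecolor pipeline row.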
8
.mise.toml
Normal file
@@ -0,0 +1,8 @@
+# mise config — `mise install` provisions the tools `make deps` needs.
+#
+# libghostty-vt is built from a pinned upstream Ghostty commit; that
+# commit's build.zig.zon pins minimum_zig_version = 0.15.2. We match
+# it here so contributors don't have to puzzle out the version from
+# a deep upstream file.
+[tools]
+zig = "0.15.2"
26
CHANGELOG.md
@@ -8,17 +8,25 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ### Fixed
 - `make deps` now builds libghostty-vt with `-Doptimize=ReleaseFast`
-  instead of zig's silent `Debug` default. The default-Debug build
-  shipped an unoptimised CSI/SGR parser that ate 16-29 ms per
-  30-70 KiB full-screen frame in benchmarks, capping the entire
-  PTY-to-host pipeline at 34-63 fps no matter how fast the rest of
-  patterm got. The static library file size drops accordingly
-  (the Debug build was 33 MiB). Override with
-  `make deps GHOSTTY_VT_OPTIMIZE=Debug` only when debugging the
-  upstream library itself. Apply on existing checkouts with
-  `make clean-deps && make deps`.
+  instead of zig's silent `Debug` default, and resolves `zig`
+  through `mise` when a project `.mise.toml` pins it. The
+  default-Debug build shipped an unoptimised CSI/SGR parser that
+  ate 16-29 ms per 30-70 KiB full-screen frame in benchmarks,
+  capping the entire PTY-to-host pipeline at 34-63 fps. After the
+  rebuild the same pipeline runs at **930-2030 fps**: 27-32× the
+  prior throughput, and 7-16× margin over 120 fps for full-screen
+  truecolor ASCII video. Static library size drops from 33 MiB to
+  13 MiB. Override with `make deps GHOSTTY_VT_OPTIMIZE=Debug` only
+  when debugging the upstream library itself. Apply on existing
+  checkouts with `mise install && make clean-deps && make deps`.
 
 ### Added
+- `.mise.toml` pinning `zig = "0.15.2"` (the minimum version the
+  vendored Ghostty commit requires). Contributors run
+  `mise install` once; the Makefile picks up the resulting `zig`
+  binary automatically via `mise which zig` and falls back to
+  PATH when mise isn't available, so the existing build flow
+  still works.
 - ASCII-video stress benchmarks (`internal/app/bench_test.go`):
   per-frame and per-stream variants at 30 / 60 / 120 fps targets,
   three workload fixtures (8-colour cells, 24-bit truecolor cells,
15
Makefile
@@ -31,10 +31,19 @@ deps-fetch: $(SOURCE)/.git/HEAD
 # debug build of the upstream lib.
 GHOSTTY_VT_OPTIMIZE ?= ReleaseFast
 
+# Resolve zig via the project's mise pin (.mise.toml) when available,
+# falling back to whatever's on PATH. mise keeps the zig version in
+# lockstep with what the pinned ghostty commit requires; without it,
+# contributors have to chase the version requirement themselves.
+ZIG := $(shell command -v mise >/dev/null && mise which zig 2>/dev/null || command -v zig 2>/dev/null)
+
 $(INSTALL)/lib/libghostty-vt.a: $(SOURCE)/.git/HEAD
-	@command -v zig >/dev/null || { echo "ERROR: zig not on PATH (need >=0.15.2 to build libghostty-vt)"; exit 1; }
-	@echo ">> building libghostty-vt with zig (optimize=$(GHOSTTY_VT_OPTIMIZE))"
-	@cd $(SOURCE) && zig build -Demit-lib-vt -Doptimize=$(GHOSTTY_VT_OPTIMIZE) --prefix $(INSTALL)
+	@if [ -z "$(ZIG)" ]; then \
+		echo "ERROR: zig not available. Run \`mise install\` (see .mise.toml — needs zig 0.15.2) or install zig manually."; \
+		exit 1; \
+	fi
+	@echo ">> building libghostty-vt with $(ZIG) (optimize=$(GHOSTTY_VT_OPTIMIZE))"
+	@cd $(SOURCE) && $(ZIG) build -Demit-lib-vt -Doptimize=$(GHOSTTY_VT_OPTIMIZE) --prefix $(INSTALL)
 	@test -f $(INSTALL)/lib/libghostty-vt.a || { echo "ERROR: expected static lib at $(INSTALL)/lib/libghostty-vt.a"; exit 1; }
 	@echo ">> libghostty-vt installed under $(INSTALL)"
 
37
TODO.md
@@ -3,8 +3,8 @@ Findings from a codebase sweep — not user-reported, needs review before
 action. Each item names the anchor and a sketched fix.
 
 Baseline benchmark numbers (`go test -bench=. ./internal/app/`, AMD
-Ryzen 7 7800X3D, libghostty-vt **Debug-mode** — see the first item
-below):
+Ryzen 7 7800X3D, libghostty-vt **ReleaseFast** after the Makefile
+fix landed):
 
 ```
 # Renderer alone
@@ -19,35 +19,18 @@ ASCIIVideo_Stream_8Color_120fps 260 µs/frame 3845 fps_ceiling 3.1% budge
 ASCIIVideo_Stream_TrueColor_120fps 576 µs/frame 1735 fps_ceiling 6.9% budget
 
 # Full pipeline (em.Write + renderer + io.Discard write)
-Pipeline_ASCIIVideo_8Color_120fps 15838 µs/frame 63 fps_ceiling 190% budget
-Pipeline_ASCIIVideo_TrueColor_120fps 29224 µs/frame 34 fps_ceiling 350% budget
+Pipeline_ASCIIVideo_8Color_120fps 493 µs/frame 2030 fps_ceiling 5.9% budget
+Pipeline_ASCIIVideo_TrueColor_120fps 1075 µs/frame 931 fps_ceiling 12.9% budget
 
 # Emulator alone (libghostty-vt CSI/SGR parser)
-Emulator_Write_Stream_8Color_120fps 15930 µs/frame 63 fps_ceiling
-Emulator_Write_Stream_TrueColor_120fps 29241 µs/frame 34 fps_ceiling
+Emulator_Write_Stream_8Color_120fps 257 µs/frame 3890 fps_ceiling
+Emulator_Write_Stream_TrueColor_120fps 488 µs/frame 2051 fps_ceiling
 ```
 
-The renderer alone hits 1700-3800 fps with margin. The full pipeline
-caps at 34-63 fps. **The whole gap is libghostty-vt's em.Write — its
-parser is shipping in Debug mode, which is also a 33 MiB static
-library file (release builds are a fraction of that).**
+Result of the fix below: 27-32× pipeline speedup, 60× emulator
+speedup. Pipeline hits 930-2030 fps end-to-end — 7-16× headroom
+over the 120 fps target on the heaviest workload (truecolor
+full-screen redraws).
 
-- [ ] **libghostty-vt was being built in Debug mode.** [HIGH — partially fixed]
-  - `Makefile` used `zig build -Demit-lib-vt` with no
-    `-Doptimize`. Zig's `standardOptimizeOption` defaults to
-    `.Debug`, so the shipped static lib was unoptimised. Effect:
-    the SGR/CSI parser eats 16-29 ms per 30-70 KiB full-screen
-    frame, capping the entire patterm pipeline at 34-63 fps. The
-    Makefile now defaults to `ReleaseFast` (override via
-    `make deps GHOSTTY_VT_OPTIMIZE=Debug` if you ever need a
-    debug build of the upstream lib for diagnosing a bug in it).
-  - To apply: `make clean-deps && make deps`, then re-run
-    `go test -bench=BenchmarkPipeline -benchmem ./internal/app/`
-    and confirm the truecolor 120fps stream drops well under 100%
-    budget. Update the numbers in this section after rebuilding.
-  - Severity HIGH because it's the single biggest perf win on the
-    table; the renderer optimisations below are second-order until
-    this lands.
-
 
 - [ ] **viewport renderer allocates ~1 alloc per 4 input bytes on SGR/CSI-heavy chunks.** [MEDIUM]
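The `fps_ceiling` and `% budget` columns in the TODO.md table look like they are derived from the µs/frame measurement. The sketch below assumes that `fps_ceiling` is the reciprocal of the per-frame cost and that `% budget` is that cost as a share of one frame interval at the target rate. Both definitions are inferred from the numbers rather than taken from `bench_test.go`, and the function names are illustrative.

```go
package main

import "fmt"

// fpsCeiling converts a per-frame cost in microseconds into the
// highest frame rate that cost could sustain.
func fpsCeiling(usPerFrame float64) float64 {
	return 1e6 / usPerFrame
}

// budgetPercent reports what share of one frame interval at targetFPS
// the per-frame cost consumes (100% means exactly on budget).
func budgetPercent(usPerFrame, targetFPS float64) float64 {
	frameIntervalUs := 1e6 / targetFPS
	return usPerFrame / frameIntervalUs * 100
}

func main() {
	const target = 120.0

	// µs/frame values copied from the post-fix pipeline rows.
	for _, us := range []float64{493, 1075} {
		fmt.Printf("%5.0f µs/frame -> %.0f fps ceiling, %.1f%% of a %.0f fps frame budget\n",
			us, fpsCeiling(us), budgetPercent(us, target), target)
	}
}
```

Because the table's µs/frame values are already rounded, the recomputed fps ceilings can differ from the listed ones by a frame or two.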