From 442eed605c18a02c255ad006229fd66f464ed89c Mon Sep 17 00:00:00 2001 From: Harry Bayliss Date: Fri, 15 May 2026 12:46:42 +0100 Subject: [PATCH] Add auto-generated perf audit findings to TODO MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codebase sweep for perf issues outside the per-PTY-chunk path that recent CHANGELOG work already covered. Ten findings under a new "Perf Audit (auto-generated)" section in TODO.md — anchored to file:line, classified MEDIUM/LOW, with a sketched fix per entry. None landed as code changes; review pending. --- TODO.md | 111 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 111 insertions(+) diff --git a/TODO.md b/TODO.md index 62018d3..c942441 100644 --- a/TODO.md +++ b/TODO.md @@ -1,3 +1,114 @@ +# Perf Audit (auto-generated 2026-05-15) +Findings from a codebase sweep — not user-reported, needs review before +action. Each item names the anchor and a sketched fix. + +- [ ] **Session.Children() allocates a fresh slice on every call.** [MEDIUM] + - `internal/app/session.go:530-541` walks `s.order` under `s.mu` and + builds a new `[]*Child` slice every time. Callers on hot paths: + `drawSidebar` calls it twice per frame + (`internal/app/sidebar.go:139` and `:171`); `drawTabBar` calls it + once per frame (`internal/app/tabbar.go:37`); the classifier + iterates it every 250 ms (`internal/app/classifier.go:38`); and + palette/navigation hit it on every Ctrl-A/D/W/S keystroke. + - Fix direction: store the snapshot in an `atomic.Pointer[[]*Child]` + on `Session`, refresh it under `s.mu` only when `Spawn` / `delete` + mutates the map. Readers get O(1) `Load()` with zero allocation — + same pattern already used for `listeners` (session.go:118-123). + +- [ ] **wait_for_pattern re-scans the entire stream/grid every iteration.** [MEDIUM] + - `internal/app/host.go:476-493` (the `check` closure). On `scope = + "scrollback"` it calls `c.StreamRead(0)` followed by + `stripANSIBytes(nil, b)` over the entire ring on every wake — a + full O(ring size) walk per chunk arrival. On `grid` it goes + through PlainText (one CGO call) plus a regex match against the + full grid string. For an agent waiting on a marker in a chatty + pane, every PTY chunk fires `check()`. + - Fix direction: for `scrollback`, track the offset of the last + check and run the regex only over the new tail, reusing a + per-call scratch buffer for ANSI stripping. For `grid`, dedupe + on `ScreenVersion()` — skip when version hasn't changed. + +- [ ] **search_output compiles regex + strips ANSI on every call.** [MEDIUM] + - `internal/app/host.go:428` compiles a fresh `regexp.Regexp` per + invocation; `:434` strips ANSI over the entire ring buffer when + `kind="rendered"`. Agents that poll `search_output` with the same + pattern (the typical "watch for marker" loop) repay both costs on + every call. + - Fix direction: small LRU of compiled regexes keyed by pattern + string (cap maybe 32) on `toolHost`. Cache the stripped-ANSI + buffer keyed by `c.ScreenVersion()` so consecutive searches over + an unchanged ring reuse the strip. + +- [ ] **GetProcessOutput grid mode acquires the emulator twice.** [MEDIUM] + - `internal/app/host.go:375-391` does `em := c.Emulator()` for + ActiveScreen / Cursor / Size, then at line 387 re-fetches + `em := c.Emulator()` for PlainText. Each `Emulator()` call goes + through `ptyMu` and inspects the live PTY pointer. Under a + chatty agent polling `get_process_output` every 100 ms this is + a redundant lock and pointer chase per call. + - Fix direction: hold the emulator reference from the first + lookup; reuse it for PlainText. The check `if em == nil` still + runs cleanly because the variable is captured. + +- [ ] **FindChildByIdentity is O(N) under the session lock.** [LOW] + - `internal/app/session.go:553-565` scans the children map looking + for a matching `Identity` token on every new mcp-stdio + connection. Not a steady-state hot path — only fires once per + child spawn — but with many short-lived sub-agents it adds up + and contends with everyone else taking `s.mu`. + - Fix direction: maintain an `identityIndex map[string]string` + (identity → child id) updated alongside spawn / exit, give the + lookup an O(1) read. + +- [ ] **Per-promoter regex matches in the idle classifier.** [LOW] + - `internal/app/idle.go:175-182` (`matchAny`) walks each compiled + pattern and runs the DFA over the same 4 KiB tail. A preset with + five permission patterns + five error patterns is ten DFA + invocations per child per 250 ms tick. + - Fix direction: at preset load time, compile each `_patterns` + list into a single alternation regex (`(?:p1)|(?:p2)|…`). The + classifier then makes one Match call per category per tick. + +- [ ] **Port-detection dedup is O(N²) over c.ports.** [LOW] + - `internal/app/child.go:461-467`: for each fresh URL match the + code linearly scans the existing port list. The list rarely + grows past a handful, but a dev server that lists "all open + ports" in one log line interacts badly: M new matches × N + existing entries. + - Fix direction: keep a `seenPorts map[int]struct{}` next to + `c.ports`, rebuilt on prune (none today). O(1) per match. + +- [ ] **Port-sighting string allocations happen before the dedup check.** [LOW] + - `internal/app/child.go:455-456` allocates `urlForm` and `portStr` + before line 461's `seen` walk. Both strings are wasted when the + port is already in `c.ports`. Inside `c.portsMu` for the whole + loop body too, blocking the `Ports()` reader path. + - Fix direction: bind the port int first (cheap parse from + `m[1]`), do the seen check, only then allocate the URL string + for the surviving sighting. + +- [ ] **classifier `time.Now()` syscall per child per tick.** [LOW] + - `internal/app/classifier.go:54` (and the `IdleMS` / + `TitleIdleMS` helpers it transitively calls in + `internal/app/child.go:343-374`) each call `time.Now()`. + Reading time on Linux is fast (vDSO) but with N children × 4 + `time.Now()` per tick × 4 ticks/sec it's wasted work that can + be batched. + - Fix direction: capture `now := time.Now().UnixNano()` once at + the top of `classifyAll` and thread it into `classifyOne` and + the helpers as a parameter. + +- [ ] **wait_for_pattern subscribes a listener for every call.** [LOW] + - `internal/app/host.go:472-474`: each invocation calls + `Session.Subscribe(wake)` which clones the listener slice and + swaps the atomic pointer; the `defer Unsubscribe` does the same + on exit. Two allocations per `wait_for_pattern`. The agent + pattern of looping on `wait_for_pattern` after every tool call + pays this churn on the steady-state path. + - Fix direction: a per-child `chunkBroadcaster` registered once + at child spawn that hands out lightweight subscriber tokens, + rather than going through the full session listener machinery. + # On Hold - [ ] There's a unicode being displayed in opencode [ON HOLD] - Investigated 2026-05-14: patterm passes ghostty grapheme codepoints