Use mise to install zig + go in release CI; cut 0.0.4

`mlugg/setup-zig` was chasing mirrors for ~4 minutes on every run (see v0.0.1 / v0.0.2 logs) and `actions/setup-go` was spending another ~4 minutes downloading Go before patterm started building. mise already manages the project's zig pin; adding `go = "1.26.3"` to `.mise.toml` (matching go.mod) lets `jdx/mise-action@v2` install both in one cached step. Subsequent runs reuse the mise cache instead of re-resolving mirror URLs and re-downloading toolchains. Also adds an `actions/cache@v4` step for `~/.cache/go-build` and `~/go/pkg/mod`, keyed on `go.sum`, so `go build` itself doesn't re-pull modules on every tag push.
2026-05-15 19:38:13 +01:00
36 changed files with 5567 additions and 389 deletions
--- a/.gitea/workflows/release.yml
+++ b/.gitea/workflows/release.yml
@@ -11,14 +11,19 @@ jobs:
    steps:
      - uses: actions/checkout@v4

-      - uses: actions/setup-go@v5
+      - uses: jdx/mise-action@v2
        with:
-          go-version-file: go.mod
          cache: true

-      - uses: mlugg/setup-zig@v1
+      - name: Cache Go modules
+        uses: actions/cache@v4
        with:
-          version: 0.15.2
+          path: |
+            ~/.cache/go-build
+            ~/go/pkg/mod
+          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
+          restore-keys: |
+            ${{ runner.os }}-go-

      - name: Build libghostty-vt
        run: make deps
--- a/.mise.toml
+++ b/.mise.toml
@@ -0,0 +1,10 @@
+# mise config — `mise install` provisions the tools `make deps` needs.
+#
+# libghostty-vt is built from a pinned upstream Ghostty commit; that
+# commit's build.zig.zon pins minimum_zig_version = 0.15.2. We match
+# it here so contributors don't have to puzzle out the version from
+# a deep upstream file. The go pin matches go.mod so CI and local
+# builds use the same toolchain.
+[tools]
+zig = "0.15.2"
+go = "1.26.3"
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,7 +6,207 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## [Unreleased]

+## [0.0.4] - 2026-05-15
+
+### Changed
+- Release workflow (`.gitea/workflows/release.yml`) now provisions
+  Zig and Go through `jdx/mise-action@v2`, reading the versions from
+  `.mise.toml` (zig 0.15.2, go 1.26.3). Both toolchains were
+  previously installed via `mlugg/setup-zig` and `actions/setup-go`,
+  whose mirror chase / GitHub fetch combined for ~8 minutes per run
+  before any patterm code compiled. mise pulls each tool once and
+  caches the install dir, so subsequent runs hit the cache instead of
+  re-downloading. `make deps` still resolves zig via `mise which zig`
+  with a PATH fallback; `go.mod` already pinned `go 1.26.3`, so the
+  new `go` entry in `.mise.toml` just keeps CI and local builds on
+  the same toolchain.
+- A Go module/build cache step (`actions/cache@v4`, keyed on
+  `go.sum`) was added so `go build` doesn't re-download dependencies
+  on every tag push.
+
+## [0.0.3] - 2026-05-15
+
 ### Added
+- Auto-summarization for top-level agent tabs. patterm now loads
+  `$XDG_CONFIG_HOME/patterm/settings.json`, enables Codex-based
+  summaries by default (`gpt-5.4-mini`; OpenCode defaults to
+  `opencode-go/minimax-m2.7`), and can run Codex, OpenCode, or opt-in
+  Claude summarizers with configurable model names. Summary
+  attempts are armed by meaningful human input, wait for recent output
+  to go quiet, and respect a minimum cadence so unchanged tabs are not
+  summarized on a timer. The active thread summary appears under the
+  top tab title and in the sidebar below the Agent Tree section.
+- Settings overlay reachable from the command palette via
+  `Open Settings`. The searchable Settings picker opens
+  `Agents / Auto-summarization`, where users can enable/disable
+  summaries, choose provider, edit provider model names, cycle cadence,
+  test the selected summarizer (`patterm okay`), summarize the current
+  top-level agent immediately, and explicitly save or cancel draft
+  settings changes. Cadence choices match Solo: `15s`, `30s`, and
+  `1m`; the value is a minimum quiet/activity gap before another
+  summary attempt for the same top-level agent, not a background
+  periodic timer.
+
+### Changed
+- Command palette UX overhaul. The single flat list grew section
+  bands (`── Focused ──`, `── Open ──`, `── Spawn ──`, `── Quit ──`)
+  so the rows are scannable at a glance; cursor navigation skips
+  the dim header rows transparently. A chip strip — `[All] Open
+  Spawn Close` — sits below the query line and tracks the active
+  macro filter; `Tab` / `Shift-Tab` cycle through the chips, and
+  the typed-prefix macros (`sw `, `sp `, `k `) still work and now
+  collapse the whole prefix on a single backspace instead of
+  leaving a stray `sw` behind. The title bar surfaces the current
+  focus subject (`on: <child>` / `pad: <name>`) so the user knows
+  which Focused row is targeting what. The duplicate global Close
+  list is gone — close is reachable via the Focused-section action,
+  the `k ` macro / `[Close]` chip, or the new `Ctrl-X` inline close
+  on a Switch row. The "(current)" marker on the focused Switch row
+  became a leading `▶`. The empty-state hint now reads `no matches
+  · ⌫ to widen` instead of bare `no matches`. The middle divider
+  shows a `▼ N more` / `▲ N above` scroll indicator when the list
+  overflows, and the footer carries a `cursor/total` counter.
+- Spawn verbs are unified on **Spawn**: `Run process: …` →
+  `Spawn process: …`, `New Terminal` → `Spawn terminal`, and the
+  freeform-form row is now `Spawn process… (custom)` so the
+  trailing ellipsis still signals it opens a form.
+- Filtering switched from binary fuzzy-include to scored ranking.
+  Prefix matches beat word-boundary matches beat substring matches
+  beat scattered-fuzzy matches; ties fall back to section order so
+  a Focused-section hit always outranks an equally tight Spawn
+  hit. Matched characters in the label are drawn in accent+bold so
+  the user can see why a row matched.
+- Rename forms split the long subject (`scratchpad:
+  some-really-long-name.md`) onto its own dim row above the input
+  so the title bar no longer truncates with an ellipsis when the
+  subject name is wide.
+- New palette accelerators: `Alt-1` … `Alt-9` quick-pick the Nth
+  visible row, `Home` / `End` jump to first / last selectable row,
+  `?` (with empty query) opens an inline keybinding cheat-sheet
+  which any further keystroke dismisses, and `Ctrl-R` inside the
+  Spawn-process form toggles "Relaunch on exit" without leaving
+  the command field.
+
+### Fixed
+- Error/status flashes now restore the currently focused pane instead
+  of drawing the empty-state hint over a running agent or process.
+- Release workflow (`.gitea/workflows/release.yml`) now uses
+  `mlugg/setup-zig@v2` instead of the deprecated `@v1`. v1 hard-coded
+  the pre-0.14 tarball name (`zig-linux-x86_64-<ver>.tar.xz`), so
+  every mirror and the official `ziglang.org/builds` returned 404 for
+  Zig 0.15.2 and the v0.0.1 / v0.0.2 tag pushes never produced a
+  release asset. v2 uses the post-0.14 `zig-x86_64-linux-<ver>.tar.xz`
+  layout, so the runner can fetch Zig and build patterm.
+- Typing into a focused child while its emulator viewport is
+  scrolled up into scrollback history now auto-snaps the viewport
+  back to the live area. Previously the keystroke reached the
+  child PTY but the input box was off-screen below the visible
+  region, so it looked like typing did nothing. Wheel scrolling
+  and Ctrl-B are unchanged; only forwarded keystrokes snap.
+- Top tab bar now keeps the top-level agent's tab highlighted
+  when focus is on one of its sub-agents (or on a Processes pane
+  entry, matching the existing agent-tree behavior). Previously
+  the tab would lose its highlight as soon as you stepped into a
+  child agent, even though you were still within that thread.
+
+### Changed
+- MCP tool descriptions and `help('coordination')` /
+  `help('readiness')` now spell out that a sub-agent's reply to
+  `send_message` lands in the caller's own pane (tagged
+  `[sub-agent:<name>]`), not in the sub-agent's output. The canonical
+  wait-for-reply pattern — `send_message` → `timer_fire_when_idle_any`
+  on the sub-agent → read your own pane — is now called out on
+  `send_message`, `wait_for_pattern`, both `timer_fire_when_idle_*`,
+  the help topics, and the server-instructions preamble every agent
+  reads at startup. Previously `wait_for_pattern` was the obvious
+  blocking primitive in the catalog, and agents routinely called it
+  against the sub-agent for a reply that had already arrived in their
+  own pane, deadlocking until the wait timed out. No behaviour
+  changes; descriptions only.
+- Agent-initiated `spawn_agent` and `spawn_process` MCP calls no
+  longer steal viewport focus from the currently active tab. The
+  new child still appears in the sidebar and tab bar; switch to it
+  explicitly via the palette or `select_process`. Palette-initiated
+  spawns and persistence restores are unchanged — they still auto-
+  focus the new pane.
+- Sidebar rows (Processes, Agent Tree, Scratchpads) now truncate
+  overflowing names with a trailing `…` instead of spilling into
+  the main viewport. The focused row marquees its name when it
+  overflows — 1 s hold on the head, ~150 ms per cell scroll until
+  the tail is visible, 1 s hold on the tail, snap back. Row
+  position never moves while the marquee animates. When budget is
+  tight, the trailing timer indicator is dropped before the name is
+  truncated, since the name is the only identifier the row carries.
+
+## [0.0.2] - 2026-05-15
+
+### Added
+- `.mise.toml` pinning `zig = "0.15.2"` (the minimum version the
+  vendored Ghostty commit requires). Contributors run
+  `mise install` once; the Makefile picks up the resulting `zig`
+  binary automatically via `mise which zig` and falls back to
+  PATH when mise isn't available, so the existing build flow
+  still works.
+- ASCII-video stress benchmarks (`internal/app/bench_test.go`):
+  per-frame and per-stream variants at 30 / 60 / 120 fps targets,
+  three workload fixtures (8-colour cells, 24-bit truecolor cells,
+  and a Bad-Apple-style 1-bit pattern). Each stream benchmark
+  reports `µs/frame`, an achievable `fps_ceiling`, and `budget_pct`
+  so you can read off "do we hit N fps?" directly. A matching
+  Pipeline_ASCIIVideo_* set includes libghostty-vt's em.Write CGO
+  and an io.Discard stdout write so the FPS claim reflects the
+  whole pipeline, not just the renderer.
+- MCP `initialize.instructions`, the `spawn_agent` tool description
+  (visible to LLMs via `tools/list`), and the `help('spawning')`
+  topic now spell out — in the three places vendor TUIs actually
+  consult — that the connected `patterm` MCP server is the only
+  correct way to drive the host. Anti-patterns called out by name:
+  (a) trying to launch `patterm` / `patterm mcp-stdio` themselves,
+  (b) piping JSON-RPC into the per-PID Unix socket via `perl` /
+  `nc` / `socat` / `curl`, and (c) shelling out to `claude` /
+  `codex` / `opencode` to start a peer. Each of those bypasses
+  caller identity, so a sub-agent spawned that way reads back as
+  a stray top-level tab instead of a child under the spawning
+  agent. Codex was hitting (b) and (c) in practice — this is the
+  fix.
+- `--debug[=DIR]` flag captures detailed run artefacts for offline
+  analysis: a verbose `patterm.log` (the existing `PATTERM_DEBUG_LOG`
+  stream), an `events.jsonl` lifecycle log (spawn / exit / idle-state
+  changes with timestamps), and per-child `<id>.raw` files containing
+  the raw PTY byte stream. With no argument, the dated subdir
+  `$XDG_STATE_HOME/patterm/debug/YYYYMMDD-HHMMSS` is used; pass an
+  explicit path to override. All output goes to files — stdout/stderr
+  are untouched.
+- `--profile[=DIR]` flag captures pprof data plus concrete
+  performance counters for performance work: `cpu.pprof` (running
+  for the lifetime of the session), plus `heap.pprof` and
+  `goroutine.pprof` snapshots written on shutdown; alongside them,
+  a per-hot-path metrics tracker writes `metrics.jsonl` (one row
+  per second with chunk/byte rates, per-stage mean and max
+  latencies, and cache hit rates) plus a final `metrics.json`
+  aggregate and a human-readable `summary.txt` on exit.
+  Instrumented hot paths: `OnPTYOut`, viewport `renderer.Render`,
+  host stdout writes, libghostty-vt `emulator.Write` / `Title`,
+  sidebar / tab bar / status line draws (with cache-hit
+  accounting), snapshot replays, and the chrome ticker (so you can
+  see how often it fires with nothing to do). Defaults to
+  `$XDG_STATE_HOME/patterm/profile/YYYYMMDD-HHMMSS`. All
+  diagnostics (startup, errors) are written to `profile.log`
+  inside the dir, never to stdout/stderr.
+- Renderer benchmark suite (`internal/app/bench_test.go`). Three
+  workload fixtures — plain ASCII, SGR-styled lines, and a
+  ratatui-style cursor-shuffling burst — plus an OSC-gate
+  micro-benchmark. Run via `go test -bench=. -benchmem
+  ./internal/app/`. Gives a stable reference for the per-chunk
+  cost of the viewport renderer so future changes can be compared
+  apples-to-apples.
+- "New Terminal" entry in the command palette spawns a bare interactive
+  `$SHELL` pane (kind `terminal`). Unlike "Run process: …" presets,
+  which are session-persistent and reachable via `restart_process`,
+  terminals are ephemeral — once they exit they vanish from the
+  Processes sidebar instead of lingering as a dead row. The default
+  `shell` process preset that previously seeded on first run has been
+  removed; this entry replaces it.
 - Per-child idle-state classifier with five states (`idle`, `working`,
  `thinking`, `permission`, `error`) and three pluggable strategies:
  `output_activity` (claude / opencode defaults), `osc_title_stability`
@@ -98,6 +298,11 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
  after a child program disables mouse tracking.

 ### Changed
+- The palette's per-child "Kill <name>" action is now labelled
+  "Close <name>". The underlying signal (SIGTERM) and behaviour are
+  unchanged; the new label matches the existing "Close agent: …"
+  context entry and reads as less violent for what is really just a
+  graceful termination.
 - `timer_wait` is now a thin wrapper over the shared timer manager
  (`timer_set` semantics). Existing callers see no behavioural change;
  the timer is visible in `timer_list` while it's pending.
@@ -108,6 +313,47 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
  renders the canonical `--flag` form.

 ### Fixed
+- `make deps` now builds libghostty-vt with `-Doptimize=ReleaseFast`
+  instead of zig's silent `Debug` default, and resolves `zig`
+  through `mise` when a project `.mise.toml` pins it. The
+  default-Debug build shipped an unoptimised CSI/SGR parser that
+  ate 16-29 ms per 30-70 KiB full-screen frame in benchmarks,
+  capping the entire PTY-to-host pipeline at 34-63 fps. After the
+  rebuild the same pipeline runs at **930-2030 fps**: 27-32× the
+  prior throughput, and 7-16× margin over 120 fps for full-screen
+  truecolor ASCII video. Static library size drops from 33 MiB to
+  13 MiB. Override with `make deps GHOSTTY_VT_OPTIMIZE=Debug` only
+  when debugging the upstream library itself. Apply on existing
+  checkouts with `mise install && make clean-deps && make deps`.
+- Long claude session resume (and codex steady-state rendering) is
+  noticeably faster. Two costs that scaled per-PTY-chunk are now
+  deferred or short-circuited: (1) `drawSidebar()` used to run
+  synchronously for every chunk that scrolled — on a session
+  resume where every chunk scrolls, this rebuilt the full sidebar
+  string hundreds of times for a frame that was almost always
+  cache-equal. The sidebar now signals dirty and the chrome ticker
+  (60 Hz) handles the repaint. (2) `pumpChild` polled the
+  emulator's OSC title after every PTY chunk via CGO, even for
+  chunks (the common case under codex/ratatui) that carry no OSC
+  bytes at all. The poll is now gated on a containsOSC scan over
+  the chunk.
+- Click-and-drag text selection from alt-screen TUIs (codex in
+  particular) now works. Patterm used to keep host SGR mouse
+  reporting armed continuously, which forced the host terminal to
+  forward every click as an escape sequence and prevented native
+  selection. The host's mouse mode now follows the focused child's
+  screen side: primary-screen children keep mouse armed (so wheel
+  scrollback works), alt-screen children get host mouse disabled by
+  default. Alt-screen TUIs that need mouse events (vim, less, etc.)
+  re-enable mouse-mode themselves; the viewport renderer forwards
+  those toggles to the host while the child is on alt. Leaving alt
+  re-arms host mouse reporting so wheel scrollback resumes.
+- Exited terminal panes (kind `terminal`, including those launched via
+  the new "New Terminal" palette entry or MCP `spawn_process` with
+  `kind=terminal`) are now removed from the session and the Processes
+  sidebar as soon as they exit. Previously they stuck around as a
+  greyed-out row indistinguishable from an exited command process,
+  even though terminals have no restart path.
 - `whoami` and `help("timers")` now advertise the full Solo-parity timer
  surface (`timer_set`, `timer_fire_when_idle_any`,
  `timer_fire_when_idle_all`, `timer_cancel`, `timer_pause`,
--- a/Makefile
+++ b/Makefile
@@ -20,10 +20,30 @@ $(SOURCE)/.git/HEAD:

 deps-fetch: $(SOURCE)/.git/HEAD

+# Zig's `standardOptimizeOption` defaults to .Debug when no
+# -Doptimize is passed, which makes libghostty-vt's CSI/SGR parser
+# an order of magnitude slower — truecolor full-screen frames spend
+# ~16-29 ms each in em.Write under Debug (see
+# internal/app/bench_test.go BenchmarkEmulator_Write_*), which caps
+# the full PTY-to-host pipeline at ~34-63 fps. ReleaseFast is the
+# right default for the shipped artefact. Override with
+# `make deps GHOSTTY_VT_OPTIMIZE=Debug` when you actually want a
+# debug build of the upstream lib.
+GHOSTTY_VT_OPTIMIZE ?= ReleaseFast
+
+# Resolve zig via the project's mise pin (.mise.toml) when available,
+# falling back to whatever's on PATH. mise keeps the zig version in
+# lockstep with what the pinned ghostty commit requires; without it,
+# contributors have to chase the version requirement themselves.
+ZIG := $(shell command -v mise >/dev/null && mise which zig 2>/dev/null || command -v zig 2>/dev/null)
+
 $(INSTALL)/lib/libghostty-vt.a: $(SOURCE)/.git/HEAD
-	@command -v zig >/dev/null || { echo "ERROR: zig not on PATH (need >=0.15.2 to build libghostty-vt)"; exit 1; }
-	@echo ">> building libghostty-vt with zig"
-	@cd $(SOURCE) && zig build -Demit-lib-vt --prefix $(INSTALL)
+	@if [ -z "$(ZIG)" ]; then \
+		echo "ERROR: zig not available. Run \`mise install\` (see .mise.toml — needs zig 0.15.2) or install zig manually."; \
+		exit 1; \
+	fi
+	@echo ">> building libghostty-vt with $(ZIG) (optimize=$(GHOSTTY_VT_OPTIMIZE))"
+	@cd $(SOURCE) && $(ZIG) build -Demit-lib-vt -Doptimize=$(GHOSTTY_VT_OPTIMIZE) --prefix $(INSTALL)
 	@test -f $(INSTALL)/lib/libghostty-vt.a || { echo "ERROR: expected static lib at $(INSTALL)/lib/libghostty-vt.a"; exit 1; }
 	@echo ">> libghostty-vt installed under $(INSTALL)"
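
The frame-rate figures in the Makefile comment follow directly from per-frame cost: 16 ms per frame allows at most 62.5 fps, and 29 ms about 34 fps. A minimal Go sketch of that arithmetic is below. The helper names `fpsCeiling` and `budgetPct` are illustrative assumptions, not identifiers from `internal/app/bench_test.go`.

```go
package main

import "fmt"

// fpsCeiling returns the highest frame rate that is sustainable when
// every frame costs usPerFrame microseconds. (Hypothetical helper.)
func fpsCeiling(usPerFrame float64) float64 {
	return 1e6 / usPerFrame
}

// budgetPct returns how much of one frame's time budget at targetFPS
// a frame costing usPerFrame consumes, as a percentage.
// (Hypothetical helper.)
func budgetPct(usPerFrame, targetFPS float64) float64 {
	return usPerFrame / (1e6 / targetFPS) * 100
}

func main() {
	// Debug-build range quoted in the Makefile comment: 16-29 ms per frame.
	for _, us := range []float64{16000, 29000} {
		fmt.Printf("%6.0f us/frame: %.1f fps ceiling, %.0f%% of a 120 fps budget\n",
			us, fpsCeiling(us), budgetPct(us, 120))
	}
}
```

Anything above 100% of the budget means the target frame rate cannot be held, which is the situation the Debug build was in at 120 fps.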

--- a/TODO.md
+++ b/TODO.md
@@ -1,16 +1,102 @@
- [ ] We should probably rename the Kill <Process> terminology to Close <Process> instead, across processes and agents.
- [ ] Exited shells are still being treated as active processes. They should be removed from the process list when they exit.
- [ ] Shells should be renamed to terminals. "New Terminal" etc.
- [ ] Codex seemed to think that it needed to launch patterm itself to get the mcp working
- [ ] I cant click and drag to select text from codex
- [ ] codex uses perl to interact with the socket rather than calling mcp tools
-  - when it _did_ open a sub claude it opened it as a separate tab rather than a sub-agent.
- [ ] codex rendering is VERY slow
-  - maybe we need to use diffing rather than rendering the entire viewport for performance
- We should add a --debug and --profile flag, so we can get detailed performance data and full logs of the agent output to be debugged later on.
-  - I don't mind what format this is in, ideally easy for LLMs to understand
- [ ] Resuming a long claude session takes a couple of seconds for the entire buffer to load in, it looks like it's scrolling down for a couple seconds.
-  - In raw alacritty this is instant, so there's some sort of performance issue with patterm's terminal emulation.
+# Perf Audit (reviewed 2026-05-15)
+Findings that survived the 2026-05-15 review pass. Low and marginal
+items from the original sweep were removed; remaining items have enough
+measured or workflow evidence to justify action.
+
+Baseline benchmark numbers (`go test -bench=. ./internal/app/`, AMD
+Ryzen 7 7800X3D, libghostty-vt **ReleaseFast** after the Makefile
+fix landed):
+
+```
+# Renderer alone
+ViewportRenderer_PlainASCII       229 MB/s     1.3 KB/op    6 allocs/op
+ViewportRenderer_StyledLines       89 MB/s    91   KB/op  4325 allocs/op
+ViewportRenderer_RatatuiBurst      40 MB/s   365   KB/op 17306 allocs/op
+RendererThroughput_ReuseInstance   90 MB/s   316   KB/op 17380 allocs/op
+ContainsOSC_NoOSC                3050 MB/s     0   B/op     0 allocs/op
+
+# ASCII-video stream (renderer only — 3 sec at the target fps)
+ASCIIVideo_Stream_8Color_120fps     260 µs/frame  3845 fps_ceiling   3.1% budget
+ASCIIVideo_Stream_TrueColor_120fps  576 µs/frame  1735 fps_ceiling   6.9% budget
+
+# Full pipeline (em.Write + renderer + io.Discard write)
+Pipeline_ASCIIVideo_8Color_120fps     493 µs/frame  2030 fps_ceiling   5.9% budget
+Pipeline_ASCIIVideo_TrueColor_120fps 1075 µs/frame   931 fps_ceiling  12.9% budget
+
+# Emulator alone (libghostty-vt CSI/SGR parser)
+Emulator_Write_Stream_8Color_120fps    257 µs/frame  3890 fps_ceiling
+Emulator_Write_Stream_TrueColor_120fps 488 µs/frame  2051 fps_ceiling
+```
+
+The current pipeline still has large 120 fps headroom. The remaining
+renderer concern is multi-MiB styled replay latency and allocation
+churn, not normal steady-state frame budget.
+
+
+- [ ] **viewport renderer allocates heavily on SGR/CSI-heavy chunks.** [MEDIUM]
+  - Review evidence: five benchmark reps confirmed
+    `ViewportRenderer_StyledLines` at about 4,325 allocs per 16 KiB
+    chunk (~91.5 KB/op, roughly 1 alloc per 3.8 input bytes), and
+    `ViewportRenderer_RatatuiBurst` at about 17,306 allocs per chunk
+    (~365 KB/op). A 5 MiB styled resume benchmark allocated about
+    31 MB across 1.38M objects.
+  - Likely hot paths: generic CSI/SGR output in
+    `internal/app/viewport_renderer.go` sends many sequences through
+    `vr.shifter.Shift(vr.buf)`, while `internal/app/cursorshift.go`
+    returns a fresh `[]byte` via `pending.String()` on every
+    `Shift` call and parses CSI params through `string(raw)` /
+    `strings.Split`. The mode-helper `string(params)` conversions
+    are real, but probably not the main SGR-heavy cost.
+  - Fix direction: make `cursorShifter` write into caller-owned
+    scratch output or directly into the viewport renderer's pending
+    builder; parse CSI params from byte slices; pre-grow/reuse
+    renderer and shifter buffers. Re-run styled-lines, ratatui, and
+    5 MiB resume benchmarks; use pprof when available to confirm the
+    top allocation sites.
+
+- [ ] **large styled resume/replay dumps spend visible time in viewport rendering.** [MEDIUM]
+  - Review evidence: `BenchmarkSessionResume_5MiBStyled` measured
+    about 58 ms median and 63 ms p95 over five reps. The plain 5 MiB
+    benchmark was about 23-24 ms with only 21 allocs. The live path
+    renders focused PTY chunks through `renderer.Render`, then still
+    pays emulator writes, ring writes, event dispatch, stdout writes,
+    and real terminal paint.
+  - Scope: this is not a Codex steady-state throughput limit. A
+    100 KB/s stream is far below the styled renderer's ~80-90 MB/s
+    ceiling. It matters for multi-MiB burst replay, resume/startup
+    dumps, and dense full-screen churn.
+  - Fix direction: do the allocation fix first, since it should also
+    improve throughput. After that, invest further only if styled
+    resume traces remain user-visible or the styled-lines benchmark
+    is still under roughly 300 MB/s.
+
+- [ ] **wait_for_pattern re-scans the entire stream/grid while waiting.** [MEDIUM]
+  - `internal/app/host.go:476-493` (the `check` closure). On
+    `scope="scrollback"` it calls `c.StreamRead(0)` followed by
+    `stripANSIBytes(nil, b)`, so each check can copy, strip, and
+    search the full 1 MiB ring. On `scope="grid"` it calls
+    `PlainText()` and runs the regex against the full grid string.
+  - Caveat from review: the current chunk notifier coalesces bursts
+    with a buffered channel and has a 500 ms fallback, so this is not
+    necessarily one full scan per PTY chunk. It is still meaningful
+    for active waits on chatty panes.
+  - Fix direction: for `scrollback`, track the last checked stream
+    offset and search only new output plus a bounded overlap/scratch
+    buffer so matches spanning chunks are not missed. For `grid`,
+    dedupe on `ScreenVersion()` and skip work when the version has
+    not changed.
+
+- [ ] **search_output rebuilds and searches whole scrollback on every call.** [MEDIUM]
+  - `internal/app/host.go:428-437` compiles a fresh regex, reads the
+    stream from offset 0, strips ANSI for `kind="rendered"`, converts
+    the full buffer to a string, and splits it into lines before
+    applying `limit`. This is meaningful when agents poll the same
+    pattern; it is low impact for ad hoc searches.
+  - Fix direction: cache compiled regexes by pattern; cache stripped
+    rendered output by child id and stream end offset; avoid
+    `strings.Split` over the whole ring when only the first `limit`
+    matches are needed. Prefer an incremental search shape if this
+    becomes the standard "watch for marker" path.

 # On Hold
 - [ ] There's a unicode <?> being displayed in opencode [ON HOLD]
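
The `wait_for_pattern` fix direction in the TODO (search only new output plus a bounded overlap) could look roughly like the sketch below. It is an illustration under assumptions, not patterm code: `tailMatcher`, `newTailMatcher`, and `Feed` are hypothetical names, ANSI stripping is left out, and a match that begins more than `overlap` bytes before the current chunk is still missed.

```go
package main

import (
	"fmt"
	"regexp"
)

// tailMatcher is a hypothetical helper. It searches only newly
// arrived bytes plus a bounded tail of the output it already saw.
type tailMatcher struct {
	re      *regexp.Regexp
	overlap int    // max bytes carried over between chunks
	tail    []byte // last <= overlap bytes of already-scanned output
}

func newTailMatcher(pattern string, overlap int) (*tailMatcher, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return &tailMatcher{re: re, overlap: overlap}, nil
}

// Feed scans tail+chunk and reports whether the pattern matched.
// A match that starts more than overlap bytes before this chunk
// is not found.
func (m *tailMatcher) Feed(chunk []byte) bool {
	window := append(m.tail, chunk...)
	matched := m.re.Match(window)
	if len(window) > m.overlap {
		window = window[len(window)-m.overlap:]
	}
	// Copy so the retained tail does not pin the larger window.
	m.tail = append([]byte(nil), window...)
	return matched
}

func main() {
	m, err := newTailMatcher(`finished OK`, 16)
	if err != nil {
		panic(err)
	}
	// The marker is split across two PTY chunks.
	fmt.Println(m.Feed([]byte("build fin")))
	fmt.Println(m.Feed([]byte("ished OK\n")))
}
```

Each check then costs time proportional to the chunk size plus `overlap` rather than the full 1 MiB ring. The overlap has to be chosen at least as large as the longest match the caller expects.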
--- a/cmd/patterm/main.go
+++ b/cmd/patterm/main.go
@@ -16,7 +16,10 @@ import (
 	"context"
 	"fmt"
 	"os"
+	"path/filepath"
+	"runtime"
 	"runtime/debug"
+	"runtime/pprof"
 	"time"

 	flag "github.com/spf13/pflag"
@@ -49,7 +52,13 @@ func main() {
 	var (
 		projectDir  = flag.String("project", "", "project directory (default $PWD)")
 		showVersion = flag.Bool("version", false, "print version and exit")
+		debugDir    = flag.String("debug", "", "write debug logs + per-child raw PTY output to DIR (auto-picks a dated subdir under $XDG_STATE_HOME/patterm/debug when DIR is omitted)")
+		profileDir  = flag.String("profile", "", "write pprof files (cpu/heap/goroutine) and live perf counters (metrics.jsonl per-second, metrics.json + summary.txt on exit) to DIR (auto-picks a dated subdir under $XDG_STATE_HOME/patterm/profile when DIR is omitted)")
 	)
+	// Allow bare `--debug` / `--profile` with no value: NoOptDefVal makes
+	// pflag substitute "auto", which resolveDiagDir maps to a dated
+	// default dir. An explicit DIR must use the `--debug=DIR` form.
+	flag.Lookup("debug").NoOptDefVal = "auto"
+	flag.Lookup("profile").NoOptDefVal = "auto"
 	flag.Parse()

 	if *showVersion {
@@ -73,15 +82,104 @@ func main() {
 		die("chdir %s: %v", cwd, err)
 	}

+	resolvedDebug, err := resolveDiagDir(*debugDir, "debug")
+	if err != nil {
+		die("debug: %v", err)
+	}
+	resolvedProfile, err := resolveDiagDir(*profileDir, "profile")
+	if err != nil {
+		die("profile: %v", err)
+	}
+
+	stopProfile := startProfile(resolvedProfile)
+	defer stopProfile()
+
 	ctx := context.Background()
 	if err := app.Run(ctx, app.Options{
 		ProjectDir: cwd,
 		ProjectKey: key,
+		DebugDir:   resolvedDebug,
+		ProfileDir: resolvedProfile,
 	}); err != nil {
 		die("%v", err)
 	}
 }

+// resolveDiagDir turns the raw flag value into a directory path.
+// Empty string disables the feature. The sentinel "auto" (set by
+// NoOptDefVal on bare flags) picks $XDG_STATE_HOME/patterm/<kind>/<ts>.
+// Any other value is treated as a literal path and returned as-is;
+// it is not made absolute.
+func resolveDiagDir(raw, kind string) (string, error) {
+	if raw == "" {
+		return "", nil
+	}
+	if raw == "auto" {
+		base := os.Getenv("XDG_STATE_HOME")
+		if base == "" {
+			home, err := os.UserHomeDir()
+			if err != nil {
+				return "", err
+			}
+			base = filepath.Join(home, ".local", "state")
+		}
+		ts := time.Now().Format("20060102-150405")
+		return filepath.Join(base, "patterm", kind, ts), nil
+	}
+	return raw, nil
+}
+
+// startProfile begins a CPU profile under dir and returns a stop func
+// that flushes the CPU file, then writes heap + goroutine snapshots.
+// Returns a no-op stop func when dir is empty. All diagnostics are
+// written to <dir>/profile.log — never to stdout/stderr — so the TUI
+// stays uncluttered.
+func startProfile(dir string) func() {
+	if dir == "" {
+		return func() {}
+	}
+	if err := os.MkdirAll(dir, 0o700); err != nil {
+		return func() {}
+	}
+	logPath := filepath.Join(dir, "profile.log")
+	plog := func(format string, args ...any) {
+		f, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
+		if err != nil {
+			return
+		}
+		defer f.Close()
+		fmt.Fprintf(f, format+"\n", args...)
+	}
+	cpuPath := filepath.Join(dir, "cpu.pprof")
+	f, err := os.Create(cpuPath)
+	if err != nil {
+		plog("cpu open: %v", err)
+		return func() {}
+	}
+	if err := pprof.StartCPUProfile(f); err != nil {
+		plog("cpu start: %v", err)
+		_ = f.Close()
+		return func() {}
+	}
+	plog("profiling started at %s", time.Now().Format(time.RFC3339Nano))
+	return func() {
+		pprof.StopCPUProfile()
+		_ = f.Close()
+		// Heap and goroutine snapshots at exit. Heap captures
+		// steady-state allocation; goroutine catches stragglers
+		// that didn't get cleaned up.
+		runtime.GC()
+		if hf, err := os.Create(filepath.Join(dir, "heap.pprof")); err == nil {
+			_ = pprof.Lookup("heap").WriteTo(hf, 0)
+			_ = hf.Close()
+		}
+		if gf, err := os.Create(filepath.Join(dir, "goroutine.pprof")); err == nil {
+			_ = pprof.Lookup("goroutine").WriteTo(gf, 0)
+			_ = gf.Close()
+		}
+		plog("profiling stopped at %s", time.Now().Format(time.RFC3339Nano))
+	}
+}
+
 func runMCPProxy() {
 	var (
 		socket   = flag.String("socket", "", "path to the running patterm process's MCP socket")
--- a/internal/app/app.go
+++ b/internal/app/app.go
@@ -29,6 +29,17 @@ import (
 type Options struct {
 	ProjectDir string
 	ProjectKey string
+	// DebugDir, when non-empty, enables verbose debug logging to
+	// <DebugDir>/patterm.log and per-child raw PTY output capture to
+	// <DebugDir>/<child-id>.raw. The dir is created if missing. Events
+	// (spawn / exit / state change) land in <DebugDir>/events.jsonl.
+	DebugDir string
+	// ProfileDir, when non-empty, enables in-process performance
+	// counters. patterm writes a per-second JSONL snapshot stream to
+	// <ProfileDir>/metrics.jsonl, a final aggregate to metrics.json,
+	// and a human-readable summary.txt on shutdown. The pprof files
+	// written by --profile sit alongside these in the same dir.
+	ProfileDir string
 }

 const keyCtrlK byte = 0x0b
@@ -44,6 +55,10 @@ func Run(ctx context.Context, opts Options) error {
 	if err != nil {
 		return fmt.Errorf("app: load presets: %w", err)
 	}
+	appSettings, settingsPath, err := loadSettings()
+	if err != nil {
+		logf("settings load: %v", err)
+	}

 	// Ensure the per-project scratchpad dir exists so MCP and the UI
 	// can read/write into it. SPEC §3.
@@ -77,6 +92,22 @@ func Run(ctx context.Context, opts Options) error {

 	sess := NewSession(opts.ProjectDir, opts.ProjectKey)
 	defer sess.Shutdown()
+
+	// Debug capture: when --debug=<dir> is set, write a verbose log
+	// (patterm.log), per-child raw PTY output (<id>.raw), and a
+	// JSONL event stream (events.jsonl). Installed before the TUI
+	// listener so the very first OnChildSpawned / OnPTYOut event
+	// is captured.
+	if opts.DebugDir != "" {
+		dc, err := openDebugCapture(opts.DebugDir)
+		if err != nil {
+			return fmt.Errorf("app: debug capture: %w", err)
+		}
+		os.Setenv("PATTERM_DEBUG_LOG", dc.LogPath())
+		sess.Subscribe(dc)
+		defer dc.Close()
+		logf("debug capture enabled at %s", opts.DebugDir)
+	}
 	// Snapshot persisted processes BEFORE attaching the store: Spawn
 	// mints fresh ids, so the old records would otherwise linger
 	// alongside the new ones. Drop them up front; the restore loop
@@ -113,29 +144,61 @@ func Run(ctx context.Context, opts Options) error {
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel()

+	// Performance tracker — instrumented hot-path timings written to
+	// <ProfileDir>. nil when --profile is off, in which case every
+	// record*() call is a fast nil check.
+	metrics, err := newMetricsTracker(opts.ProfileDir)
+	if err != nil {
+		return fmt.Errorf("app: metrics tracker: %w", err)
+	}
+	if metrics != nil {
+		go metrics.run(ctx)
+		defer metrics.close()
+	}
+
 	// Per-session idle-detection classifier. One goroutine ticks every
 	// 250ms over every live child and updates IdleState. It stops when
 	// ctx is cancelled.
 	go sess.runClassifier(ctx)

 	st := &uiState{
-		sess:       sess,
-		presets:    presets,
-		launcher:   launcher,
-		pads:       pads,
-		chromeWake: make(chan struct{}, 1),
-		trust:      trustStore,
-		timers:     host.timers,
-		hostCols:   cols,
-		hostRows:   rows,
-		stdinTTY:   term.IsTerminal(int(os.Stdin.Fd())),
+		sess:         sess,
+		presets:      presets,
+		launcher:     launcher,
+		pads:         pads,
+		chromeWake:   make(chan struct{}, 1),
+		trust:        trustStore,
+		timers:       host.timers,
+		hostCols:     cols,
+		hostRows:     rows,
+		stdinTTY:     term.IsTerminal(int(os.Stdin.Fd())),
+		metrics:      metrics,
+		settings:     appSettings,
+		settingsPath: settingsPath,
+		ctx:          ctx,
 	}
+	st.summaries = newSummaryManager(sess, opts.ProjectDir, presets, func() autoSummarySettings {
+		st.settingsMu.Lock()
+		defer st.settingsMu.Unlock()
+		return st.settings.AutoSummary.clone()
+	}, func() {
+		st.markChromeDirty()
+		st.markSidebarDirty()
+	}, func(_ string, result summaryState) {
+		if result.Error != "" {
+			st.flashError(fmt.Sprintf("summary: %v", result.Error))
+			return
+		}
+		st.flashTransient("summary updated")
+	})
+	sess.SetMetrics(metrics)
 	host.attention = st
 	host.focus = st
 	host.prompter = st
 	host.scratch = st
 	st.lastExit.Store(-1)
 	sess.Subscribe(st)
+	go st.summaries.run(ctx)

 	st.enterScreen()
 	st.renderEmptyState()
@@ -248,11 +311,42 @@ func Run(ctx context.Context, opts Options) error {
 			case <-st.chromeWake:
 			case <-ticker.C:
 			}
-			if !st.chromeDirty.Swap(false) {
+			chromeChanged := st.chromeDirty.Swap(false)
+			sidebarChanged := st.sidebarDirty.Swap(false)
+			didWork := chromeChanged || sidebarChanged
+			st.metrics.recordTickerFire(didWork)
+			if !didWork {
 				continue
 			}
-			st.drawTabBar()
-			st.drawStatusLine()
+			if chromeChanged {
+				st.drawTabBar()
+				st.drawStatusLine()
+			}
+			if sidebarChanged {
+				st.drawSidebar()
+			}
+		}
+	}()
+
+	// Marquee ticker: while a focused sidebar row's name overflows the
+	// rail width, advance the pause-scroll-pause animation by marking
+	// the sidebar dirty every marqueeStep. The chrome ticker above does
+	// the actual repaint. When no row is animating, this is a single
+	// cheap wakeup with no work.
+	wg.Add(1)
+	go func() {
+		defer wg.Done()
+		ticker := time.NewTicker(marqueeStep)
+		defer ticker.Stop()
+		for {
+			select {
+			case <-ctx.Done():
+				return
+			case <-ticker.C:
+			}
+			if st.marquee.active() {
+				st.markSidebarDirty()
+			}
 		}
 	}()

@@ -326,7 +420,6 @@ type uiState struct {
 	// switch resets the offset cleanly.
 	padOffsetName string

-
 	// activeAgentID tracks which top-level agent tab "owns" the agent
 	// tree section of the sidebar. It only updates when focus lands on
 	// an agent (or one of its sub-agents), so the agent tree stays
@@ -355,6 +448,17 @@ type uiState struct {
 	hostCols, hostRows uint16
 	stdinTTY           bool

+	// metrics is the optional performance tracker. nil when --profile
+	// is off. Hot paths call metrics.recordX which is a fast nil
+	// check on the disabled path.
+	metrics *metricsTracker
+
+	settingsMu   sync.Mutex
+	settings     settings
+	settingsPath string
+	ctx          context.Context
+	summaries    *summaryManager
+
 	// chromeCacheMu guards the last-rendered byte cache for each chrome
 	// element. The tab bar, sidebar, and status line all repaint on
 	// many state changes and on every PTY chunk, but their content
@@ -372,7 +476,19 @@ type uiState struct {
 	// sensitive paths (owner flip, attention, trust, focus change)
 	// continue to call drawStatusLine / drawTabBar synchronously.
 	chromeDirty atomic.Bool
-	chromeWake  chan struct{}
+	// sidebarDirty defers sidebar repaints off the per-chunk hot path
+	// in the same way. A long claude session resume — where every PTY
+	// chunk scrolls the viewport — used to call drawSidebar()
+	// synchronously per chunk, which dominated the resume's wall time
+	// (hundreds of full-sidebar rebuilds for a frame that was almost
+	// always cache-equal).
+	sidebarDirty atomic.Bool
+	chromeWake   chan struct{}
+
+	// marquee animates the focused sidebar row's name when it overflows
+	// the rail width. The marquee ticker in Run (150ms) flips
+	// sidebarDirty while a row is animating; the idle case is free.
+	marquee marqueeState

 	// padsCacheMu guards the cached scratchpad listing. The sidebar
 	// and palette/sidebar nav helpers read it on every chunk-driven
@@ -389,6 +505,33 @@ func (st *uiState) dbgf(format string, args ...any) {
 	logf(format, args...)
 }

+func (st *uiState) activeSummaryText(width int) string {
+	if width <= 0 || st.summaries == nil {
+		return ""
+	}
+	st.settingsMu.Lock()
+	enabled := st.settings.AutoSummary.Enabled
+	st.settingsMu.Unlock()
+	if !enabled {
+		return ""
+	}
+	st.mu.Lock()
+	active := st.activeAgentID
+	st.mu.Unlock()
+	if active == "" {
+		return ""
+	}
+	sum := st.summaries.Summary(active)
+	text := strings.TrimSpace(sum.Text)
+	if text == "" {
+		return ""
+	}
+	if visibleLen(text) > width {
+		text = clipRunes(text, width-1) + "…"
+	}
+	return text
+}
+
 // trustRequest is one outstanding SPEC §7 trust prompt: an agent tried
 // to spawn / start / restart against an untrusted command preset and
 // the host wants user confirmation before the next attempt succeeds.
@@ -414,15 +557,20 @@ func (st *uiState) focusProcess(processID string) {
 	if c == nil {
 		return
 	}
+	st.marquee.reset()
 	layout := st.layoutSnapshot()
+	onAlt := childIsOnAlt(c)
 	st.mu.Lock()
 	leavingPad := st.focusedPad != ""
 	st.focusedPad = ""
 	st.focusedID = c.ID
 	st.focusedName = c.DisplayName()
 	st.updateActiveAgentLocked(c)
-	st.renderer = newViewportRenderer(layout)
+	r := newViewportRenderer(layout)
+	r.SetChildOnAlt(onAlt)
+	st.renderer = r
 	st.mu.Unlock()
+	st.syncHostMouseForChild(onAlt)
 	// Wipe whatever the previous focus (PTY child or pad view) left in
 	// the viewport before painting the new child's snapshot.
 	if leavingPad {
@@ -434,6 +582,41 @@ func (st *uiState) focusProcess(processID string) {
 	st.drawStatusLine()
 }

+// childIsOnAlt reports whether the child's emulator is currently on
+// its alternate screen. Returns false if the emulator is gone or the
+// query fails.
+func childIsOnAlt(c *Child) bool {
+	if c == nil {
+		return false
+	}
+	em := c.Emulator()
+	if em == nil {
+		return false
+	}
+	sc, err := em.ActiveScreen()
+	if err != nil {
+		return false
+	}
+	return sc == vt.ScreenAlternate
+}
+
+// syncHostMouseForChild emits the host mouse-reporting toggle that
+// matches a newly-focused child's screen side. Primary-screen children
+// want host mouse armed so the wheel drives inline scrollback; alt-
+// screen children get host mouse disabled by default so click-and-drag
+// selection works. Alt-screen TUIs that need mouse (vim, ranger, etc.)
+// re-enable it themselves, and the viewport renderer forwards those
+// toggles back to the host.
+func (st *uiState) syncHostMouseForChild(onAlt bool) {
+	st.outMu.Lock()
+	defer st.outMu.Unlock()
+	if onAlt {
+		_, _ = os.Stdout.WriteString("\x1b[?1000l\x1b[?1006l")
+	} else {
+		_, _ = os.Stdout.WriteString("\x1b[?1000h\x1b[?1006h")
+	}
+}
+
 // focusScratchpad shifts focus to a scratchpad. The main viewport
 // renders the pad's text instead of any child PTY; PTY output for the
 // previously focused child is dropped until focus moves back to a
@@ -442,6 +625,7 @@ func (st *uiState) focusScratchpad(name string) {
 	if name == "" {
 		return
 	}
+	st.marquee.reset()
 	st.mu.Lock()
 	if st.padOffsetName != name {
 		st.padOffset = 0
@@ -485,6 +669,7 @@ func (st *uiState) restartFocusedCommand(processID string) {
 	if c == nil || c.Kind != KindCommand {
 		return
 	}
+	st.marquee.reset()
 	layout := st.layoutSnapshot()
 	renderer := newViewportRenderer(layout)
 	st.mu.Lock()
@@ -569,15 +754,39 @@ func (st *uiState) scratchpadsChanged() {
 	}
 }

-// OnChildSpawned auto-focuses the new child.
+// OnChildSpawned auto-focuses the new child when the spawn came from
+// the user (palette, persistence restore, or an external MCP client with
+// no resolved identity). When ParentID is set — meaning a patterm-managed
+// agent spawned this child via spawn_agent/spawn_process — focus stays
+// on whatever the user was watching; the new child is still surfaced in
+// the sidebar/tab bar so it's reachable via the palette or select_process.
 func (st *uiState) OnChildSpawned(c *Child) {
+	if st.summaries != nil {
+		st.summaries.RegisterChild(c)
+	}
+	if c.ParentID != "" {
+		st.mu.Lock()
+		if st.palette != nil {
+			st.palette.children = st.sess.Children()
+			st.palette.focused = st.focusedID
+			st.palette.rebuild()
+			st.renderPaletteLocked()
+		}
+		st.mu.Unlock()
+		st.drawTabBar()
+		st.drawSidebar()
+		return
+	}
+	st.marquee.reset()
 	layout := st.layoutSnapshot()
+	onAlt := childIsOnAlt(c)
 	st.mu.Lock()
 	st.focusedPad = ""
 	st.focusedID = c.ID
 	st.focusedName = c.DisplayName()
 	st.updateActiveAgentLocked(c)
 	renderer := newViewportRenderer(layout)
+	renderer.SetChildOnAlt(onAlt)
 	st.renderer = renderer
 	palOpen := st.palette != nil
 	if palOpen {
@@ -611,6 +820,7 @@ func (st *uiState) OnChildSpawned(c *Child) {
 		st.outMu.Unlock()
 	}

+	st.syncHostMouseForChild(onAlt)
 	st.moveToViewportOrigin()
 	st.drawTabBar()
 	st.drawSidebar()
@@ -628,7 +838,11 @@ func (st *uiState) OnChildStateChanged(string, IdleState) {
 // OnChildExited drops focus and shows the empty state if it was the
 // focused child.
 func (st *uiState) OnChildExited(c *Child) {
+	if st.summaries != nil {
+		st.summaries.UnregisterChild(c.ID)
+	}
 	st.lastExit.Store(int32(c.ExitCode()))
+	st.marquee.reset()
 	layout := st.layoutSnapshot()
 	renderEmpty := false
 	st.mu.Lock()
@@ -710,6 +924,13 @@ func (st *uiState) scheduleAutoRestart(c *Child) {
 // disabled only around the replay so long styled runs cannot wrap into
 // the right rail.
 func (st *uiState) OnPTYOut(childID string, chunk []byte) {
+	var entry time.Time
+	if st.metrics != nil {
+		entry = time.Now()
+	}
+	if st.summaries != nil {
+		st.summaries.ObserveOutput(childID)
+	}
 	layout := st.layoutSnapshot()
 	st.mu.Lock()
 	focus := st.focusedID
@@ -726,16 +947,31 @@ func (st *uiState) OnPTYOut(childID string, chunk []byte) {
 	}
 	st.mu.Unlock()
 	if palOpen || focus != childID || renderer == nil {
+		st.metrics.recordPTYOutDrop()
 		return
 	}
 	var out []byte
 	if forceRepaint {
+		var snapStart time.Time
+		if st.metrics != nil {
+			snapStart = time.Now()
+		}
 		out = st.renderFocusedSnapshot(childID, renderer, layout)
+		if st.metrics != nil {
+			st.metrics.recordSnapshot(time.Since(snapStart))
+		}
 		if len(out) == 0 {
 			return
 		}
 	} else {
+		var rstart time.Time
+		if st.metrics != nil {
+			rstart = time.Now()
+		}
 		out = renderer.Render(chunk)
+		if st.metrics != nil {
+			st.metrics.recordRender(time.Since(rstart))
+		}
 	}
 	// One write covers the autowrap-disable prelude, the chunk, and the
 	// autowrap-restore postlude — three syscalls collapsed into one
@@ -745,9 +981,16 @@ func (st *uiState) OnPTYOut(childID string, chunk []byte) {
 	wrapped = append(wrapped, "\x1b[?7l"...)
 	wrapped = append(wrapped, out...)
 	wrapped = append(wrapped, "\x1b[?7h"...)
+	var wstart time.Time
+	if st.metrics != nil {
+		wstart = time.Now()
+	}
 	st.outMu.Lock()
 	_, _ = os.Stdout.Write(wrapped)
 	st.outMu.Unlock()
+	if st.metrics != nil {
+		st.metrics.recordStdout(time.Since(wstart), len(wrapped))
+	}
 	// RI / IND / NEL / SU / SD / IL / DL and bottom-margin LF / VT / FF
 	// scroll content within the host's scroll region, which spans every
 	// column — so any of them drags the right-hand sidebar's session-tree
@@ -760,15 +1003,23 @@ func (st *uiState) OnPTYOut(childID string, chunk []byte) {
 		st.chromeCacheMu.Lock()
 		st.sidebarCache = ""
 		st.chromeCacheMu.Unlock()
-		// Scrolled chunks can clobber the sidebar columns; repaint
-		// synchronously so the gap fills before the next chunk lands.
-		st.drawSidebar()
+		// Defer the sidebar repaint to the chrome ticker. On a long
+		// session resume every PTY chunk scrolls, and a synchronous
+		// drawSidebar() per chunk dominates wall time even when the
+		// frame ends up cache-equal — the rebuild work is unconditional.
+		// The chrome ticker drains the dirty flag at ~60 Hz, so the
+		// visible gap a scrolled chunk can leave in the sidebar columns
+		// is bounded by one frame.
+		st.markSidebarDirty()
 	}
 	// Defer the tab bar + status line repaint to the chrome ticker.
 	// The cached frame already short-circuits the wire write, but
 	// avoiding the string build, FindChild, and locking on every
 	// chunk pulls steady-state CPU off the hot path.
 	st.markChromeDirty()
+	if st.metrics != nil {
+		st.metrics.recordPTYOut(time.Since(entry), len(chunk))
+	}
 }

 func (st *uiState) enterScreen() {
@@ -866,6 +1117,18 @@ func (st *uiState) markChromeDirty() {
 	}
 }

+// markSidebarDirty schedules a sidebar repaint on the next ticker
+// frame. Hot path — every scrolled PTY chunk lands here. Synchronous
+// repaints from latency-sensitive sites (spawn, exit, focus, state
+// change, trust) keep calling drawSidebar directly.
+func (st *uiState) markSidebarDirty() {
+	st.sidebarDirty.Store(true)
+	select {
+	case st.chromeWake <- struct{}{}:
+	default:
+	}
+}
+
 func (st *uiState) invalidateChromeCache() {
 	st.chromeCacheMu.Lock()
 	st.tabBarCache = ""
@@ -896,6 +1159,10 @@ func (st *uiState) renderPaletteLocked() {
 // attention ask. Right side: palette hint. The PTY child occupies
 // host_rows-1 rows so this row is exclusively ours.
 func (st *uiState) drawStatusLine() {
+	var entry time.Time
+	if st.metrics != nil {
+		entry = time.Now()
+	}
 	st.mu.Lock()
 	palOpen := st.palette != nil
 	focusID := st.focusedID
@@ -982,10 +1249,16 @@ func (st *uiState) drawStatusLine() {
 	st.chromeCacheMu.Lock()
 	if line == st.statusLineCache {
 		st.chromeCacheMu.Unlock()
+		if st.metrics != nil {
+			st.metrics.recordStatus(time.Since(entry), true)
+		}
 		return
 	}
 	st.statusLineCache = line
 	st.chromeCacheMu.Unlock()
+	if st.metrics != nil {
+		defer func() { st.metrics.recordStatus(time.Since(entry), false) }()
+	}

 	st.outMu.Lock()
 	defer st.outMu.Unlock()
@@ -1131,6 +1404,15 @@ func (st *uiState) processStdin(chunk []byte) {
 	}

 	forward := make([]byte, 0, len(chunk))
+
+	var pendingAction *paletteAction
+	var pendingNav navEntry
+	var pendingRestartID string
+	var pendingViewportDelta int
+	var pendingViewportBottom bool
+	var pendingPadStep int
+	var pendingPadExit bool
+
 	flushForward := func() {
 		if len(forward) == 0 {
 			return
@@ -1142,22 +1424,22 @@ func (st *uiState) processStdin(chunk []byte) {
 				// writes so claude / codex / opencode don't treat a
 				// "text\r" batch as a paste.
 				_ = c.InjectAsUser(forward)
+				if st.summaries != nil {
+					st.summaries.ObserveHumanInput(c.ID, forward)
+				}
 				if prev != OwnerUser {
 					go st.drawStatusLine()
 				}
+				// Auto-snap the emulator viewport to the live area
+				// on any forwarded keystroke. Without this, typing
+				// while scrolled into history leaves the cursor /
+				// echoed bytes off-screen below the visible region.
+				pendingViewportBottom = true
 			}
 		}
 		forward = forward[:0]
 	}

-	var pendingAction *paletteAction
-	var pendingNav navEntry
-	var pendingRestartID string
-	var pendingViewportDelta int
-	var pendingViewportBottom bool
-	var pendingPadStep int
-	var pendingPadExit bool
-
 	// childOnPrimary captures whether the focused child is on its primary
 	// screen at the start of this chunk. Wheel events on the primary
 	// screen scroll the emulator viewport (inline scrollback); on the
@@ -1547,7 +1829,10 @@ func (st *uiState) scrollFocusedViewportToBottom() {
 }

 func (st *uiState) openPaletteLocked() {
-	st.palette = newPalette(st.sess.Children(), st.focusedID, st.focusedPad, st.presets)
+	st.settingsMu.Lock()
+	appSettings := st.settings.clone()
+	st.settingsMu.Unlock()
+	st.palette = newPalette(st.sess.Children(), st.focusedID, st.focusedPad, st.presets, appSettings)
 	// Push a "no kitty flags" entry onto the host terminal's keyboard
 	// stack so palette input arrives in plain legacy form regardless of
 	// what the focused child pushed. Codex/ratatui enables kitty mode
@@ -1622,6 +1907,13 @@ func (st *uiState) closePalette(action paletteAction) {
 			st.flashError(fmt.Sprintf("spawn %s: %v", action.preset.Name, err))
 		}

+	case "spawn-terminal":
+		l := st.layoutSnapshot()
+		st.launcher.SetSize(l.childCols(), l.childRows())
+		if _, err := st.launcher.LaunchTerminal(nil, "terminal", "", "", nil); err != nil {
+			st.flashError(fmt.Sprintf("spawn terminal: %v", err))
+		}
+
 	case "spawn-process-submit":
 		if action.command == "" {
 			restoreView()
@@ -1713,9 +2005,85 @@ func (st *uiState) closePalette(action paletteAction) {

 	case "proc-restart":
 		st.handleProcRestart(action.childID)
+
+	case "settings-close":
+		st.applySettingsAction(action)
+		restoreView()
+		st.drawTabBar()
+		st.drawSidebar()
+		st.drawStatusLine()
+
+	case "settings-test":
+		st.applySettingsAction(action)
+		restoreView()
+		st.drawTabBar()
+		st.drawSidebar()
+		st.drawStatusLine()
+		go st.testSummarizer()
+
+	case "settings-run-now":
+		st.applySettingsAction(action)
+		restoreView()
+		st.drawTabBar()
+		st.drawSidebar()
+		st.drawStatusLine()
+		st.runSummaryNow()
 	}
 }

+func (st *uiState) applySettingsAction(action paletteAction) {
+	if action.settings == nil {
+		return
+	}
+	next := action.settings.clone()
+	st.settingsMu.Lock()
+	path := st.settingsPath
+	st.settingsMu.Unlock()
+	if err := saveSettings(path, next); err != nil {
+		st.flashError(fmt.Sprintf("save settings: %v", err))
+		return
+	}
+	st.settingsMu.Lock()
+	st.settings = next
+	st.settingsMu.Unlock()
+}
+
+func (st *uiState) testSummarizer() {
+	if st.summaries == nil {
+		return
+	}
+	base := st.ctx
+	if base == nil {
+		base = context.Background()
+	}
+	ctx, cancel := context.WithTimeout(base, summaryTimeout)
+	defer cancel()
+	if err := st.summaries.Test(ctx); err != nil {
+		st.flashError(fmt.Sprintf("summarizer test: %v", err))
+		return
+	}
+	st.flashTransient("summarizer test passed")
+}
+
+func (st *uiState) runSummaryNow() {
+	if st.summaries == nil {
+		return
+	}
+	st.mu.Lock()
+	active := st.activeAgentID
+	st.mu.Unlock()
+	if active == "" {
+		st.flashError("no active top-level agent to summarize")
+		return
+	}
+	ctx := st.ctx
+	if ctx == nil {
+		ctx = context.Background()
+	}
+	st.summaries.RunNow(ctx, active)
+	st.flashTransient("summary requested")
+}
+
 func (st *uiState) handlePadDelete(name string) {
 	if name == "" || st.pads == nil {
 		st.repaintFocused()
@@ -1900,8 +2268,17 @@ func (st *uiState) flashError(msg string) {
 	st.mu.Lock()
 	st.attentionText = msg
 	st.attentionAt = "" // shows on every focus until cleared
+	focusedPad := st.focusedPad
+	focusedID := st.focusedID
 	st.mu.Unlock()
-	st.renderEmptyState()
+	switch {
+	case focusedPad != "":
+		st.repaintFocusedPad()
+	case focusedID != "":
+		st.repaintFocused()
+	default:
+		st.renderEmptyState()
+	}
 	st.drawTabBar()
 	st.drawSidebar()
 	st.drawStatusLine()
--- a/internal/app/bench_test.go
+++ b/internal/app/bench_test.go
@@ -0,0 +1,546 @@
+package app
+
+import (
+	"fmt"
+	"io"
+	"strings"
+	"testing"
+
+	"github.com/hjbdev/patterm/internal/vt"
+)
+
+// Benchmarks for patterm's hot paths. Run with:
+//
+//	go test -bench=. -benchmem ./internal/app/
+//
+// or target one:
+//
+//	go test -bench=BenchmarkViewportRenderer_PlainASCII -benchmem ./internal/app/
+//
+// The fixtures below model the four workloads we care about most:
+//
+//   - PlainASCII: long-running text output (claude streaming a code
+//     diff, codex outputting a tool result body). Fast-path territory.
+//   - StyledLines: SGR-heavy output (claude/codex chat history with
+//     coloured tokens). State-machine path.
+//   - RatatuiBurst: many short cursor-positioning / SGR transitions in
+//     a tight chunk, matching codex/ratatui's incremental diff
+//     updates.
+//   - SnapshotReplay: full styled-grid replay (focus switch).
+
+// buildPlainASCIIChunk returns a roughly N-byte chunk of pure
+// printable ASCII text with the occasional newline — the cheapest
+// workload, exercises the fast path in viewport_renderer.Render.
+func buildPlainASCIIChunk(n int) []byte {
+	var b strings.Builder
+	b.Grow(n)
+	line := "The quick brown fox jumps over the lazy dog 0123456789 "
+	for b.Len() < n {
+		b.WriteString(line)
+		if b.Len()%80 < len(line) {
+			b.WriteByte('\n')
+		}
+	}
+	return []byte(b.String()[:n])
+}
+
+// buildStyledLinesChunk simulates SGR-heavy output: every word wears
+// a colour, so the renderer breaks out of its fast path on every
+// escape sequence.
+func buildStyledLinesChunk(n int) []byte {
+	var b strings.Builder
+	b.Grow(n)
+	colours := []string{"31", "32", "33", "34", "35", "36"}
+	words := []string{"package", "func", "return", "import", "struct", "type", "const", "var"}
+	i := 0
+	for b.Len() < n {
+		fmt.Fprintf(&b, "\x1b[%sm%s\x1b[0m ", colours[i%len(colours)], words[i%len(words)])
+		if i%10 == 9 {
+			b.WriteByte('\n')
+		}
+		i++
+	}
+	return []byte(b.String()[:n])
+}
+
+// buildRatatuiBurst simulates a single ratatui-style diff frame:
+// CUP, SGR, a few chars, CUP, SGR, a few chars… for a viewport's
+// worth of cells.
+func buildRatatuiBurst(cells int) []byte {
+	var b strings.Builder
+	for i := 0; i < cells; i++ {
+		row := (i / 80) + 1
+		col := (i % 80) + 1
+		fmt.Fprintf(&b, "\x1b[%d;%dH\x1b[3%dm%c", row, col, i%8, byte('A'+(i%26)))
+	}
+	b.WriteString("\x1b[0m")
+	return []byte(b.String())
+}
+
+// BenchmarkViewportRenderer_PlainASCII drives a 16 KiB plain-text
+// chunk through Render once per iteration. Reports ns/op,
+// allocations, and B/op.
+func BenchmarkViewportRenderer_PlainASCII(b *testing.B) {
+	chunk := buildPlainASCIIChunk(16 * 1024)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(chunk)
+	}
+}
+
+// BenchmarkViewportRenderer_StyledLines exercises the per-byte CSI
+// path on SGR-heavy output. Most claude/codex chat resume traffic
+// looks like this — coloured prose with frequent style toggles.
+func BenchmarkViewportRenderer_StyledLines(b *testing.B) {
+	chunk := buildStyledLinesChunk(16 * 1024)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(chunk)
+	}
+}
+
+// BenchmarkViewportRenderer_RatatuiBurst measures the worst-case
+// cursor-shuffling workload: full-frame diff updates dominated by
+// CUP + SGR + single-char writes.
+func BenchmarkViewportRenderer_RatatuiBurst(b *testing.B) {
+	chunk := buildRatatuiBurst(80 * 24) // one screenful of cells
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(chunk)
+	}
+}
+
+// BenchmarkContainsOSC_NoOSC and BenchmarkContainsOSC_WithOSC measure
+// the OSC-gate fast path used by pumpChild before deciding whether to
+// fire the per-chunk Title() CGO call. Inputs:
+//   - NoOSC: SGR-styled output without OSC, the common case for
+//     codex/ratatui. We want this near zero.
+//   - WithOSC: the same output with an OSC title sequence appended.
+func BenchmarkContainsOSC_NoOSC(b *testing.B) {
+	chunk := buildStyledLinesChunk(8 * 1024)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		_ = containsOSC(chunk)
+	}
+}
+
+func BenchmarkContainsOSC_WithOSC(b *testing.B) {
+	chunk := append(buildStyledLinesChunk(8*1024), []byte("\x1b]0;new title\x07")...)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		_ = containsOSC(chunk)
+	}
+}
+
+// BenchmarkRendererThroughput_ReuseInstance approximates real
+// session behaviour: one viewport renderer is fed 16 chunks in
+// sequence, so construction cost is amortised across them. Reports
+// a throughput closer to the steady-state OnPTYOut path. Chunks are
+// 4 KiB to match typical PTY read sizes; a fresh renderer is
+// created for each benchmark iteration.
+func BenchmarkRendererThroughput_ReuseInstance(b *testing.B) {
+	chunks := make([][]byte, 16)
+	for i := range chunks {
+		chunks[i] = buildStyledLinesChunk(4 * 1024)
+	}
+	totalBytes := 0
+	for _, c := range chunks {
+		totalBytes += len(c)
+	}
+	b.SetBytes(int64(totalBytes))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		for _, c := range chunks {
+			_ = vr.Render(c)
+		}
+	}
+}
+
+// Stress workloads — these model the worst things a real session
+// can throw at us. The headline target is "ASCII video": every cell
+// of an 80x40 viewport carries an SGR colour change and a printable
+// character, rendered as one chunk per frame. Real ASCII-video CLIs
+// (ascii-image-converter, asciinema-render, towel.blinkenlights, the
+// Bad Apple meme) hit patterm with exactly this pattern at 24-30 fps
+// for minutes at a time.
+//
+// We synthesise the workload rather than ship a captured corpus so
+// the benchmarks stay deterministic and the repo doesn't carry tens
+// of MiB of fixture data. The encoding is faithful to what those
+// tools actually emit.
+
+// buildASCIIVideoFrame builds a single full-viewport frame with
+// 8-colour SGR per cell (`\x1b[3Nm`). One frame ≈ 19 KiB for an
+// 80x40 viewport, which lines up with what ascii-video tools emit.
+func buildASCIIVideoFrame(cols, rows int) []byte {
+	var b strings.Builder
+	b.WriteString("\x1b[H") // home cursor before the frame starts
+	for r := 0; r < rows; r++ {
+		for c := 0; c < cols; c++ {
+			fmt.Fprintf(&b, "\x1b[3%dm%c", (r+c)%8, byte(' '+(r*c)%(0x7e-' ')))
+		}
+		b.WriteString("\x1b[0m\r\n")
+	}
+	return []byte(b.String())
+}
+
+// buildASCIIVideoFrameTrueColor builds the same frame but with
+// 24-bit RGB SGR (`\x1b[38;2;R;G;Bm`). Every cell is up to 19 bytes
+// of escape + 1 byte glyph, so a frame is roughly 60 KiB. This is what
+// chafa --colors=full and modern terminal video players emit, and
+// it's the heaviest SGR variant the renderer's CSI path sees.
+func buildASCIIVideoFrameTrueColor(cols, rows int) []byte {
+	var b strings.Builder
+	b.WriteString("\x1b[H")
+	for r := 0; r < rows; r++ {
+		for c := 0; c < cols; c++ {
+			rd := (r * 7) % 256
+			gd := (c * 11) % 256
+			bd := ((r + c) * 13) % 256
+			fmt.Fprintf(&b, "\x1b[38;2;%d;%d;%dm%c", rd, gd, bd, byte(' '+(r*c)%(0x7e-' ')))
+		}
+		b.WriteString("\x1b[0m\r\n")
+	}
+	return []byte(b.String())
+}
+
+// buildBadApplePattern builds the simplest possible ASCII video
+// frame: alternating black/white cells (the Bad Apple meme is
+// essentially a 1-bit silhouette video). This is the pattern that
+// stresses the SGR state-machine without exercising truecolor parse
+// — useful for isolating "is the cost in the colour parsing or in
+// the cell-by-cell switching?"
+func buildBadApplePattern(cols, rows int) []byte {
+	var b strings.Builder
+	b.WriteString("\x1b[H")
+	for r := 0; r < rows; r++ {
+		for c := 0; c < cols; c++ {
+			if (r+c)%2 == 0 {
+				b.WriteString("\x1b[37m█")
+			} else {
+				b.WriteString("\x1b[30m█")
+			}
+		}
+		b.WriteString("\x1b[0m\r\n")
+	}
+	return []byte(b.String())
+}
+
+// BenchmarkASCIIVideo_Frame_8Color renders a single full-screen
+// frame as one chunk. The headline number is MB/s. At 30 fps a
+// frame is one PTY chunk every ~33 ms, so one Render call should
+// stay well under 1 ms.
+func BenchmarkASCIIVideo_Frame_8Color(b *testing.B) {
+	frame := buildASCIIVideoFrame(80, 40)
+	b.SetBytes(int64(len(frame)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(frame)
+	}
+}
+
+// BenchmarkASCIIVideo_Frame_TrueColor renders a single truecolor
+// frame, roughly 60 KiB. Compare this to the 8-colour number to
+// see how much extra cost the truecolor SGR parse imposes — the
+// `\x1b[38;2;R;G;Bm` form is the longest and most parameter-rich
+// CSI patterm sees in practice.
+func BenchmarkASCIIVideo_Frame_TrueColor(b *testing.B) {
+	frame := buildASCIIVideoFrameTrueColor(80, 40)
+	b.SetBytes(int64(len(frame)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(frame)
+	}
+}
+
+// BenchmarkASCIIVideo_Frame_BadApple is the 1-bit pattern: simplest
+// SGR (two colours, alternating). Isolates the renderer's cell-by-
+// cell SGR cycling cost from the truecolor parse cost.
+func BenchmarkASCIIVideo_Frame_BadApple(b *testing.B) {
+	frame := buildBadApplePattern(80, 40)
+	b.SetBytes(int64(len(frame)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(frame)
+	}
+}
+
+// runStreamBench is the shared body for the per-fps stream
+// benchmarks. It feeds a fixed frame N times through a single
+// renderer instance and reports µs/frame + an achievable-fps
+// ceiling alongside the standard ns/op + MB/s. The fps value in
+// the benchmark name is the *target* — the workload itself doesn't
+// rate-limit; we just decide how many frames make a benchmark op
+// (3 seconds' worth) so steady-state cost dominates warm-up.
+func runStreamBench(b *testing.B, frame []byte, fps int) {
+	frames := fps * 3 // 3 seconds at the target rate
+	totalBytes := int64(len(frame) * frames)
+	b.SetBytes(totalBytes)
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		for f := 0; f < frames; f++ {
+			_ = vr.Render(frame)
+		}
+	}
+	nsPerFrame := float64(b.Elapsed().Nanoseconds()) / float64(b.N*frames)
+	b.ReportMetric(nsPerFrame/1000.0, "µs/frame")
+	b.ReportMetric(1e9/nsPerFrame, "fps_ceiling")
+	// budget_pct = how much of the per-frame budget at the target
+	// rate we burn. Under 100 means we can hit the target; over
+	// means we can't.
+	budgetNs := 1e9 / float64(fps)
+	b.ReportMetric(nsPerFrame/budgetNs*100, "budget_pct")
+}
+
+// BenchmarkASCIIVideo_Stream_8Color_30fps / _60fps / _120fps reuse
+// one renderer across (3 × fps) frames. The headline numbers are
+// µs/frame, fps_ceiling (= 1e9 / ns/frame), and budget_pct (=
+// percent of the per-frame budget at the target rate we consume).
+//
+// 30 fps is the typical ASCII-video baseline (towel, chafa, Bad
+// Apple ports). 60 is the "smooth playback" target. 120 is a
+// future-proofing stress level matching modern high-refresh
+// terminals.
+func BenchmarkASCIIVideo_Stream_8Color_30fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrame(80, 40), 30)
+}
+func BenchmarkASCIIVideo_Stream_8Color_60fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrame(80, 40), 60)
+}
+func BenchmarkASCIIVideo_Stream_8Color_120fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrame(80, 40), 120)
+}
+
+// BenchmarkASCIIVideo_Stream_TrueColor_* is the same set with the
+// truecolor frames. Compare against the 8-colour numbers to see
+// what the longer `\x1b[38;2;R;G;Bm` parse costs us.
+func BenchmarkASCIIVideo_Stream_TrueColor_30fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 30)
+}
+func BenchmarkASCIIVideo_Stream_TrueColor_60fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 60)
+}
+func BenchmarkASCIIVideo_Stream_TrueColor_120fps(b *testing.B) {
+	runStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 120)
+}
+
+// BenchmarkASCIIVideo_Stream_BadApple_* tracks the 1-bit alternating
+// pattern. Isolates per-cell SGR cycling cost from the truecolor
+// parse cost above — useful when reading the diff between the two
+// stream variants.
+func BenchmarkASCIIVideo_Stream_BadApple_30fps(b *testing.B) {
+	runStreamBench(b, buildBadApplePattern(80, 40), 30)
+}
+func BenchmarkASCIIVideo_Stream_BadApple_60fps(b *testing.B) {
+	runStreamBench(b, buildBadApplePattern(80, 40), 60)
+}
+func BenchmarkASCIIVideo_Stream_BadApple_120fps(b *testing.B) {
+	runStreamBench(b, buildBadApplePattern(80, 40), 120)
+}
+
+// BenchmarkEmulator_Write_8Color_Frame / _TrueColor_Frame isolate
+// the libghostty-vt CGO cost: same frames the Pipeline benchmarks
+// use, but feeding only the emulator. The delta between
+// BenchmarkPipeline_ASCIIVideo_… and the emulator-only numbers is
+// roughly the renderer's share; the rest is libghostty-vt.
+func BenchmarkEmulator_Write_8Color_Frame(b *testing.B) {
+	frame := buildASCIIVideoFrame(80, 40)
+	b.SetBytes(int64(len(frame)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		em, err := vt.NewGhosttyEmulator(80, 40)
+		if err != nil {
+			b.Fatalf("emulator: %v", err)
+		}
+		if _, werr := em.Write(frame); werr != nil {
+			b.Fatalf("emulator.Write: %v", werr)
+		}
+		_ = em.Close()
+	}
+}
+
+func BenchmarkEmulator_Write_TrueColor_Frame(b *testing.B) {
+	frame := buildASCIIVideoFrameTrueColor(80, 40)
+	b.SetBytes(int64(len(frame)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		em, err := vt.NewGhosttyEmulator(80, 40)
+		if err != nil {
+			b.Fatalf("emulator: %v", err)
+		}
+		if _, werr := em.Write(frame); werr != nil {
+			b.Fatalf("emulator.Write: %v", werr)
+		}
+		_ = em.Close()
+	}
+}
+
+// BenchmarkEmulator_Write_Stream_8Color_120fps / _TrueColor_120fps
+// reuse one emulator across 360 frames (3 sec × 120 fps). This is
+// the cleanest measurement of em.Write steady-state cost.
+func BenchmarkEmulator_Write_Stream_8Color_120fps(b *testing.B) {
+	frame := buildASCIIVideoFrame(80, 40)
+	const frames = 360
+	b.SetBytes(int64(len(frame) * frames))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		em, err := vt.NewGhosttyEmulator(80, 40)
+		if err != nil {
+			b.Fatalf("emulator: %v", err)
+		}
+		for f := 0; f < frames; f++ {
+			if _, werr := em.Write(frame); werr != nil {
+				b.Fatalf("emulator.Write: %v", werr)
+			}
+		}
+		_ = em.Close()
+	}
+	nsPerFrame := float64(b.Elapsed().Nanoseconds()) / float64(b.N*frames)
+	b.ReportMetric(nsPerFrame/1000.0, "µs/frame")
+	b.ReportMetric(1e9/nsPerFrame, "fps_ceiling")
+}
+
+func BenchmarkEmulator_Write_Stream_TrueColor_120fps(b *testing.B) {
+	frame := buildASCIIVideoFrameTrueColor(80, 40)
+	const frames = 360
+	b.SetBytes(int64(len(frame) * frames))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		em, err := vt.NewGhosttyEmulator(80, 40)
+		if err != nil {
+			b.Fatalf("emulator: %v", err)
+		}
+		for f := 0; f < frames; f++ {
+			if _, werr := em.Write(frame); werr != nil {
+				b.Fatalf("emulator.Write: %v", werr)
+			}
+		}
+		_ = em.Close()
+	}
+	nsPerFrame := float64(b.Elapsed().Nanoseconds()) / float64(b.N*frames)
+	b.ReportMetric(nsPerFrame/1000.0, "µs/frame")
+	b.ReportMetric(1e9/nsPerFrame, "fps_ceiling")
+}
+
+// runPipelineStreamBench includes the libghostty-vt emulator.Write
+// CGO call and a stdout write to io.Discard alongside the renderer
+// — i.e. everything OnPTYOut does in production except the host
+// terminal's own paint time (which patterm doesn't control). This
+// is the honest "can we hit N fps end-to-end?" measurement.
+func runPipelineStreamBench(b *testing.B, frame []byte, fps int) {
+	frames := fps * 3
+	totalBytes := int64(len(frame) * frames)
+	b.SetBytes(totalBytes)
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		em, err := vt.NewGhosttyEmulator(80, 40)
+		if err != nil {
+			b.Fatalf("emulator: %v", err)
+		}
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		for f := 0; f < frames; f++ {
+			if _, werr := em.Write(frame); werr != nil {
+				b.Fatalf("emulator.Write: %v", werr)
+			}
+			out := vr.Render(frame)
+			// Match OnPTYOut's autowrap prelude/postlude wrapping so
+			// the byte count is faithful.
+			_, _ = io.Discard.Write([]byte("\x1b[?7l"))
+			_, _ = io.Discard.Write(out)
+			_, _ = io.Discard.Write([]byte("\x1b[?7h"))
+		}
+		_ = em.Close()
+	}
+	nsPerFrame := float64(b.Elapsed().Nanoseconds()) / float64(b.N*frames)
+	b.ReportMetric(nsPerFrame/1000.0, "µs/frame")
+	b.ReportMetric(1e9/nsPerFrame, "fps_ceiling")
+	budgetNs := 1e9 / float64(fps)
+	b.ReportMetric(nsPerFrame/budgetNs*100, "budget_pct")
+}
+
+// BenchmarkPipeline_ASCIIVideo_* — the FULL OnPTYOut path
+// (emulator.Write CGO + viewport renderer + a stdout write to
+// io.Discard) running at 30/60/120 fps targets. These are the
+// numbers to trust when asking "can we sustain N fps?" The
+// renderer-only Stream benchmarks above isolate one stage and
+// understate the real cost.
+//
+// 120 fps is the explicit baseline: anything under 100% of the
+// per-frame budget here means we hit 120 fps with margin to spare.
+func BenchmarkPipeline_ASCIIVideo_8Color_30fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrame(80, 40), 30)
+}
+func BenchmarkPipeline_ASCIIVideo_8Color_60fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrame(80, 40), 60)
+}
+func BenchmarkPipeline_ASCIIVideo_8Color_120fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrame(80, 40), 120)
+}
+
+func BenchmarkPipeline_ASCIIVideo_TrueColor_30fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 30)
+}
+func BenchmarkPipeline_ASCIIVideo_TrueColor_60fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 60)
+}
+func BenchmarkPipeline_ASCIIVideo_TrueColor_120fps(b *testing.B) {
+	runPipelineStreamBench(b, buildASCIIVideoFrameTrueColor(80, 40), 120)
+}
+
+// BenchmarkSessionResume_5MiBStyled simulates the user's
+// motivating case: claude resuming a long chat session and dumping
+// the whole history. 5 MiB of styled output as a single Render
+// call. Numbers here tell us how long the visible "scrolling
+// while resume loads" window will be.
+func BenchmarkSessionResume_5MiBStyled(b *testing.B) {
+	chunk := buildStyledLinesChunk(5 * 1024 * 1024)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(chunk)
+	}
+}
+
+// BenchmarkSessionResume_5MiBPlain is the same as above but with
+// plain text. Lower bound: what we'd hit if the resume content were
+// styling-free.
+func BenchmarkSessionResume_5MiBPlain(b *testing.B) {
+	chunk := buildPlainASCIIChunk(5 * 1024 * 1024)
+	b.SetBytes(int64(len(chunk)))
+	b.ReportAllocs()
+	b.ResetTimer()
+	for i := 0; i < b.N; i++ {
+		vr := newViewportRenderer(newTerminalLayout(120, 40))
+		_ = vr.Render(chunk)
+	}
+}
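
The custom metrics reported by runStreamBench and runPipelineStreamBench are plain arithmetic on the measured nanoseconds per frame. The sketch below restates that arithmetic as a standalone program. `frameStats` and the 2 ms sample cost are hypothetical, chosen only for illustration; the diff computes these values inline from `b.Elapsed()`.

```go
package main

import "fmt"

// frameStats restates the metric arithmetic from runStreamBench.
// The function name is illustrative; it does not exist in the diff.
func frameStats(nsPerFrame float64, fps int) (usPerFrame, fpsCeiling, budgetPct float64) {
	budgetNs := 1e9 / float64(fps) // per-frame time budget at the target rate
	usPerFrame = nsPerFrame / 1000.0
	fpsCeiling = 1e9 / nsPerFrame
	budgetPct = nsPerFrame / budgetNs * 100
	return usPerFrame, fpsCeiling, budgetPct
}

func main() {
	// Hypothetical measurement: 2 ms per frame against a 120 fps target.
	us, ceiling, pct := frameStats(2_000_000, 120)
	fmt.Printf("%.1f µs/frame, %.1f fps ceiling, %.1f%% of budget\n", us, ceiling, pct)
}
```

A budget_pct under 100 means the measured cost fits inside the target rate's frame time; with the sample numbers above the share works out to about 24%.
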
--- a/internal/app/debug.go
+++ b/internal/app/debug.go
@@ -0,0 +1,155 @@
+package app
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"path/filepath"
+	"sync"
+	"time"
+)
+
+// debugCapture implements ChildEventListener and writes structured
+// debug artefacts under a single directory:
+//
+//   - patterm.log:  the existing logf() stream
+//   - events.jsonl: one JSON object per lifecycle event
+//   - <id>.raw:     raw PTY bytes, one file per child id
+//
+// The capture is installed only when --debug=<dir> is set, so default
+// runs pay nothing.
+type debugCapture struct {
+	dir     string
+	logPath string
+
+	mu      sync.Mutex
+	events  *os.File
+	rawByID map[string]*os.File
+}
+
+func openDebugCapture(dir string) (*debugCapture, error) {
+	if err := os.MkdirAll(dir, 0o700); err != nil {
+		return nil, err
+	}
+	logPath := filepath.Join(dir, "patterm.log")
+	// Truncate-style fresh log per run is friendlier for grep'ing one
+	// session. The existing logf opens O_APPEND though, so concurrent
+	// runs against the same dir would interleave — that's on the user.
+	f, err := os.Create(logPath)
+	if err != nil {
+		return nil, err
+	}
+	_ = f.Close()
+	ev, err := os.Create(filepath.Join(dir, "events.jsonl"))
+	if err != nil {
+		return nil, err
+	}
+	dc := &debugCapture{
+		dir:     dir,
+		logPath: logPath,
+		events:  ev,
+		rawByID: make(map[string]*os.File),
+	}
+	dc.writeEvent("session_start", map[string]any{
+		"time": time.Now().Format(time.RFC3339Nano),
+		"pid":  os.Getpid(),
+	})
+	return dc, nil
+}
+
+func (d *debugCapture) LogPath() string { return d.logPath }
+
+func (d *debugCapture) Close() error {
+	d.mu.Lock()
+	defer d.mu.Unlock()
+	d.writeEventLocked("session_end", map[string]any{
+		"time": time.Now().Format(time.RFC3339Nano),
+	})
+	for _, f := range d.rawByID {
+		_ = f.Close()
+	}
+	d.rawByID = nil
+	if d.events != nil {
+		_ = d.events.Close()
+		d.events = nil
+	}
+	return nil
+}
+
+func (d *debugCapture) OnChildSpawned(c *Child) {
+	d.writeEvent("child_spawned", map[string]any{
+		"time":      time.Now().Format(time.RFC3339Nano),
+		"id":        c.ID,
+		"name":      c.Name,
+		"kind":      string(c.Kind),
+		"parent_id": c.ParentID,
+		"preset":    c.PresetRef,
+		"argv":      c.Argv,
+	})
+}
+
+func (d *debugCapture) OnChildExited(c *Child) {
+	d.writeEvent("child_exited", map[string]any{
+		"time":      time.Now().Format(time.RFC3339Nano),
+		"id":        c.ID,
+		"name":      c.Name,
+		"exit_code": c.ExitCode(),
+	})
+	d.mu.Lock()
+	defer d.mu.Unlock()
+	if f, ok := d.rawByID[c.ID]; ok {
+		_ = f.Close()
+		delete(d.rawByID, c.ID)
+	}
+}
+
+func (d *debugCapture) OnChildStateChanged(id string, state IdleState) {
+	d.writeEvent("child_state", map[string]any{
+		"time":  time.Now().Format(time.RFC3339Nano),
+		"id":    id,
+		"state": string(state),
+	})
+}
+
+func (d *debugCapture) OnPTYOut(childID string, chunk []byte) {
+	d.mu.Lock()
+	defer d.mu.Unlock()
+	if len(chunk) == 0 || d.rawByID == nil { // nil after Close: drop late chunks
+		return
+	}
+	f, ok := d.rawByID[childID]
+	if !ok {
+		path := filepath.Join(d.dir, childID+".raw")
+		nf, err := os.Create(path)
+		if err != nil {
+			return
+		}
+		f = nf
+		d.rawByID[childID] = nf
+	}
+	// Listener contract: don't retain chunk past return. Writing now
+	// is fine; the slice's backing buffer is reused for the next read
+	// only after this listener chain completes.
+	_, _ = f.Write(chunk)
+}
+
+func (d *debugCapture) writeEvent(kind string, fields map[string]any) {
+	d.mu.Lock()
+	defer d.mu.Unlock()
+	d.writeEventLocked(kind, fields)
+}
+
+func (d *debugCapture) writeEventLocked(kind string, fields map[string]any) {
+	if d.events == nil {
+		return
+	}
+	if fields == nil {
+		fields = map[string]any{}
+	}
+	fields["event"] = kind
+	enc, err := json.Marshal(fields)
+	if err != nil {
+		return
+	}
+	_, _ = fmt.Fprintln(d.events, string(enc))
+}
--- a/internal/app/host.go
+++ b/internal/app/host.go
@@ -1111,7 +1111,7 @@ func helpFor(topic string) mcp.HelpResponse {
 	case "spawning":
 		return mcp.HelpResponse{
 			Topic:        "spawning",
-			Content:      "spawn_agent launches another vendor LLM CLI as a sub-agent (orchestrator only). spawn_process(kind: command, preset: …) starts a stored command; spawn_process(kind: terminal) opens a shell. Command presets need trust the first time — you'll get needs_trust until the human accepts. Whatever you spawn is yours to clean up — see help('lifecycle').",
+			Content:      "spawn_agent launches another vendor LLM CLI as a sub-agent (orchestrator only). spawn_process(kind: command, preset: …) starts a stored command; spawn_process(kind: terminal) opens a shell. Command presets need trust the first time — you'll get needs_trust until the human accepts. ANTI-PATTERNS: do not shell out to `claude` / `codex` / `opencode` (or any other agent CLI) yourself, and do not pipe JSON-RPC into patterm's Unix socket via perl / nc / socat / curl. Either path bypasses caller-identity and the new agent reads back as a stray top-level tab instead of your child — call spawn_agent through the MCP transport you were initialised on. Whatever you spawn is yours to clean up — see help('lifecycle').",
 			RelatedTools: []string{"spawn_agent", "spawn_process", "start_process", "restart_process", "close_process"},
 		}
 	case "lifecycle":
@@ -1134,9 +1134,10 @@ func helpFor(topic string) mcp.HelpResponse {
 		}
 	case "coordination":
 		return mcp.HelpResponse{
-			Topic:        "coordination",
-			Content:      "send_message tags the message with the caller's role (parent → [orchestrator], child → [sub-agent:<name>]). Siblings must route through their parent. request_human_attention raises a UI notification when you can't safely decide.",
-			RelatedTools: []string{"send_message", "request_human_attention"},
+			Topic: "coordination",
+			Content: "send_message tags the message with the caller's role (parent → [orchestrator], child → [sub-agent:<name>]). Siblings must route through their parent. request_human_attention raises a UI notification when you can't safely decide.\n\n" +
+				"Reply routing: a sub-agent's reply to your send_message lands in YOUR pane tagged `[sub-agent:<name>]`, not in the sub-agent's output. Anti-pattern: `wait_for_pattern(sub_agent, …)` to wait for a reply — the sub-agent is already idle, its output won't change, and the call spins to timeout. Pattern: send_message → timer_fire_when_idle_any([sub_agent_id], body=\"[system] sub-agent finished\") → when the timer fires, the reply is already queued as your next user turn (or visible via get_process_output on your own pane).",
+			RelatedTools: []string{"send_message", "request_human_attention", "timer_fire_when_idle_any", "timer_fire_when_idle_all"},
 		}
 	case "scratchpads":
 		return mcp.HelpResponse{
@@ -1161,9 +1162,14 @@ func helpFor(topic string) mcp.HelpResponse {
 		}
 	case "readiness":
 		return mcp.HelpResponse{
-			Topic:        "readiness",
-			Content:      "A pane is 'idle' once nothing has been written to its PTY for ~1s (SPEC §11). Treat idle as a signal to read, not a guarantee of completion. wait_for_pattern lets you wait on a known terminal marker for stronger evidence.",
-			RelatedTools: []string{"wait_for_pattern", "get_process_status"},
+			Topic: "readiness",
+			Content: "A pane is 'idle' once nothing has been written to its PTY for ~1s (SPEC §11). Treat idle as a signal to read, not a guarantee of completion.\n\n" +
+				"Waiting for a sub-agent's reply (canonical pattern):\n" +
+				"  1. send_message(sub_agent_id, request)\n" +
+				"  2. timer_fire_when_idle_any(watched=[sub_agent_id], body=\"[system] sub-agent done\")\n" +
+				"  3. When the timer fires you re-enter as a fresh user turn; the sub-agent's reply is already in your own pane tagged `[sub-agent:<name>]` (read via get_process_output on yourself if you need it explicitly).\n\n" +
+				"wait_for_pattern is for waiting on text a process emits in its OWN output (a shell prompt, a build's \"tests passed\" line). It does NOT see send_message replies, because those land in the caller's pane, not the target's — calling wait_for_pattern on a sub-agent to wait for its reply deadlocks until timeout.",
+			RelatedTools: []string{"wait_for_pattern", "get_process_status", "timer_fire_when_idle_any", "send_message"},
 		}
 	case "permissions":
 		return mcp.HelpResponse{
--- a/internal/app/layout_test.go
+++ b/internal/app/layout_test.go
@@ -14,10 +14,10 @@ func TestTerminalLayoutWideUsesMainViewport(t *testing.T) {
 	if l.childCols() != 91 {
 		t.Fatalf("child cols: got %d want 91", l.childCols())
 	}
-	if l.childRows() != 37 {
-		t.Fatalf("child rows: got %d want 37", l.childRows())
+	if l.childRows() != 36 {
+		t.Fatalf("child rows: got %d want 36", l.childRows())
 	}
-	if l.mainTop != 3 || l.statusRow != 40 {
+	if l.mainTop != 4 || l.statusRow != 40 {
 		t.Fatalf("unexpected vertical chrome: mainTop=%d statusRow=%d", l.mainTop, l.statusRow)
 	}
 }
@@ -30,8 +30,8 @@ func TestTerminalLayoutNarrowHidesSidebar(t *testing.T) {
 	if l.childCols() != 38 {
 		t.Fatalf("child cols: got %d want 38", l.childCols())
 	}
-	if l.childRows() != 9 {
-		t.Fatalf("child rows: got %d want 9", l.childRows())
+	if l.childRows() != 8 {
+		t.Fatalf("child rows: got %d want 8", l.childRows())
 	}
 }

@@ -46,13 +46,13 @@ func TestSpawnSizingUsesViewportDimensions(t *testing.T) {
 	l := newTerminalLayout(120, 40)
 	launcher := NewLauncher(nil, "", l.childCols(), l.childRows())
 	cols, rows := launcher.size()
-	if cols != 91 || rows != 37 {
-		t.Fatalf("launcher size: got %dx%d want 91x37", cols, rows)
+	if cols != 91 || rows != 36 {
+		t.Fatalf("launcher size: got %dx%d want 91x36", cols, rows)
 	}

 	host := newToolHost(nil, nil, nil, preset.Set{}, nil, l.childCols(), l.childRows())
 	cols, rows = host.size()
-	if cols != 91 || rows != 37 {
-		t.Fatalf("tool host size: got %dx%d want 91x37", cols, rows)
+	if cols != 91 || rows != 36 {
+		t.Fatalf("tool host size: got %dx%d want 91x36", cols, rows)
 	}
 }
--- a/internal/app/marquee.go
+++ b/internal/app/marquee.go
@@ -0,0 +1,123 @@
+package app
+
+import (
+	"sync"
+	"time"
+)
+
+// Phase ordering of the marquee state machine: hold the head, scroll
+// one cell per marqueeStep until the tail is visible, hold the tail,
+// snap back to the head.
+const (
+	phaseHoldStart = iota
+	phaseScroll
+	phaseHoldEnd
+)
+
+const (
+	marqueeHoldStart = time.Second
+	marqueeStep      = 150 * time.Millisecond
+	marqueeHoldEnd   = time.Second
+)
+
+// marqueeState drives the focused sidebar row's pause-scroll-pause
+// animation. State is wall-clock anchored (since), not tick-count
+// anchored, so a late or missed tick cannot make the animation
+// drift: each step() derives the offset from elapsed time.
+type marqueeState struct {
+	mu      sync.Mutex
+	id      string
+	nameLen int
+	budget  int
+	state   int
+	offset  int
+	since   time.Time
+}
+
+// step advances the state machine for the row identified by id with
+// the given visible name length (in runes) and column budget. It
+// returns the current scroll offset, whether the row is animating
+// (i.e. nameLen > budget), and how long until the next visual change.
+//
+// When id, nameLen, or budget changes, the state machine resets to
+// phaseHoldStart with offset 0 anchored at now.
+func (m *marqueeState) step(id string, nameLen, budget int, now time.Time) (offset int, animating bool, nextWake time.Duration) {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+
+	if id != m.id || nameLen != m.nameLen || budget != m.budget {
+		m.id = id
+		m.nameLen = nameLen
+		m.budget = budget
+		m.state = phaseHoldStart
+		m.offset = 0
+		m.since = now
+	}
+
+	if nameLen <= budget || budget <= 0 {
+		return 0, false, 0
+	}
+
+	maxOffset := nameLen - budget
+
+	for {
+		elapsed := now.Sub(m.since)
+		switch m.state {
+		case phaseHoldStart:
+			if elapsed < marqueeHoldStart {
+				return 0, true, marqueeHoldStart - elapsed
+			}
+			m.state = phaseScroll
+			m.since = m.since.Add(marqueeHoldStart)
+			continue
+		case phaseScroll:
+			steps := int(elapsed / marqueeStep)
+			if steps >= maxOffset {
+				m.offset = maxOffset
+				m.state = phaseHoldEnd
+				m.since = m.since.Add(time.Duration(maxOffset) * marqueeStep)
+				continue
+			}
+			m.offset = steps
+			rem := marqueeStep - (elapsed % marqueeStep)
+			return m.offset, true, rem
+		case phaseHoldEnd:
+			if elapsed < marqueeHoldEnd {
+				return maxOffset, true, marqueeHoldEnd - elapsed
+			}
+			m.state = phaseHoldStart
+			m.offset = 0
+			m.since = m.since.Add(marqueeHoldEnd)
+			continue
+		default:
+			m.state = phaseHoldStart
+			m.offset = 0
+			m.since = now
+			return 0, true, marqueeHoldStart
+		}
+	}
+}
+
+// active reports whether the marquee currently has an overflowing row
+// to animate. The marquee ticker goroutine uses this to gate dirty
+// flag flips so an idle sidebar costs nothing.
+func (m *marqueeState) active() bool {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	return m.id != "" && m.nameLen > m.budget && m.budget > 0
+}
+
+// reset clears all state, forcing the next step() call to start a
+// fresh phaseHoldStart. Call this when focus changes so the newly
+// focused row begins with a full head-hold instead of inheriting
+// whatever phase the previous focus was in.
+func (m *marqueeState) reset() {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	m.id = ""
+	m.nameLen = 0
+	m.budget = 0
+	m.state = phaseHoldStart
+	m.offset = 0
+	m.since = time.Time{}
+}
--- a/internal/app/marquee_test.go
+++ b/internal/app/marquee_test.go
@@ -0,0 +1,161 @@
+package app
+
+import (
+	"testing"
+	"time"
+)
+
+func TestMarqueeStepFits(t *testing.T) {
+	var m marqueeState
+	now := time.Unix(0, 0)
+	off, animating, _ := m.step("a", 5, 10, now)
+	if animating {
+		t.Fatalf("expected no animation when name fits in budget")
+	}
+	if off != 0 {
+		t.Fatalf("expected offset 0, got %d", off)
+	}
+}
+
+func TestMarqueePhaseProgression(t *testing.T) {
+	var m marqueeState
+	// name 10 runes, budget 5 → maxOffset = 5.
+	const nameLen, budget = 10, 5
+	t0 := time.Unix(0, 0)
+
+	// At t0: phaseHoldStart, offset 0, animating.
+	off, anim, wake := m.step("row", nameLen, budget, t0)
+	if off != 0 || !anim || wake != marqueeHoldStart {
+		t.Fatalf("t0: off=%d anim=%v wake=%v", off, anim, wake)
+	}
+
+	// Just before hold expires: still offset 0.
+	off, anim, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart-time.Millisecond))
+	if off != 0 || !anim {
+		t.Fatalf("pre-expiry hold: off=%d anim=%v", off, anim)
+	}
+
+	// At hold expiry + 1 step: should have transitioned to scroll, offset 1.
+	off, anim, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart+marqueeStep))
+	if !anim || off != 1 {
+		t.Fatalf("first scroll step: off=%d anim=%v", off, anim)
+	}
+
+	// Mid-scroll: offset == 3.
+	off, _, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart+3*marqueeStep))
+	if off != 3 {
+		t.Fatalf("mid scroll: off=%d", off)
+	}
+
+	// Tail reached: offset == maxOffset == 5.
+	off, _, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart+5*marqueeStep+time.Millisecond))
+	if off != 5 {
+		t.Fatalf("tail: off=%d", off)
+	}
+
+	// Hold-end window still pegged at maxOffset.
+	off, _, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart+5*marqueeStep+marqueeHoldEnd/2))
+	if off != 5 {
+		t.Fatalf("hold-end mid: off=%d", off)
+	}
+
+	// After hold-end: snap back to offset 0.
+	off, _, _ = m.step("row", nameLen, budget, t0.Add(marqueeHoldStart+5*marqueeStep+marqueeHoldEnd+time.Millisecond))
+	if off != 0 {
+		t.Fatalf("snap back: off=%d", off)
+	}
+}
+
+func TestMarqueeIDChangeResets(t *testing.T) {
+	var m marqueeState
+	t0 := time.Unix(0, 0)
+	_, _, _ = m.step("a", 10, 5, t0)
+	// Advance well into scroll for row "a".
+	_, _, _ = m.step("a", 10, 5, t0.Add(marqueeHoldStart+3*marqueeStep))
+	// Now focus moves to "b": offset must reset to 0 and phase to hold-start.
+	off, anim, wake := m.step("b", 10, 5, t0.Add(marqueeHoldStart+3*marqueeStep))
+	if off != 0 || !anim || wake != marqueeHoldStart {
+		t.Fatalf("id reset: off=%d anim=%v wake=%v", off, anim, wake)
+	}
+}
+
+func TestMarqueeActive(t *testing.T) {
+	var m marqueeState
+	if m.active() {
+		t.Fatalf("fresh marquee should not be active")
+	}
+	_, _, _ = m.step("row", 10, 5, time.Unix(0, 0))
+	if !m.active() {
+		t.Fatalf("expected active after overflow step")
+	}
+	_, _, _ = m.step("row", 4, 5, time.Unix(0, 0))
+	if m.active() {
+		t.Fatalf("should not be active when name fits")
+	}
+}
+
+func TestMarqueeReset(t *testing.T) {
+	var m marqueeState
+	_, _, _ = m.step("row", 10, 5, time.Unix(0, 0))
+	m.reset()
+	if m.active() {
+		t.Fatalf("expected inactive after reset")
+	}
+	// After reset, stepping the same id starts fresh.
+	off, _, wake := m.step("row", 10, 5, time.Unix(5, 0))
+	if off != 0 || wake != marqueeHoldStart {
+		t.Fatalf("post-reset start: off=%d wake=%v", off, wake)
+	}
+}
+
+func TestFitName(t *testing.T) {
+	cases := []struct {
+		name, in string
+		budget   int
+		want     string
+	}{
+		{"fits", "abc", 5, "abc"},
+		{"exact", "abcde", 5, "abcde"},
+		{"truncate", "abcdef", 5, "abcd…"},
+		{"budget1", "abcdef", 1, "…"},
+		{"budget0", "abc", 0, ""},
+		{"unicode", "αβγδεζη", 4, "αβγ…"},
+	}
+	for _, c := range cases {
+		t.Run(c.name, func(t *testing.T) {
+			got := fitName(c.in, c.budget)
+			if got != c.want {
+				t.Fatalf("fitName(%q, %d) = %q want %q", c.in, c.budget, got, c.want)
+			}
+		})
+	}
+}
+
+func TestMarqueeWindow(t *testing.T) {
+	got := marqueeWindow("abcdefgh", 4, 2)
+	if got != "cdef" {
+		t.Fatalf("window = %q", got)
+	}
+	// Clamp end-of-string overflow.
+	got = marqueeWindow("abcdef", 4, 10)
+	if got != "cdef" {
+		t.Fatalf("clamped window = %q", got)
+	}
+}
+
+func TestClampVisible(t *testing.T) {
+	// Plain string longer than width.
+	if got := clampVisible("abcdef", 3); visibleLen(got) != 3 {
+		t.Fatalf("plain clamp visible = %d (%q)", visibleLen(got), got)
+	}
+	// Already-fitting string is unchanged.
+	if got := clampVisible("abc", 5); got != "abc" {
+		t.Fatalf("unchanged = %q", got)
+	}
+	// SGR-wrapped string: visible portion must be clamped to exactly width.
+	in := "\x1b[1mhello\x1b[0m world"
+	got := clampVisible(in, 5)
+	if visibleLen(got) != 5 {
+		t.Fatalf("sgr clamp visible = %d (%q)", visibleLen(got), got)
+	}
+}
--- a/internal/app/metrics.go
+++ b/internal/app/metrics.go
@@ -0,0 +1,462 @@
+package app
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"os"
+	"path/filepath"
+	"sync/atomic"
+	"time"
+)
+
+// metricsTracker collects per-hot-path counters and timings. Every
+// counter is atomic so callers can record from the per-PTY-chunk path
+// without taking a lock. Enabled only when --profile is set.
+//
+// Sampled rates ("X per second", "p99 latency") are not tracked here
+// directly — the snapshotter goroutine writes a row to metrics.jsonl
+// every second, and analysis tools compute rates from the deltas.
+// Aggregate totals are written to metrics.json on shutdown.
+type metricsTracker struct {
+	startedAt time.Time
+
+	// PTY chunk arrival → stdout write pipeline (per OnPTYOut call).
+	ptyChunks     atomic.Int64
+	ptyBytes      atomic.Int64
+	onPTYOutNs    atomic.Int64
+	onPTYOutMaxNs atomic.Int64
+	onPTYOutDrops atomic.Int64 // chunks for non-focused children (fast-path returns)
+	stdoutWrites  atomic.Int64
+	stdoutBytes   atomic.Int64
+	stdoutNs      atomic.Int64
+	stdoutMaxNs   atomic.Int64
+
+	// Viewport renderer (state-machine over child PTY bytes).
+	renderCalls atomic.Int64
+	renderNs    atomic.Int64
+	renderMaxNs atomic.Int64
+
+	// CGO into libghostty-vt (counted from pumpChild).
+	emuWriteCalls atomic.Int64
+	emuWriteNs    atomic.Int64
+	emuWriteMaxNs atomic.Int64
+	emuTitleCalls atomic.Int64
+	emuTitleNs    atomic.Int64
+	emuTitleSkips atomic.Int64 // OSC-gate fast path — title poll skipped
+
+	// Chrome paint pipeline.
+	sidebarDraws     atomic.Int64
+	sidebarCacheHits atomic.Int64
+	sidebarNs        atomic.Int64
+	sidebarMaxNs     atomic.Int64
+
+	tabbarDraws     atomic.Int64
+	tabbarCacheHits atomic.Int64
+	tabbarNs        atomic.Int64
+
+	statusDraws     atomic.Int64
+	statusCacheHits atomic.Int64
+	statusNs        atomic.Int64
+
+	// Snapshot replay (focus / spawn / nudge).
+	snapshotReplays atomic.Int64
+	snapshotNs      atomic.Int64
+	snapshotMaxNs   atomic.Int64
+
+	// Chrome ticker — distinguishes useful work from idle wakeups.
+	tickerFires     atomic.Int64
+	tickerIdleFires atomic.Int64 // nothing dirty when the ticker fired
+
+	// Output destination (set when enabled).
+	rowFile *os.File // metrics.jsonl
+	dir     string
+}
+
+// newMetricsTracker creates an enabled tracker writing to <dir>/.
+// Returns nil + nil err if dir is empty (feature off). Caller must
+// call tracker.run(ctx) in a goroutine and tracker.close() at exit.
+func newMetricsTracker(dir string) (*metricsTracker, error) {
+	if dir == "" {
+		return nil, nil
+	}
+	if err := os.MkdirAll(dir, 0o700); err != nil {
+		return nil, err
+	}
+	row, err := os.Create(filepath.Join(dir, "metrics.jsonl"))
+	if err != nil {
+		return nil, err
+	}
+	return &metricsTracker{
+		startedAt: time.Now(),
+		rowFile:   row,
+		dir:       dir,
+	}, nil
+}
+
+// observeMax updates dst to max(dst, v) using a CAS loop. Atomic max
+// isn't a hardware primitive on most CPUs; this is the standard idiom.
+// Racing writers just retry the CAS, so dst settles at the true max.
+func observeMax(dst *atomic.Int64, v int64) {
+	for {
+		old := dst.Load()
+		if v <= old {
+			return
+		}
+		if dst.CompareAndSwap(old, v) {
+			return
+		}
+	}
+}
+
+// recordPTYOut is called once at the end of each OnPTYOut invocation.
+// `dur` is the full per-chunk wall time (renderer + stdout + chrome
+// signals); `bytes` is the chunk's byte count.
+func (m *metricsTracker) recordPTYOut(dur time.Duration, bytes int) {
+	if m == nil {
+		return
+	}
+	m.ptyChunks.Add(1)
+	m.ptyBytes.Add(int64(bytes))
+	ns := dur.Nanoseconds()
+	m.onPTYOutNs.Add(ns)
+	observeMax(&m.onPTYOutMaxNs, ns)
+}
+
+func (m *metricsTracker) recordPTYOutDrop() {
+	if m == nil {
+		return
+	}
+	m.onPTYOutDrops.Add(1)
+}
+
+func (m *metricsTracker) recordRender(dur time.Duration) {
+	if m == nil {
+		return
+	}
+	m.renderCalls.Add(1)
+	ns := dur.Nanoseconds()
+	m.renderNs.Add(ns)
+	observeMax(&m.renderMaxNs, ns)
+}
+
+func (m *metricsTracker) recordStdout(dur time.Duration, bytes int) {
+	if m == nil {
+		return
+	}
+	m.stdoutWrites.Add(1)
+	m.stdoutBytes.Add(int64(bytes))
+	ns := dur.Nanoseconds()
+	m.stdoutNs.Add(ns)
+	observeMax(&m.stdoutMaxNs, ns)
+}
+
+func (m *metricsTracker) recordEmuWrite(dur time.Duration) {
+	if m == nil {
+		return
+	}
+	m.emuWriteCalls.Add(1)
+	ns := dur.Nanoseconds()
+	m.emuWriteNs.Add(ns)
+	observeMax(&m.emuWriteMaxNs, ns)
+}
+
+func (m *metricsTracker) recordEmuTitle(dur time.Duration, skipped bool) {
+	if m == nil {
+		return
+	}
+	if skipped {
+		m.emuTitleSkips.Add(1)
+		return
+	}
+	m.emuTitleCalls.Add(1)
+	m.emuTitleNs.Add(dur.Nanoseconds())
+}
+
+func (m *metricsTracker) recordSidebar(dur time.Duration, cacheHit bool) {
+	if m == nil {
+		return
+	}
+	m.sidebarDraws.Add(1)
+	if cacheHit {
+		m.sidebarCacheHits.Add(1)
+	}
+	ns := dur.Nanoseconds()
+	m.sidebarNs.Add(ns)
+	observeMax(&m.sidebarMaxNs, ns)
+}
+
+func (m *metricsTracker) recordTabbar(dur time.Duration, cacheHit bool) {
+	if m == nil {
+		return
+	}
+	m.tabbarDraws.Add(1)
+	if cacheHit {
+		m.tabbarCacheHits.Add(1)
+	}
+	m.tabbarNs.Add(dur.Nanoseconds())
+}
+
+func (m *metricsTracker) recordStatus(dur time.Duration, cacheHit bool) {
+	if m == nil {
+		return
+	}
+	m.statusDraws.Add(1)
+	if cacheHit {
+		m.statusCacheHits.Add(1)
+	}
+	m.statusNs.Add(dur.Nanoseconds())
+}
+
+func (m *metricsTracker) recordSnapshot(dur time.Duration) {
+	if m == nil {
+		return
+	}
+	m.snapshotReplays.Add(1)
+	ns := dur.Nanoseconds()
+	m.snapshotNs.Add(ns)
+	observeMax(&m.snapshotMaxNs, ns)
+}
+
+func (m *metricsTracker) recordTickerFire(didWork bool) {
+	if m == nil {
+		return
+	}
+	m.tickerFires.Add(1)
+	if !didWork {
+		m.tickerIdleFires.Add(1)
+	}
+}
+
+// metricsSnapshot is the tracker's state at one instant, as a
+// JSON-serialisable struct. Suitable for both the per-second JSONL row
+// and the final metrics.json aggregate.
+type metricsSnapshot struct {
+	WallSeconds   float64 `json:"wall_seconds"`
+	PTYChunks     int64   `json:"pty_chunks"`
+	PTYBytes      int64   `json:"pty_bytes"`
+	OnPTYOutNs    int64   `json:"on_pty_out_ns_total"`
+	OnPTYOutMaxNs int64   `json:"on_pty_out_ns_max"`
+	OnPTYOutDrops int64   `json:"on_pty_out_drops"`
+	StdoutWrites  int64   `json:"stdout_writes"`
+	StdoutBytes   int64   `json:"stdout_bytes"`
+	StdoutNs      int64   `json:"stdout_ns_total"`
+	StdoutMaxNs   int64   `json:"stdout_ns_max"`
+
+	RenderCalls int64 `json:"render_calls"`
+	RenderNs    int64 `json:"render_ns_total"`
+	RenderMaxNs int64 `json:"render_ns_max"`
+
+	EmuWriteCalls int64 `json:"emu_write_calls"`
+	EmuWriteNs    int64 `json:"emu_write_ns_total"`
+	EmuWriteMaxNs int64 `json:"emu_write_ns_max"`
+	EmuTitleCalls int64 `json:"emu_title_calls"`
+	EmuTitleNs    int64 `json:"emu_title_ns_total"`
+	EmuTitleSkips int64 `json:"emu_title_skips"`
+
+	SidebarDraws     int64 `json:"sidebar_draws"`
+	SidebarCacheHits int64 `json:"sidebar_cache_hits"`
+	SidebarNs        int64 `json:"sidebar_ns_total"`
+	SidebarMaxNs     int64 `json:"sidebar_ns_max"`
+
+	TabbarDraws     int64 `json:"tabbar_draws"`
+	TabbarCacheHits int64 `json:"tabbar_cache_hits"`
+	TabbarNs        int64 `json:"tabbar_ns_total"`
+
+	StatusDraws     int64 `json:"status_draws"`
+	StatusCacheHits int64 `json:"status_cache_hits"`
+	StatusNs        int64 `json:"status_ns_total"`
+
+	SnapshotReplays int64 `json:"snapshot_replays"`
+	SnapshotNs      int64 `json:"snapshot_ns_total"`
+	SnapshotMaxNs   int64 `json:"snapshot_ns_max"`
+
+	TickerFires     int64 `json:"ticker_fires"`
+	TickerIdleFires int64 `json:"ticker_idle_fires"`
+
+	// Derived rates (computed at snapshot time so consumers don't have
+	// to). All "per_sec" values are averaged over wall_seconds.
+	PTYChunksPerSec     float64 `json:"pty_chunks_per_sec"`
+	PTYBytesPerSec      float64 `json:"pty_bytes_per_sec"`
+	OnPTYOutMeanUs      float64 `json:"on_pty_out_mean_us"`
+	StdoutMeanUs        float64 `json:"stdout_mean_us"`
+	EmuWriteMeanUs      float64 `json:"emu_write_mean_us"`
+	SidebarMeanUs       float64 `json:"sidebar_mean_us"`
+	SidebarCacheHitRate float64 `json:"sidebar_cache_hit_rate"`
+	TabbarCacheHitRate  float64 `json:"tabbar_cache_hit_rate"`
+	StatusCacheHitRate  float64 `json:"status_cache_hit_rate"`
+	EmuTitleSkipRate    float64 `json:"emu_title_skip_rate"`
+	TickerIdleRate      float64 `json:"ticker_idle_rate"`
+	Timestamp           string  `json:"timestamp"`
+}
+
+func (m *metricsTracker) snapshotNow() metricsSnapshot {
+	wall := time.Since(m.startedAt).Seconds()
+	if wall <= 0 {
+		wall = 1
+	}
+	chunks := m.ptyChunks.Load()
+	bytes := m.ptyBytes.Load()
+	onptyTotal := m.onPTYOutNs.Load()
+	stdW := m.stdoutWrites.Load()
+	stdNs := m.stdoutNs.Load()
+	emuW := m.emuWriteCalls.Load()
+	emuWNs := m.emuWriteNs.Load()
+	sbDraws := m.sidebarDraws.Load()
+	sbHits := m.sidebarCacheHits.Load()
+	sbNs := m.sidebarNs.Load()
+	tbDraws := m.tabbarDraws.Load()
+	tbHits := m.tabbarCacheHits.Load()
+	stDraws := m.statusDraws.Load()
+	stHits := m.statusCacheHits.Load()
+	emuTC := m.emuTitleCalls.Load()
+	emuTS := m.emuTitleSkips.Load()
+	tickerF := m.tickerFires.Load()
+	tickerI := m.tickerIdleFires.Load()
+
+	div := func(num, denom int64) float64 {
+		if denom == 0 {
+			return 0
+		}
+		return float64(num) / float64(denom)
+	}
+
+	return metricsSnapshot{
+		WallSeconds:   wall,
+		PTYChunks:     chunks,
+		PTYBytes:      bytes,
+		OnPTYOutNs:    onptyTotal,
+		OnPTYOutMaxNs: m.onPTYOutMaxNs.Load(),
+		OnPTYOutDrops: m.onPTYOutDrops.Load(),
+		StdoutWrites:  stdW,
+		StdoutBytes:   m.stdoutBytes.Load(),
+		StdoutNs:      stdNs,
+		StdoutMaxNs:   m.stdoutMaxNs.Load(),
+
+		RenderCalls: m.renderCalls.Load(),
+		RenderNs:    m.renderNs.Load(),
+		RenderMaxNs: m.renderMaxNs.Load(),
+
+		EmuWriteCalls: emuW,
+		EmuWriteNs:    emuWNs,
+		EmuWriteMaxNs: m.emuWriteMaxNs.Load(),
+		EmuTitleCalls: emuTC,
+		EmuTitleNs:    m.emuTitleNs.Load(),
+		EmuTitleSkips: emuTS,
+
+		SidebarDraws:     sbDraws,
+		SidebarCacheHits: sbHits,
+		SidebarNs:        sbNs,
+		SidebarMaxNs:     m.sidebarMaxNs.Load(),
+
+		TabbarDraws:     tbDraws,
+		TabbarCacheHits: tbHits,
+		TabbarNs:        m.tabbarNs.Load(),
+
+		StatusDraws:     stDraws,
+		StatusCacheHits: stHits,
+		StatusNs:        m.statusNs.Load(),
+
+		SnapshotReplays: m.snapshotReplays.Load(),
+		SnapshotNs:      m.snapshotNs.Load(),
+		SnapshotMaxNs:   m.snapshotMaxNs.Load(),
+
+		TickerFires:     tickerF,
+		TickerIdleFires: tickerI,
+
+		PTYChunksPerSec:     float64(chunks) / wall,
+		PTYBytesPerSec:      float64(bytes) / wall,
+		OnPTYOutMeanUs:      div(onptyTotal, chunks) / 1000,
+		StdoutMeanUs:        div(stdNs, stdW) / 1000,
+		EmuWriteMeanUs:      div(emuWNs, emuW) / 1000,
+		SidebarMeanUs:       div(sbNs, sbDraws) / 1000,
+		SidebarCacheHitRate: div(sbHits, sbDraws),
+		TabbarCacheHitRate:  div(tbHits, tbDraws),
+		StatusCacheHitRate:  div(stHits, stDraws),
+		EmuTitleSkipRate:    div(emuTS, emuTC+emuTS),
+		TickerIdleRate:      div(tickerI, tickerF),
+		Timestamp:           time.Now().Format(time.RFC3339Nano),
+	}
+}
+
+// run is the snapshotter goroutine: write a JSONL row every second
+// until ctx is cancelled. Stops cleanly without flushing partial
+// rows.
+func (m *metricsTracker) run(ctx context.Context) {
+	if m == nil {
+		return
+	}
+	enc := json.NewEncoder(m.rowFile)
+	ticker := time.NewTicker(time.Second)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			return
+		case <-ticker.C:
+			snap := m.snapshotNow()
+			_ = enc.Encode(snap)
+		}
+	}
+}
+
+// close writes the final aggregate snapshot to metrics.json + a
+// short human-readable summary.txt, then closes the row file. Safe
+// to call on a nil receiver.
+func (m *metricsTracker) close() {
+	if m == nil {
+		return
+	}
+	snap := m.snapshotNow()
+	if f, err := os.Create(filepath.Join(m.dir, "metrics.json")); err == nil {
+		enc := json.NewEncoder(f)
+		enc.SetIndent("", "  ")
+		_ = enc.Encode(snap)
+		_ = f.Close()
+	}
+	if f, err := os.Create(filepath.Join(m.dir, "summary.txt")); err == nil {
+		writeSummary(f, snap)
+		_ = f.Close()
+	}
+	if m.rowFile != nil {
+		_ = m.rowFile.Close()
+		m.rowFile = nil
+	}
+}
+
+// writeSummary renders a brief human-readable digest of a snapshot.
+// Designed for `cat summary.txt` after a session — quick orientation
+// before diving into metrics.json / pprof.
+func writeSummary(w *os.File, s metricsSnapshot) {
+	fmt.Fprintf(w, "patterm performance summary\n")
+	fmt.Fprintf(w, "===========================\n\n")
+	fmt.Fprintf(w, "session length:        %.1fs\n", s.WallSeconds)
+	fmt.Fprintf(w, "pty chunks:            %d  (%.1f /s)\n", s.PTYChunks, s.PTYChunksPerSec)
+	fmt.Fprintf(w, "pty bytes:             %d  (%.0f /s, %.1f KiB/s)\n",
+		s.PTYBytes, s.PTYBytesPerSec, s.PTYBytesPerSec/1024)
+	fmt.Fprintf(w, "pty chunks dropped:    %d  (focus not on caller — fast-path return)\n", s.OnPTYOutDrops)
+	fmt.Fprintf(w, "\n")
+	fmt.Fprintf(w, "OnPTYOut mean:         %.1fµs   max: %.1fms\n",
+		s.OnPTYOutMeanUs, float64(s.OnPTYOutMaxNs)/1e6)
+	fmt.Fprintf(w, "viewport.Render calls: %d  total %.1fms  max %.1fms\n",
+		s.RenderCalls, float64(s.RenderNs)/1e6, float64(s.RenderMaxNs)/1e6)
+	fmt.Fprintf(w, "stdout writes:         %d  mean %.1fµs  max %.1fms  bytes %d\n",
+		s.StdoutWrites, s.StdoutMeanUs, float64(s.StdoutMaxNs)/1e6, s.StdoutBytes)
+	fmt.Fprintf(w, "\n")
+	fmt.Fprintf(w, "emulator.Write (cgo):  %d  mean %.1fµs  max %.1fms\n",
+		s.EmuWriteCalls, s.EmuWriteMeanUs, float64(s.EmuWriteMaxNs)/1e6)
+	fmt.Fprintf(w, "emulator.Title polls:  %d real, %d gated   skip rate %.1f%%\n",
+		s.EmuTitleCalls, s.EmuTitleSkips, s.EmuTitleSkipRate*100)
+	fmt.Fprintf(w, "\n")
+	fmt.Fprintf(w, "sidebar draws:         %d  mean %.1fµs  max %.1fms  cache-hit %.1f%%\n",
+		s.SidebarDraws, s.SidebarMeanUs, float64(s.SidebarMaxNs)/1e6, s.SidebarCacheHitRate*100)
+	fmt.Fprintf(w, "tabbar draws:          %d  cache-hit %.1f%%\n",
+		s.TabbarDraws, s.TabbarCacheHitRate*100)
+	fmt.Fprintf(w, "status draws:          %d  cache-hit %.1f%%\n",
+		s.StatusDraws, s.StatusCacheHitRate*100)
+	fmt.Fprintf(w, "snapshot replays:      %d  total %.1fms  max %.1fms\n",
+		s.SnapshotReplays, float64(s.SnapshotNs)/1e6, float64(s.SnapshotMaxNs)/1e6)
+	fmt.Fprintf(w, "\n")
+	fmt.Fprintf(w, "chrome ticker:         %d fires, %d idle   idle rate %.1f%%\n",
+		s.TickerFires, s.TickerIdleFires, s.TickerIdleRate*100)
+}
--- a/internal/app/metrics_test.go
+++ b/internal/app/metrics_test.go
@@ -0,0 +1,116 @@
+package app
+
+import (
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"testing"
+	"time"
+)
+
+func TestMetricsTrackerDisabledByEmptyDir(t *testing.T) {
+	m, err := newMetricsTracker("")
+	if err != nil {
+		t.Fatalf("newMetricsTracker(\"\") err: %v", err)
+	}
+	if m != nil {
+		t.Fatalf("expected nil tracker for empty dir, got %v", m)
+	}
+}
+
+func TestMetricsTrackerRecordsAndWrites(t *testing.T) {
+	dir := t.TempDir()
+	m, err := newMetricsTracker(dir)
+	if err != nil {
+		t.Fatalf("newMetricsTracker: %v", err)
+	}
+	if m == nil {
+		t.Fatal("expected non-nil tracker")
+	}
+
+	m.recordPTYOut(2*time.Millisecond, 1024)
+	m.recordPTYOut(5*time.Millisecond, 4096)
+	m.recordRender(800 * time.Microsecond)
+	m.recordStdout(300*time.Microsecond, 1100)
+	m.recordEmuWrite(150 * time.Microsecond)
+	m.recordEmuTitle(0, true)
+	m.recordEmuTitle(20*time.Microsecond, false)
+	m.recordSidebar(100*time.Microsecond, true)
+	m.recordSidebar(900*time.Microsecond, false)
+	m.recordTabbar(50*time.Microsecond, true)
+	m.recordStatus(40*time.Microsecond, true)
+	m.recordSnapshot(2 * time.Millisecond)
+	m.recordTickerFire(false)
+	m.recordTickerFire(true)
+	m.recordPTYOutDrop()
+
+	m.close()
+
+	// metrics.json should exist and parse, and reflect what we recorded.
+	raw, err := os.ReadFile(filepath.Join(dir, "metrics.json"))
+	if err != nil {
+		t.Fatalf("read metrics.json: %v", err)
+	}
+	var snap metricsSnapshot
+	if err := json.Unmarshal(raw, &snap); err != nil {
+		t.Fatalf("parse metrics.json: %v", err)
+	}
+	if snap.PTYChunks != 2 {
+		t.Errorf("PTYChunks = %d, want 2", snap.PTYChunks)
+	}
+	if snap.PTYBytes != 5120 {
+		t.Errorf("PTYBytes = %d, want 5120", snap.PTYBytes)
+	}
+	if snap.OnPTYOutMaxNs != (5 * time.Millisecond).Nanoseconds() {
+		t.Errorf("OnPTYOutMaxNs = %d, want %d",
+			snap.OnPTYOutMaxNs, (5 * time.Millisecond).Nanoseconds())
+	}
+	if snap.SidebarDraws != 2 {
+		t.Errorf("SidebarDraws = %d, want 2", snap.SidebarDraws)
+	}
+	if snap.SidebarCacheHits != 1 {
+		t.Errorf("SidebarCacheHits = %d, want 1", snap.SidebarCacheHits)
+	}
+	if snap.SidebarCacheHitRate != 0.5 {
+		t.Errorf("SidebarCacheHitRate = %v, want 0.5", snap.SidebarCacheHitRate)
+	}
+	if snap.EmuTitleCalls != 1 || snap.EmuTitleSkips != 1 {
+		t.Errorf("emu title accounting: calls=%d skips=%d, want 1/1",
+			snap.EmuTitleCalls, snap.EmuTitleSkips)
+	}
+	if snap.TickerFires != 2 || snap.TickerIdleFires != 1 {
+		t.Errorf("ticker accounting: fires=%d idle=%d, want 2/1",
+			snap.TickerFires, snap.TickerIdleFires)
+	}
+	if snap.OnPTYOutDrops != 1 {
+		t.Errorf("OnPTYOutDrops = %d, want 1", snap.OnPTYOutDrops)
+	}
+
+	// summary.txt should also be present and non-empty.
+	info, err := os.Stat(filepath.Join(dir, "summary.txt"))
+	if err != nil {
+		t.Fatalf("stat summary.txt: %v", err)
+	}
+	if info.Size() == 0 {
+		t.Fatal("summary.txt is empty")
+	}
+}
+
+func TestMetricsTrackerNilSafe(t *testing.T) {
+	// Every record* method must be safe to call on a nil receiver
+	// because the hot paths use that to avoid an enabled-check.
+	var m *metricsTracker
+	m.recordPTYOut(time.Millisecond, 100)
+	m.recordPTYOutDrop()
+	m.recordRender(time.Microsecond)
+	m.recordStdout(time.Microsecond, 50)
+	m.recordEmuWrite(time.Microsecond)
+	m.recordEmuTitle(time.Microsecond, false)
+	m.recordEmuTitle(0, true)
+	m.recordSidebar(time.Microsecond, true)
+	m.recordTabbar(time.Microsecond, false)
+	m.recordStatus(time.Microsecond, true)
+	m.recordSnapshot(time.Microsecond)
+	m.recordTickerFire(true)
+	m.close()
+}
--- a/internal/app/palette.go
+++ b/internal/app/palette.go
--- a/internal/app/palette_context_test.go
+++ b/internal/app/palette_context_test.go
@@ -31,8 +31,10 @@ func findItem(p *paletteState, want string) (int, *paletteItem) {

 func TestContextItemsScratchpad(t *testing.T) {
 	p := newPalette(nil, "", "notes.md", preset.Set{})
-	if i, _ := findItem(p, "pad-delete"); i != 0 {
-		t.Fatalf("pad-delete at %d; want top", i)
+	// pad-delete is the first selectable row; the Focused section header
+	// (a non-selectable row) sits above it.
+	if i, _ := findItem(p, "pad-delete"); i != 1 {
+		t.Fatalf("pad-delete at %d; want 1 (after Focused header)", i)
 	}
 	if _, it := findItem(p, "pad-rename-form"); it == nil || it.action.padName != "notes.md" {
 		t.Fatalf("pad-rename-form missing or wrong padName: %+v", it)
--- a/internal/app/palette_input_test.go
+++ b/internal/app/palette_input_test.go
@@ -47,36 +47,50 @@ func TestPaletteBareEscCancels(t *testing.T) {
 	}
 }

+// firstSelectable returns the lowest item index whose action is
+// selectable (not a section header), or -1 if the palette has no
+// selectable rows.
+func firstSelectable(p *paletteState) int {
+	for i, it := range p.items {
+		if it.action.kind != "header" {
+			return i
+		}
+	}
+	return -1
+}
+
 func TestPaletteKittyArrowsNavigate(t *testing.T) {
 	pr := []*preset.Preset{{Name: "a"}, {Name: "b"}, {Name: "c"}}
 	p := newPalette(nil, "", "", preset.Set{Agents: pr})
-	if p.cursor != 0 {
-		t.Fatalf("initial cursor %d", p.cursor)
+	first := firstSelectable(p)
+	if first < 0 || p.cursor != first {
+		t.Fatalf("initial cursor %d, want first selectable %d", p.cursor, first)
 	}
 	// Kitty functional Down arrow.
 	_, _, adv := p.handleInput([]byte("\x1b[57353u"), 0)
 	if adv != 8 {
 		t.Fatalf("advance %d", adv)
 	}
-	if p.cursor != 1 {
-		t.Fatalf("cursor %d after Down, want 1", p.cursor)
+	if p.cursor != first+1 {
+		t.Fatalf("cursor %d after Down, want %d", p.cursor, first+1)
 	}
 	// Kitty functional Up arrow.
 	_, _, _ = p.handleInput([]byte("\x1b[57352u"), 0)
-	if p.cursor != 0 {
-		t.Fatalf("cursor %d after Up, want 0", p.cursor)
+	if p.cursor != first {
+		t.Fatalf("cursor %d after Up, want %d", p.cursor, first)
 	}
 }

 func TestPaletteLegacyArrowsStillWork(t *testing.T) {
 	pr := []*preset.Preset{{Name: "a"}, {Name: "b"}}
 	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	first := firstSelectable(p)
 	_, _, adv := p.handleInput([]byte("\x1b[B"), 0)
 	if adv != 3 {
 		t.Fatalf("advance %d", adv)
 	}
-	if p.cursor != 1 {
-		t.Fatalf("cursor %d, want 1", p.cursor)
+	if p.cursor != first+1 {
+		t.Fatalf("cursor %d, want %d", p.cursor, first+1)
 	}
 }

--- a/internal/app/palette_ux_test.go
+++ b/internal/app/palette_ux_test.go
@@ -0,0 +1,385 @@
+package app
+
+import (
+	"strings"
+	"testing"
+
+	"github.com/hjbdev/patterm/internal/preset"
+)
+
+// -- Phase 1: naming & dropped global Close list ---------------------
+
+func TestPaletteVerbsAreUnified(t *testing.T) {
+	procs := []*preset.Preset{{Name: "dev"}}
+	agents := []*preset.Preset{{Name: "claude"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: agents, Processes: procs})
+	gotLabels := make([]string, 0, len(p.items))
+	for _, it := range p.items {
+		if it.action.kind == "header" {
+			continue
+		}
+		gotLabels = append(gotLabels, it.label)
+	}
+	joined := strings.Join(gotLabels, "\n")
+
+	mustContain := []string{
+		"Spawn agent: claude",
+		"Spawn process: dev",
+		"Spawn terminal",
+		"Spawn process… (custom)",
+	}
+	for _, want := range mustContain {
+		if !strings.Contains(joined, want) {
+			t.Errorf("missing unified-verb label %q in:\n%s", want, joined)
+		}
+	}
+	// The pre-overhaul verb forms must not appear anywhere.
+	mustNotContain := []string{"Run process:", "New Terminal"}
+	for _, bad := range mustNotContain {
+		if strings.Contains(joined, bad) {
+			t.Errorf("leftover legacy verb %q present in:\n%s", bad, joined)
+		}
+	}
+}
+
+func TestPaletteDropsGlobalCloseList(t *testing.T) {
+	c1 := makeFakeChild("a", "claude", KindAgent)
+	c2 := makeFakeChild("b", "dev", KindCommand)
+	p := newPalette([]*Child{c1, c2}, "", "", preset.Set{})
+	// No focus → no Focused context, so no "kill" / "agent-close" /
+	// "proc-stop" / "proc-delete" rows should exist at all.
+	for _, kind := range []string{"kill", "agent-close", "proc-stop", "proc-delete"} {
+		if i, _ := findItem(p, kind); i != -1 {
+			t.Fatalf("kind %q present at %d; global Close list should be gone", kind, i)
+		}
+	}
+}
+
+// -- Phase 2: section headers and cursor skip ------------------------
+
+func TestPaletteSectionHeadersPresent(t *testing.T) {
+	c := makeFakeChild("a", "claude", KindAgent)
+	p := newPalette([]*Child{c}, "a", "", preset.Set{Agents: []*preset.Preset{{Name: "codex"}}})
+	wantSections := []string{"Focused", "Open", "Spawn", "Quit"}
+	for _, w := range wantSections {
+		found := false
+		for _, it := range p.items {
+			if it.action.kind == "header" && strings.Contains(it.label, w) {
+				found = true
+				break
+			}
+		}
+		if !found {
+			t.Errorf("section header %q missing from items", w)
+		}
+	}
+}
+
+func TestPaletteCursorSkipsHeaders(t *testing.T) {
+	pr := []*preset.Preset{{Name: "a"}, {Name: "b"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	// Initial cursor must land on a selectable row, never a header.
+	if p.items[p.cursor].action.kind == "header" {
+		t.Fatalf("initial cursor sits on a header: %+v", p.items[p.cursor])
+	}
+	// Walk to the end with cursorDown; every stop must be selectable.
+	for i := 0; i < len(p.items)*2; i++ {
+		p.cursorDown()
+		if p.items[p.cursor].action.kind == "header" {
+			t.Fatalf("cursorDown landed on a header at index %d", p.cursor)
+		}
+	}
+	// Walk back to top.
+	for i := 0; i < len(p.items)*2; i++ {
+		p.cursorUp()
+		if p.items[p.cursor].action.kind == "header" {
+			t.Fatalf("cursorUp landed on a header at index %d", p.cursor)
+		}
+	}
+}
+
+func TestPaletteEnterOnHeaderIsNoOp(t *testing.T) {
+	pr := []*preset.Preset{{Name: "a"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	// Force the cursor onto a header.
+	for i, it := range p.items {
+		if it.action.kind == "header" {
+			p.cursor = i
+			break
+		}
+	}
+	_, done, _ := p.handleInput([]byte("\r"), 0)
+	if done {
+		t.Fatalf("Enter on header closed palette; expected no-op")
+	}
+}
+
+// -- Phase 3: filter chips & macro coexistence -----------------------
+
+func TestPaletteTabCyclesChip(t *testing.T) {
+	p := newTestPalette()
+	// All → Open
+	_, _, _ = p.handleInput([]byte{'\t'}, 0)
+	if string(p.query) != "sw " {
+		t.Fatalf("Tab #1: query %q, want %q", string(p.query), "sw ")
+	}
+	// Open → Spawn
+	_, _, _ = p.handleInput([]byte{'\t'}, 0)
+	if string(p.query) != "sp " {
+		t.Fatalf("Tab #2: query %q, want %q", string(p.query), "sp ")
+	}
+	// Spawn → Close
+	_, _, _ = p.handleInput([]byte{'\t'}, 0)
+	if string(p.query) != "k " {
+		t.Fatalf("Tab #3: query %q, want %q", string(p.query), "k ")
+	}
+	// Close → All (wraps)
+	_, _, _ = p.handleInput([]byte{'\t'}, 0)
+	if string(p.query) != "" {
+		t.Fatalf("Tab #4 wrap: query %q, want empty", string(p.query))
+	}
+}
+
+func TestPaletteShiftTabCyclesBackwards(t *testing.T) {
+	p := newTestPalette()
+	// Shift-Tab via legacy CSI Z: All → Close
+	_, _, _ = p.handleInput([]byte("\x1b[Z"), 0)
+	if string(p.query) != "k " {
+		t.Fatalf("Shift-Tab: query %q, want %q", string(p.query), "k ")
+	}
+}
+
+func TestPaletteBackspaceThroughTrailingMacro(t *testing.T) {
+	p := newTestPalette()
+	p.query = []rune("sw ")
+	p.rebuild()
+	p.backspace()
+	if string(p.query) != "" {
+		t.Fatalf("backspace through 'sw ' left %q; want empty", string(p.query))
+	}
+}
+
+func TestPaletteMacroPreservesQueryCase(t *testing.T) {
+	// Tab cycling shouldn't downcase the user-typed search text.
+	p := newTestPalette()
+	p.query = []rune("Foo")
+	p.rebuild()
+	_, _, _ = p.handleInput([]byte{'\t'}, 0)
+	if string(p.query) != "sw Foo" {
+		t.Fatalf("query after Tab over 'Foo' = %q; want 'sw Foo'", string(p.query))
+	}
+}
+
+// -- Phase 4: scored matching ----------------------------------------
+
+func TestFuzzyScorePrefixBeatsBoundaryBeatsSubstring(t *testing.T) {
+	prefix, _ := fuzzyScore("spawn agent: foo", "", "spa")
+	boundary, _ := fuzzyScore("hello spam", "", "spa")
+	substring, _ := fuzzyScore("escapade", "", "spa")
+	if !(prefix > boundary && boundary > substring) {
+		t.Fatalf("score ordering wrong: prefix=%d boundary=%d substring=%d", prefix, boundary, substring)
+	}
+}
+
+func TestFuzzyScoreReturnsMatchPositions(t *testing.T) {
+	_, pos := fuzzyScore("spawn process: dev", "", "dev")
+	want := []int{15, 16, 17}
+	if len(pos) != len(want) {
+		t.Fatalf("positions = %v, want %v", pos, want)
+	}
+	for i, p := range pos {
+		if p != want[i] {
+			t.Fatalf("pos[%d] = %d, want %d (full %v)", i, p, want[i], pos)
+		}
+	}
+}
+
+func TestPaletteScoredResultsDropHeaders(t *testing.T) {
+	pr := []*preset.Preset{{Name: "claude"}, {Name: "codex"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	// Type a needle that matches both.
+	p.query = []rune("c")
+	p.rebuild()
+	for _, it := range p.items {
+		if it.action.kind == "header" {
+			t.Fatalf("scored mode should not emit header rows; got %+v", it)
+		}
+	}
+}
+
+func TestPaletteScoringFloatsPrefixMatchToTop(t *testing.T) {
+	// "xt" is a prefix of the "xtest" preset; it's a scattered-fuzzy match
+	// against many other rows. Scoring should land the prefix match at
+	// the top regardless of group order.
+	pr := []*preset.Preset{
+		{Name: "alpha"},
+		{Name: "xtest"},
+		{Name: "beta"},
+	}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	p.query = []rune("xt")
+	p.rebuild()
+	if len(p.items) == 0 {
+		t.Fatalf("no scored items for needle 'xt'")
+	}
+	if !strings.Contains(p.items[0].label, "xtest") {
+		t.Fatalf("expected xtest at top of scored list, got %q", p.items[0].label)
+	}
+}
+
+// -- Phase 5: power-user accelerators --------------------------------
+
+func TestPaletteCtrlXOnSwitchKills(t *testing.T) {
+	c := makeFakeChild("a", "claude", KindAgent)
+	p := newPalette([]*Child{c}, "", "", preset.Set{})
+	// Cursor should already be on the switch row (it's the first
+	// selectable item with no Focused section).
+	idx, _ := findItem(p, "switch")
+	if idx < 0 {
+		t.Fatalf("no switch item in palette")
+	}
+	p.cursor = idx
+	action, done, _ := p.handleInput([]byte{0x18}, 0)
+	if !done {
+		t.Fatalf("Ctrl-X on switch row didn't close palette: action=%+v", action)
+	}
+	if action.kind != "kill" || action.childID != "a" {
+		t.Fatalf("Ctrl-X action = %+v, want kill of 'a'", action)
+	}
+}
+
+func TestPaletteCtrlXOnNonSwitchIsNoOp(t *testing.T) {
+	p := newPalette(nil, "", "", preset.Set{})
+	// Cursor parks on Quit or Spawn entries — neither is a switch row.
+	_, done, _ := p.handleInput([]byte{0x18}, 0)
+	if done {
+		t.Fatalf("Ctrl-X on non-switch closed palette")
+	}
+}
+
+func TestPaletteHelpToggle(t *testing.T) {
+	p := newTestPalette()
+	// `?` with empty query opens help.
+	_, done, _ := p.handleInput([]byte("?"), 0)
+	if done {
+		t.Fatalf("? closed palette")
+	}
+	if !p.showHelp {
+		t.Fatalf("? didn't open help")
+	}
+	// Next keystroke dismisses.
+	_, _, _ = p.handleInput([]byte("a"), 0)
+	if p.showHelp {
+		t.Fatalf("help still showing after dismissing keystroke")
+	}
+}
+
+func TestPaletteHelpDoesNotInterceptInQuery(t *testing.T) {
+	p := newTestPalette()
+	p.query = []rune("dev")
+	p.rebuild()
+	_, _, _ = p.handleInput([]byte("?"), 0)
+	if p.showHelp {
+		t.Fatalf("? with non-empty query incorrectly opened help")
+	}
+	if string(p.query) != "dev?" {
+		t.Fatalf("? with non-empty query failed to append: %q", string(p.query))
+	}
+}
+
+func TestPaletteHomeEndJumpsOverHeaders(t *testing.T) {
+	pr := []*preset.Preset{{Name: "a"}, {Name: "b"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	// End jumps to last selectable.
+	p.cursorEnd()
+	if p.items[p.cursor].action.kind == "header" {
+		t.Fatalf("End landed on header: %+v", p.items[p.cursor])
+	}
+	if p.items[p.cursor].action.kind != "quit" {
+		t.Fatalf("End on simple palette should park on Quit; got %+v", p.items[p.cursor])
+	}
+	// Home returns to first selectable.
+	p.cursorHome()
+	if p.items[p.cursor].action.kind == "header" {
+		t.Fatalf("Home landed on header: %+v", p.items[p.cursor])
+	}
+}
+
+func TestPaletteAltDigitQuickPick(t *testing.T) {
+	pr := []*preset.Preset{{Name: "first"}, {Name: "second"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	// Alt-1 picks the first selectable item (Spawn agent: first).
+	action, done, adv := p.handleInput([]byte("\x1b1"), 0)
+	if adv != 2 {
+		t.Fatalf("Alt-1 advance %d, want 2", adv)
+	}
+	if !done {
+		t.Fatalf("Alt-1 didn't close palette")
+	}
+	if action.kind != "spawn-agent" || action.preset == nil || action.preset.Name != "first" {
+		t.Fatalf("Alt-1 action = %+v, want spawn-agent first", action)
+	}
+}
+
+func TestAutoSummaryCadenceCyclesSoloValues(t *testing.T) {
+	p := newPalette(nil, "", "", preset.Set{}, defaultSettings())
+	p.mode = paletteModeAutoSummary
+	for i, row := range autoSummaryRows() {
+		if row.key == "cadence" {
+			p.cursor = i
+			break
+		}
+	}
+	if p.settings.AutoSummary.Cadence != "1m" {
+		t.Fatalf("initial cadence = %q", p.settings.AutoSummary.Cadence)
+	}
+	p.activateAutoSummaryRow()
+	if p.settings.AutoSummary.Cadence != "15s" {
+		t.Fatalf("first cycle cadence = %q", p.settings.AutoSummary.Cadence)
+	}
+	p.activateAutoSummaryRow()
+	if p.settings.AutoSummary.Cadence != "30s" {
+		t.Fatalf("second cycle cadence = %q", p.settings.AutoSummary.Cadence)
+	}
+	p.activateAutoSummaryRow()
+	if p.settings.AutoSummary.Cadence != "1m" {
+		t.Fatalf("third cycle cadence = %q", p.settings.AutoSummary.Cadence)
+	}
+}
+
+func TestPaletteFormCtrlRTogglesRelaunchFromCommandField(t *testing.T) {
+	p := newPalette(nil, "", "", preset.Set{})
+	p.mode = paletteModeSpawnForm
+	p.form = &spawnProcessForm{}
+	// Type without leaving the command field, then Ctrl-R.
+	for _, b := range []byte("xyz") {
+		_, _, _ = p.handleInput([]byte{b}, 0)
+	}
+	if p.form.field != 0 {
+		t.Fatalf("field jumped to %d", p.form.field)
+	}
+	_, _, _ = p.handleInput([]byte{0x12}, 0)
+	if !p.form.relaunch {
+		t.Fatalf("Ctrl-R didn't toggle relaunch from command field")
+	}
+	// Second press toggles back.
+	_, _, _ = p.handleInput([]byte{0x12}, 0)
+	if p.form.relaunch {
+		t.Fatalf("second Ctrl-R didn't toggle off")
+	}
+}
+
+// -- Phase 6: counter / scroll indicator -----------------------------
+
+func TestPaletteFooterCounter(t *testing.T) {
+	pr := []*preset.Preset{{Name: "a"}, {Name: "b"}, {Name: "c"}}
+	p := newPalette(nil, "", "", preset.Set{Agents: pr})
+	total := p.visibleSelectableCount()
+	if total < 4 { // 3 spawn-agents + terminal + custom + quit
+		t.Fatalf("expected ≥4 selectables; got %d", total)
+	}
+	idx := p.selectableIndex()
+	if idx <= 0 {
+		t.Fatalf("selectable index = %d on freshly-built palette; want ≥1", idx)
+	}
+}
--- a/internal/app/session.go
+++ b/internal/app/session.go
@@ -50,6 +50,11 @@ type Session struct {
 	// JSON file so they can be re-spawned after patterm restarts.
 	// Optional; nil means "no persistence" (used by unit tests).
 	persistStore *persist.Store
+
+	// metrics is the optional performance tracker. nil when --profile
+	// is off. The pump goroutine reads it via atomic Load so installing
+	// metrics post-construction doesn't race with running children.
+	metrics atomic.Pointer[metricsTracker]
 }

 // SetPersistStore attaches a process-persistence store. Future Spawn /
@@ -61,6 +66,18 @@ func (s *Session) SetPersistStore(p *persist.Store) {
 	s.mu.Unlock()
 }

+// SetMetrics installs the per-session performance tracker. Safe to
+// call with nil to disable (the default). Reads on the hot path go
+// through atomic.Pointer.Load() with no lock; SetMetrics swaps the
+// pointer once at startup.
+func (s *Session) SetMetrics(m *metricsTracker) {
+	s.metrics.Store(m)
+}
+
+func (s *Session) loadMetrics() *metricsTracker {
+	return s.metrics.Load()
+}
+
 // ChildEventListener is implemented by the TUI to react to lifecycle
 // events without polling.
 type ChildEventListener interface {
@@ -392,17 +409,37 @@ func (s *Session) pumpChild(c *Child, runID uint64) {
 			}
 			chunk := buf[:n]
 			if em := c.Emulator(); em != nil {
+				m := s.loadMetrics()
+				wstart := time.Time{}
+				if m != nil {
+					wstart = time.Now()
+				}
 				if _, werr := em.Write(chunk); werr != nil {
 					logf("emulator.Write(child %s): %v", c.ID, werr)
 				}
+				if m != nil {
+					m.recordEmuWrite(time.Since(wstart))
+				}
 				// OSC 0/2 title updates ride on the same byte stream as
 				// the rest of the output. Polling the emulator after each
-				// Write is cheap (one cgo call returning a borrowed
-				// string) and lets the classifier treat title changes as
-				// an activity signal — even when the title isn't visible
-				// in the rendered grid.
-				if t, terr := em.Title(); terr == nil {
-					c.recordTitle(t)
+				// chunk is cheap on its own (one CGO call) but codex/
+				// ratatui sends so many small chunks that the per-chunk
+				// CGO cost becomes measurable. Skip the Title poll when
+				// the chunk doesn't carry an OSC start byte at all; the
+				// title can only change on chunks that include one.
+				if containsOSC(chunk) {
+					tstart := time.Time{}
+					if m != nil {
+						tstart = time.Now()
+					}
+					if t, terr := em.Title(); terr == nil {
+						c.recordTitle(t)
+					}
+					if m != nil {
+						m.recordEmuTitle(time.Since(tstart), false)
+					}
+				} else if m != nil {
+					m.recordEmuTitle(0, true)
 				}
 			}
 			c.recordWrite(chunk)
@@ -433,6 +470,23 @@ func (s *Session) reapChild(c *Child, runID uint64) {
 	if !c.restarting.Load() {
 		c.cleanupOwnedPaths()
 	}
+	// Terminals are ephemeral: unlike command entries (kept around for
+	// restart_process) and agents (which the user clears via close_process
+	// once they're done with the corpse), an exited terminal has nothing
+	// useful left to do. Drop it from the session so it disappears from
+	// the Processes sidebar / switch list immediately.
+	if c.Kind == KindTerminal && !c.restarting.Load() {
+		c.teardownPTY()
+		s.mu.Lock()
+		delete(s.children, c.ID)
+		for i, oid := range s.order {
+			if oid == c.ID {
+				s.order = append(s.order[:i], s.order[i+1:]...)
+				break
+			}
+		}
+		s.mu.Unlock()
+	}
 }

 // killDescendantsOf terminates every still-live direct child of
@@ -662,6 +716,24 @@ func (s *Session) Shutdown() {
 	}
 }

+// containsOSC reports whether chunk holds a sequence that could begin
+// an OSC. OSC starts as ESC ] (0x1b 0x5d) or the 8-bit C1 form (0x9d),
+// so a chunk without either cannot have changed the emulator's OSC
+// title state. Used to short-circuit the per-chunk Title() poll from
+// pumpChild, which otherwise pays a CGO call for every chunk even
+// when codex/ratatui is just emitting SGR-styled output.
+func containsOSC(chunk []byte) bool {
+	for i, b := range chunk {
+		if b == 0x9d {
+			return true
+		}
+		if b == 0x1b && i+1 < len(chunk) && chunk[i+1] == ']' {
+			return true
+		}
+	}
+	return false
+}
+
 func logf(format string, args ...any) {
 	if os.Getenv("PATTERM_DEBUG_LOG") == "" {
 		return
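Reviewer's sketch, not part of the patch: `containsOSC` inspects one chunk at a time, so it is worth checking how it behaves when a PTY read splits a sequence. The program below copies the function from the hunk above and runs it on a few hand-written chunks; the sample byte strings are illustrative assumptions, not captured output.

```go
package main

import "fmt"

// containsOSC is copied from the hunk above.
func containsOSC(chunk []byte) bool {
	for i, b := range chunk {
		if b == 0x9d {
			return true
		}
		if b == 0x1b && i+1 < len(chunk) && chunk[i+1] == ']' {
			return true
		}
	}
	return false
}

func main() {
	samples := []struct {
		name  string
		chunk []byte
	}{
		{"SGR only", []byte("\x1b[1;32mok\x1b[0m")},
		{"OSC 2 title", []byte("\x1b]2;build\x07")},
		{"C1 OSC", []byte{0x9d, '0', ';', 't', 0x07}},
		{"ESC at chunk end", []byte("text\x1b")},
		{"rest of a split OSC", []byte("]2;build\x07")},
	}
	for _, s := range samples {
		fmt.Printf("%-20s %v\n", s.name, containsOSC(s.chunk))
	}
}
```

The last two samples report false. If a read boundary happens to fall between ESC and `]`, neither half triggers the Title poll in pumpChild, so a title set that way would presumably only be noticed on a later chunk that contains an OSC start. The same would apply to an OSC whose terminator arrives in a following chunk.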
--- a/internal/app/settings.go
+++ b/internal/app/settings.go
@@ -0,0 +1,150 @@
+package app
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"path/filepath"
+
+	"github.com/hjbdev/patterm/internal/preset"
+)
+
+const (
+	defaultSummaryProvider = "codex"
+	defaultCodexModel      = "gpt-5.4-mini"
+	defaultOpenCodeModel   = "opencode-go/minimax-m2.7"
+	defaultClaudeModel     = "claude-haiku-4-5"
+)
+
+type settings struct {
+	AutoSummary autoSummarySettings `json:"auto_summary"`
+}
+
+type autoSummarySettings struct {
+	Enabled         bool              `json:"enabled"`
+	Provider        string            `json:"provider"`
+	Models          map[string]string `json:"models"`
+	Cadence         string            `json:"cadence"`
+	QuietWindowMS   int               `json:"quiet_window_ms"`
+	MinInputChars   int               `json:"min_input_chars"`
+	MaxHistoryChars int               `json:"max_history_chars"`
+}
+
+func defaultSettings() settings {
+	return settings{
+		AutoSummary: autoSummarySettings{
+			Enabled:         true,
+			Provider:        defaultSummaryProvider,
+			Models:          defaultSummaryModels(),
+			Cadence:         "1m",
+			QuietWindowMS:   3000,
+			MinInputChars:   4,
+			MaxHistoryChars: 12000,
+		},
+	}
+}
+
+func defaultSummaryModels() map[string]string {
+	return map[string]string{
+		"codex":    defaultCodexModel,
+		"opencode": defaultOpenCodeModel,
+		"claude":   defaultClaudeModel,
+	}
+}
+
+func loadSettings() (settings, string, error) {
+	base, err := preset.ConfigDir()
+	if err != nil {
+		return settings{}, "", err
+	}
+	path := filepath.Join(base, "settings.json")
+	st := defaultSettings()
+	b, err := os.ReadFile(path)
+	if err != nil {
+		if os.IsNotExist(err) {
+			return st, path, nil
+		}
+		return st, path, fmt.Errorf("settings: read %s: %w", path, err)
+	}
+	if err := json.Unmarshal(b, &st); err != nil {
+		return defaultSettings(), path, fmt.Errorf("settings: parse %s: %w", path, err)
+	}
+	st.normalize()
+	return st, path, nil
+}
+
+func saveSettings(path string, st settings) error {
+	if path == "" {
+		return fmt.Errorf("settings: empty path")
+	}
+	st.normalize()
+	if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil {
+		return err
+	}
+	b, err := json.MarshalIndent(st, "", "  ")
+	if err != nil {
+		return err
+	}
+	b = append(b, '\n')
+	return os.WriteFile(path, b, 0o600)
+}
+
+func (st *settings) normalize() {
+	def := defaultSettings()
+	if st.AutoSummary.Provider == "" {
+		st.AutoSummary.Provider = def.AutoSummary.Provider
+	}
+	switch st.AutoSummary.Provider {
+	case "codex", "opencode", "claude":
+	default:
+		st.AutoSummary.Provider = def.AutoSummary.Provider
+	}
+	if st.AutoSummary.Models == nil {
+		st.AutoSummary.Models = defaultSummaryModels()
+	} else {
+		for k, v := range defaultSummaryModels() {
+			if st.AutoSummary.Models[k] == "" {
+				st.AutoSummary.Models[k] = v
+			}
+		}
+	}
+	if st.AutoSummary.Cadence == "" {
+		st.AutoSummary.Cadence = def.AutoSummary.Cadence
+	}
+	if st.AutoSummary.QuietWindowMS <= 0 {
+		st.AutoSummary.QuietWindowMS = def.AutoSummary.QuietWindowMS
+	}
+	if st.AutoSummary.MinInputChars <= 0 {
+		st.AutoSummary.MinInputChars = def.AutoSummary.MinInputChars
+	}
+	if st.AutoSummary.MaxHistoryChars <= 0 {
+		st.AutoSummary.MaxHistoryChars = def.AutoSummary.MaxHistoryChars
+	}
+}
+
+func (st settings) clone() settings {
+	if st.AutoSummary.Models != nil {
+		models := make(map[string]string, len(st.AutoSummary.Models))
+		for k, v := range st.AutoSummary.Models {
+			models[k] = v
+		}
+		st.AutoSummary.Models = models
+	}
+	st.normalize()
+	return st
+}
+
+func (a autoSummarySettings) clone() autoSummarySettings {
+	st := settings{AutoSummary: a}.clone()
+	return st.AutoSummary
+}
+
+func (a autoSummarySettings) modelFor(provider string) string {
+	if a.Models == nil {
+		return defaultSummaryModels()[provider]
+	}
+	if m := a.Models[provider]; m != "" {
+		return m
+	}
+	return defaultSummaryModels()[provider]
+}
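The clone methods in settings.go exist because copying a Go struct by value does not copy the map it points at. A minimal sketch of that behaviour, using a hypothetical `config` type as a stand-in for the settings struct (the type and the model names are assumptions made for illustration):

```go
package main

import "fmt"

// config is a hypothetical stand-in for the settings struct in the diff.
type config struct {
	Models map[string]string
}

// clone returns a copy whose Models map is independent of the receiver's.
func (c config) clone() config {
	if c.Models != nil {
		models := make(map[string]string, len(c.Models))
		for k, v := range c.Models {
			models[k] = v
		}
		c.Models = models
	}
	return c
}

func main() {
	orig := config{Models: map[string]string{"codex": "a"}}

	// A plain struct copy shares the underlying map with the original.
	shallow := orig
	shallow.Models["codex"] = "b"
	fmt.Println(orig.Models["codex"]) // b

	// A cloned value gets its own map, so the original is untouched.
	deep := orig.clone()
	deep.Models["codex"] = "c"
	fmt.Println(orig.Models["codex"]) // b
}
```

Any step that writes into the map, such as filling in missing defaults, is only safe on the copy once the map itself has been duplicated.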
--- a/internal/app/settings_test.go
+++ b/internal/app/settings_test.go
@@ -0,0 +1,72 @@
+package app
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+)
+
+func TestLoadSettingsDefaults(t *testing.T) {
+	t.Setenv("XDG_CONFIG_HOME", t.TempDir())
+	st, path, err := loadSettings()
+	if err != nil {
+		t.Fatalf("loadSettings: %v", err)
+	}
+	if filepath.Base(path) != "settings.json" {
+		t.Fatalf("settings path = %q", path)
+	}
+	if !st.AutoSummary.Enabled {
+		t.Fatal("auto-summary should default enabled")
+	}
+	if st.AutoSummary.Provider != "codex" {
+		t.Fatalf("provider = %q want codex", st.AutoSummary.Provider)
+	}
+	if st.AutoSummary.Cadence != "1m" {
+		t.Fatalf("cadence = %q want 1m", st.AutoSummary.Cadence)
+	}
+	if got := st.AutoSummary.modelFor("codex"); got != "gpt-5.4-mini" {
+		t.Fatalf("codex model = %q", got)
+	}
+	if got := st.AutoSummary.modelFor("opencode"); got != "opencode-go/minimax-m2.7" {
+		t.Fatalf("opencode model = %q", got)
+	}
+}
+
+func TestSettingsCloneDoesNotShareModelMap(t *testing.T) {
+	st := defaultSettings()
+	cp := st.clone()
+	cp.AutoSummary.Models["codex"] = "changed"
+	if st.AutoSummary.Models["codex"] == "changed" {
+		t.Fatal("clone shared Models map with original")
+	}
+	a := st.AutoSummary.clone()
+	a.Models["opencode"] = "changed"
+	if st.AutoSummary.Models["opencode"] == "changed" {
+		t.Fatal("autoSummarySettings clone shared Models map with original")
+	}
+}
+
+func TestSaveAndLoadSettings(t *testing.T) {
+	dir := t.TempDir()
+	t.Setenv("XDG_CONFIG_HOME", dir)
+	st := defaultSettings()
+	st.AutoSummary.Provider = "opencode"
+	st.AutoSummary.Models["opencode"] = "minimax/test"
+	path := filepath.Join(dir, "patterm", "settings.json")
+	if err := saveSettings(path, st); err != nil {
+		t.Fatalf("saveSettings: %v", err)
+	}
+	if _, err := os.Stat(path); err != nil {
+		t.Fatalf("settings file missing: %v", err)
+	}
+	got, _, err := loadSettings()
+	if err != nil {
+		t.Fatalf("loadSettings: %v", err)
+	}
+	if got.AutoSummary.Provider != "opencode" {
+		t.Fatalf("provider = %q", got.AutoSummary.Provider)
+	}
+	if got.AutoSummary.modelFor("opencode") != "minimax/test" {
+		t.Fatalf("opencode model = %q", got.AutoSummary.modelFor("opencode"))
+	}
+}
--- a/internal/app/sidebar.go
+++ b/internal/app/sidebar.go
@@ -12,6 +12,128 @@ const (
 	statusRows  = 1
 )

+// fitName returns name truncated to fit budget visible cells, with a
+// trailing "…" when it overflows. Operates on RAW (unstyled) input;
+// the caller wraps the result in SGR. Returns "" when budget <= 0.
+func fitName(name string, budget int) string {
+	if budget <= 0 {
+		return ""
+	}
+	runes := []rune(name)
+	if len(runes) <= budget {
+		return name
+	}
+	if budget == 1 {
+		return "…"
+	}
+	return string(runes[:budget-1]) + "…"
+}
+
+// marqueeWindow returns the window of name starting at offset, exactly
+// budget cells wide. Pre: caller has decided the name overflows budget
+// and offset is in [0, len([]rune(name))-budget]. Operates on RAW
+// (unstyled) input.
+func marqueeWindow(name string, budget, offset int) string {
+	if budget <= 0 {
+		return ""
+	}
+	runes := []rune(name)
+	if len(runes) <= budget {
+		return name
+	}
+	if offset < 0 {
+		offset = 0
+	}
+	end := offset + budget
+	if end > len(runes) {
+		end = len(runes)
+		offset = end - budget
+		if offset < 0 {
+			offset = 0
+		}
+	}
+	return string(runes[offset:end])
+}
+
+// clampVisible truncates s so that its visible (non-SGR) length is at
+// most width cells, appending a reset so no style is left open.
+// Used as a defensive net by write() so a row whose decoration was
+// mis-sized still cannot spill past the sidebar band into the PTY area.
+func clampVisible(s string, width int) string {
+	if width <= 0 {
+		return ""
+	}
+	if visibleLen(s) <= width {
+		return s
+	}
+	var b strings.Builder
+	b.Grow(len(s))
+	visible := 0
+	inEsc := false
+	for _, r := range s {
+		if inEsc {
+			b.WriteRune(r)
+			if r == 'm' || r == 'H' {
+				inEsc = false
+			}
+			continue
+		}
+		if r == 0x1b {
+			inEsc = true
+			b.WriteRune(r)
+			continue
+		}
+		if visible >= width {
+			break
+		}
+		b.WriteRune(r)
+		visible++
+	}
+	b.WriteString(styleReset)
+	return b.String()
+}
+
+// chooseSidebarSuffix decides whether to keep or drop the trailing
+// timer indicator from a sidebar row's suffix. When the row's name
+// would have to ellipsise with the timer present, but the budget
+// freed by dropping the timer still leaves at least 6 cells for the
+// name, the timer is dropped. The name is the only identifier the
+// user has for that row; the timer is recoverable from the status
+// line and palette.
+func chooseSidebarSuffix(nameRuneLen, width int, prefix, suffix, timer string) (string, int) {
+	prefixCost := visibleLen(prefix)
+	budget := width - prefixCost - visibleLen(suffix)
+	if nameRuneLen <= budget || timer == "" {
+		return suffix, budget
+	}
+	slim := strings.TrimSuffix(suffix, timer)
+	if slim == suffix {
+		return suffix, budget
+	}
+	slimBudget := width - prefixCost - visibleLen(slim)
+	if slimBudget >= 6 {
+		return slim, slimBudget
+	}
+	return suffix, budget
+}
+
+// rowNameSlot returns the unstyled name cell for a sidebar row.
+// Unfocused (or focused-and-fitting) rows get fitName with a trailing
+// "…" on overflow. The focused row, when its name overflows the
+// budget, gets the current marquee window — exactly budget cells
+// wide so the surrounding row geometry stays put while it animates.
+func (st *uiState) rowNameSlot(id, rawName string, budget int, focused bool) string {
+	if budget <= 0 {
+		return ""
+	}
+	runes := []rune(rawName)
+	if !focused || len(runes) <= budget {
+		return fitName(rawName, budget)
+	}
+	off, _, _ := st.marquee.step(id, len(runes), budget, time.Now())
+	return marqueeWindow(rawName, budget, off)
+}
+
 // formatShortDuration renders a duration as a short, sidebar-friendly
 // suffix: ms under 1s, "12s" under 60s, "3m" otherwise.
 func formatShortDuration(d time.Duration) string {
@@ -38,6 +160,10 @@ func formatShortDuration(d time.Duration) string {
 // computed main viewport, so the sidebar region is outside the child's
 // cursor range. We can redraw freely without fighting the child for cells.
 func (st *uiState) drawSidebar() {
+	var entry time.Time
+	if st.metrics != nil {
+		entry = time.Now()
+	}
 	st.mu.Lock()
 	palOpen := st.palette != nil
 	focus := st.focusedID
@@ -69,6 +195,9 @@ func (st *uiState) drawSidebar() {
 		if row > maxRow {
 			return
 		}
+		if visibleLen(content) > width {
+			content = clampVisible(content, width)
+		}
 		pad := width - visibleLen(content)
 		if pad < 0 {
 			pad = 0
@@ -150,14 +279,19 @@ func (st *uiState) drawSidebar() {
 		if c.AutoRestart() {
 			marker = " " + styleDim + "⟳" + styleReset
 		}
-		var line string
+		timer := timerIndicator(c)
+		var prefix, openStyle string
 		if focused {
-			line = " " + styleAccent + "▎" + styleReset + " " + glyph + " " +
-				styleBold + c.DisplayName() + styleReset + marker + timerIndicator(c)
+			prefix = " " + styleAccent + "▎" + styleReset + " " + glyph + " "
+			openStyle = styleBold
 		} else {
-			line = "   " + glyph + " " + styleHint + c.DisplayName() + styleReset + marker + timerIndicator(c)
+			prefix = "   " + glyph + " "
+			openStyle = styleHint
 		}
-		write(line)
+		raw := c.DisplayName()
+		suffix, budget := chooseSidebarSuffix(len([]rune(raw)), width, prefix, marker+timer, timer)
+		nameCell := st.rowNameSlot(c.ID, raw, budget, focused)
+		write(prefix + openStyle + nameCell + styleReset + suffix)
 	}

 	// Agent Tree section — formerly "Session tree". Shows the active
@@ -182,14 +316,29 @@ func (st *uiState) drawSidebar() {
 		}
 		focused := c.ID == focus
 		glyph := statusGlyph(c, focused)
-		var line string
+		timer := timerIndicator(c)
+		var prefix, openStyle string
 		if focused {
-			line = " " + styleAccent + "▎" + styleReset + " " + indent + glyph + " " +
-				styleBold + c.DisplayName() + styleReset + timerIndicator(c)
+			prefix = " " + styleAccent + "▎" + styleReset + " " + indent + glyph + " "
+			openStyle = styleBold
 		} else {
-			line = "   " + indent + glyph + " " + styleHint + c.DisplayName() + styleReset + timerIndicator(c)
+			prefix = "   " + indent + glyph + " "
+			openStyle = styleHint
+		}
+		raw := c.DisplayName()
+		suffix, budget := chooseSidebarSuffix(len([]rune(raw)), width, prefix, timer, timer)
+		nameCell := st.rowNameSlot(c.ID, raw, budget, focused)
+		write(prefix + openStyle + nameCell + styleReset + suffix)
+	}
+
+	if summary := st.activeSummaryText(width - 4); summary != "" && row+2 <= maxRow {
+		write("")
+		for _, line := range wrapSidebarSummary(summary, width-4) {
+			if row > maxRow {
+				break
+			}
+			write("   " + styleDim + line + styleReset)
 		}
-		write(line)
 	}

 	// Scratchpads list — names only. The preview pane used to live
@@ -208,14 +357,18 @@ func (st *uiState) drawSidebar() {
 					if row > maxRow {
 						break
 					}
-					var line string
-					if e.Name == focusPad {
-						line = " " + styleAccent + "▎" + styleReset + " " +
-							styleBold + e.Name + styleReset
+					focused := e.Name == focusPad
+					var prefix, openStyle string
+					if focused {
+						prefix = " " + styleAccent + "▎" + styleReset + " "
+						openStyle = styleBold
 					} else {
-						line = "   " + styleHint + e.Name + styleReset
+						prefix = "   "
+						openStyle = styleHint
 					}
-					write(line)
+					budget := width - visibleLen(prefix)
+					nameCell := st.rowNameSlot("pad:"+e.Name, e.Name, budget, focused)
+					write(prefix + openStyle + nameCell + styleReset)
 				}
 			}
 		}
@@ -231,13 +384,58 @@ func (st *uiState) drawSidebar() {
 	st.chromeCacheMu.Lock()
 	if frame == st.sidebarCache {
 		st.chromeCacheMu.Unlock()
+		if st.metrics != nil {
+			st.metrics.recordSidebar(time.Since(entry), true)
+		}
 		return
 	}
 	st.sidebarCache = frame
 	st.chromeCacheMu.Unlock()
+	if st.metrics != nil {
+		defer func() { st.metrics.recordSidebar(time.Since(entry), false) }()
+	}

 	st.outMu.Lock()
 	// Save cursor; emit the sidebar; restore.
 	fmt.Fprintf(os.Stdout, "\x1b7%s\x1b8", frame)
 	st.outMu.Unlock()
 }
+
+func wrapSidebarSummary(s string, width int) []string {
+	if width < 1 {
+		width = 1
+	}
+	words := strings.Fields(s)
+	if len(words) == 0 {
+		return nil
+	}
+	var out []string
+	var cur string
+	for _, word := range words {
+		if visibleLen(word) > width {
+			if cur != "" {
+				out = append(out, cur)
+				cur = ""
+			}
+			out = append(out, clipRunes(word, width-1)+"…")
+			continue
+		}
+		if cur == "" {
+			cur = word
+			continue
+		}
+		if visibleLen(cur)+1+visibleLen(word) <= width {
+			cur += " " + word
+			continue
+		}
+		out = append(out, cur)
+		cur = word
+	}
+	if cur != "" {
+		out = append(out, cur)
+	}
+	if len(out) > 3 {
+		out = out[:3]
+	}
+	return out
+}
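The two name-fitting helpers added to sidebar.go differ only in how they handle overflow: fitName ellipsises, while marqueeWindow slides a fixed-width window. A standalone sketch that copies both functions from the diff and runs them on an illustrative name (the sample string and `main` wrapper are assumptions):

```go
package main

import "fmt"

// fitName truncates name to budget runes, ending in "…" on overflow.
func fitName(name string, budget int) string {
	if budget <= 0 {
		return ""
	}
	runes := []rune(name)
	if len(runes) <= budget {
		return name
	}
	if budget == 1 {
		return "…"
	}
	return string(runes[:budget-1]) + "…"
}

// marqueeWindow returns a budget-wide slice of name starting at offset,
// clamping the offset so the window never runs past the end.
func marqueeWindow(name string, budget, offset int) string {
	if budget <= 0 {
		return ""
	}
	runes := []rune(name)
	if len(runes) <= budget {
		return name
	}
	if offset < 0 {
		offset = 0
	}
	end := offset + budget
	if end > len(runes) {
		end = len(runes)
		offset = end - budget
	}
	return string(runes[offset:end])
}

func main() {
	name := "build-server" // 12 runes
	fmt.Println(fitName(name, 8))           // build-s…
	fmt.Println(marqueeWindow(name, 8, 0))  // build-se
	fmt.Println(marqueeWindow(name, 8, 4))  // d-server
	fmt.Println(marqueeWindow(name, 8, 99)) // d-server (offset clamped)
}
```

Both helpers count runes rather than display cells, so the sketch, like the diff, assumes single-width characters.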
--- a/internal/app/spawn_focus_test.go
+++ b/internal/app/spawn_focus_test.go
@@ -0,0 +1,46 @@
+package app
+
+import (
+	"testing"
+)
+
+// TestOnChildSpawnedAgentChildKeepsFocus verifies that when a child is
+// spawned with a ParentID set (i.e. a patterm-managed agent caused the
+// spawn over MCP), OnChildSpawned does NOT steal viewport focus from
+// the currently focused child.
+func TestOnChildSpawnedAgentChildKeepsFocus(t *testing.T) {
+	sess := NewSession(t.TempDir(), "test")
+	st := &uiState{sess: sess}
+
+	parent := newChildEntry("p_parent", "parent", KindAgent, nil, nil, "", "", "")
+	st.focusedID = parent.ID
+	st.focusedName = parent.Name
+
+	subAgent := newChildEntry("p_sub", "sub", KindAgent, nil, nil, parent.ID, "", "")
+
+	st.OnChildSpawned(subAgent)
+
+	if got := st.focusedID; got != parent.ID {
+		t.Fatalf("agent-initiated spawn should not change focusedID: want %q, got %q", parent.ID, got)
+	}
+	if got := st.focusedName; got != parent.Name {
+		t.Fatalf("focusedName changed: want %q, got %q", parent.Name, got)
+	}
+}
+
+// TestOnChildSpawnedPaletteChildTakesFocus verifies the legacy path is
+// preserved: spawns with an empty ParentID (palette, restore, external
+// MCP caller) still auto-focus the new child.
+func TestOnChildSpawnedPaletteChildTakesFocus(t *testing.T) {
+	sess := NewSession(t.TempDir(), "test")
+	st := &uiState{sess: sess}
+	st.lastExit.Store(-1)
+
+	c := newChildEntry("p_new", "newchild", KindAgent, nil, nil, "", "", "")
+
+	st.OnChildSpawned(c)
+
+	if got := st.focusedID; got != c.ID {
+		t.Fatalf("palette-initiated spawn should auto-focus: want %q, got %q", c.ID, got)
+	}
+}
--- a/internal/app/summarizer.go
+++ b/internal/app/summarizer.go
@@ -0,0 +1,463 @@
+package app
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"os/exec"
+	"strings"
+	"sync"
+	"time"
+	"unicode"
+
+	"github.com/hjbdev/patterm/internal/preset"
+)
+
+const (
+	summaryTickInterval = time.Second
+	summaryTimeout      = 90 * time.Second
+	summaryMaxLineCells = 240
+)
+
+type summaryState struct {
+	Text      string
+	State     IdleState
+	UpdatedAt time.Time
+	Error     string
+}
+
+type summaryManager struct {
+	sess       *Session
+	projectDir string
+	presets    preset.Set
+	settings   func() autoSummarySettings
+	onUpdate   func()
+	onResult   func(string, summaryState)
+
+	mu      sync.Mutex
+	tracked map[string]bool
+	entries map[string]*summaryEntry
+}
+
+type summaryEntry struct {
+	armed          bool
+	dirty          bool
+	running        bool
+	lastInputAt    time.Time
+	lastOutputAt   time.Time
+	lastAttemptAt  time.Time
+	lastSummarized int64
+	state          summaryState
+}
+
+type summarizerResponse struct {
+	Summary string `json:"summary"`
+	State   string `json:"state"`
+}
+
+func newSummaryManager(sess *Session, projectDir string, presets preset.Set, settingsFn func() autoSummarySettings, onUpdate func(), onResult func(string, summaryState)) *summaryManager {
+	return &summaryManager{
+		sess:       sess,
+		projectDir: projectDir,
+		presets:    presets,
+		settings:   settingsFn,
+		onUpdate:   onUpdate,
+		onResult:   onResult,
+		tracked:    make(map[string]bool),
+		entries:    make(map[string]*summaryEntry),
+	}
+}
+
+func (m *summaryManager) run(ctx context.Context) {
+	ticker := time.NewTicker(summaryTickInterval)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			return
+		case <-ticker.C:
+			m.maybeStart(ctx, time.Now())
+		}
+	}
+}
+
+func (m *summaryManager) ObserveHumanInput(childID string, b []byte) {
+	if m == nil || !m.isTracked(childID) {
+		return
+	}
+	cfg := m.settings()
+	if len(strings.TrimSpace(string(b))) < cfg.MinInputChars {
+		return
+	}
+	m.mu.Lock()
+	e := m.entryLocked(childID)
+	e.armed = true
+	e.lastInputAt = time.Now()
+	m.mu.Unlock()
+}
+
+func (m *summaryManager) ObserveOutput(childID string) {
+	if m == nil || !m.isTracked(childID) {
+		return
+	}
+	m.mu.Lock()
+	e := m.entryLocked(childID)
+	if e.armed {
+		e.dirty = true
+		e.lastOutputAt = time.Now()
+	}
+	m.mu.Unlock()
+}
+
+func (m *summaryManager) RegisterChild(c *Child) {
+	if m == nil || c == nil {
+		return
+	}
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	if isTopLevelSummarizedAgent(c) {
+		m.tracked[c.ID] = true
+	} else {
+		delete(m.tracked, c.ID)
+	}
+}
+
+func (m *summaryManager) UnregisterChild(id string) {
+	if m == nil || id == "" {
+		return
+	}
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	delete(m.tracked, id)
+}
+
+func (m *summaryManager) isTracked(id string) bool {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	return m.tracked[id]
+}
+
+func (m *summaryManager) Summary(childID string) summaryState {
+	if m == nil || childID == "" {
+		return summaryState{}
+	}
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	if e := m.entries[childID]; e != nil {
+		return e.state
+	}
+	return summaryState{}
+}
+
+func (m *summaryManager) RunNow(ctx context.Context, childID string) {
+	if m == nil || childID == "" {
+		return
+	}
+	c := m.sess.FindChild(childID)
+	if !isTopLevelSummarizedAgent(c) {
+		return
+	}
+	m.mu.Lock()
+	e := m.entryLocked(c.ID)
+	if e.running {
+		m.mu.Unlock()
+		return
+	}
+	e.running = true
+	e.lastAttemptAt = time.Now()
+	m.mu.Unlock()
+	go m.runOne(ctx, c.ID, true)
+}
+
+func (m *summaryManager) Test(ctx context.Context) error {
+	cfg := m.settings()
+	return runSummarizerHealth(ctx, cfg, m.projectDir)
+}
+
+func (m *summaryManager) entryLocked(id string) *summaryEntry {
+	e := m.entries[id]
+	if e == nil {
+		e = &summaryEntry{}
+		m.entries[id] = e
+	}
+	return e
+}
+
+func (m *summaryManager) maybeStart(ctx context.Context, now time.Time) {
+	cfg := m.settings()
+	if !cfg.Enabled {
+		return
+	}
+	cadence, err := time.ParseDuration(cfg.Cadence)
+	if err != nil || cadence <= 0 {
+		cadence = time.Minute
+	}
+	quiet := time.Duration(cfg.QuietWindowMS) * time.Millisecond
+	var startID string
+	for _, c := range m.sess.Children() {
+		if !isTopLevelSummarizedAgent(c) {
+			continue
+		}
+		m.mu.Lock()
+		e := m.entryLocked(c.ID)
+		eligible := e.armed && e.dirty && !e.running &&
+			!e.lastOutputAt.IsZero() && now.Sub(e.lastOutputAt) >= quiet &&
+			(e.lastAttemptAt.IsZero() || now.Sub(e.lastAttemptAt) >= cadence) &&
+			c.ScreenVersion() != e.lastSummarized
+		if eligible {
+			e.running = true
+			e.lastAttemptAt = now
+			startID = c.ID
+		}
+		m.mu.Unlock()
+		if startID != "" {
+			go m.runOne(ctx, startID, false)
+			return
+		}
+	}
+}
+
+func (m *summaryManager) runOne(ctx context.Context, childID string, manual bool) {
+	c := m.sess.FindChild(childID)
+	if c == nil {
+		m.finish(childID, summaryState{Error: "process disappeared"}, 0)
+		return
+	}
+	cfg := m.settings()
+	snapshot := buildSummarySnapshot(c, cfg.MaxHistoryChars, m.chromeHintsFor(c.PresetRef))
+	if strings.TrimSpace(snapshot) == "" {
+		m.finish(childID, summaryState{Error: "empty snapshot"}, c.ScreenVersion())
+		return
+	}
+	runCtx, cancel := context.WithTimeout(ctx, summaryTimeout)
+	defer cancel()
+	resp, err := runSummarizer(runCtx, cfg, m.projectDir, snapshot)
+	st := summaryState{UpdatedAt: time.Now()}
+	if err != nil {
+		st.Error = err.Error()
+		m.finish(childID, st, c.ScreenVersion())
+		return
+	}
+	st.Text = strings.TrimSpace(resp.Summary)
+	st.State = summaryIdleState(resp.State)
+	if st.Text == "" {
+		st.Error = "empty summary"
+	}
+	if manual && st.Text != "" && st.State == StateUnknown {
+		st.State = c.IdleState()
+	}
+	m.finish(childID, st, c.ScreenVersion())
+}
+
+func (m *summaryManager) finish(childID string, st summaryState, version int64) {
+	m.mu.Lock()
+	e := m.entryLocked(childID)
+	e.running = false
+	if st.Text != "" || st.Error != "" {
+		if st.Text == "" && e.state.Text != "" {
+			st.Text = e.state.Text
+			st.State = e.state.State
+			st.UpdatedAt = e.state.UpdatedAt
+		}
+		e.state = st
+	}
+	if st.Text != "" && st.Error == "" {
+		e.armed = false
+		e.dirty = false
+		e.lastSummarized = version
+	}
+	m.mu.Unlock()
+	if m.onUpdate != nil {
+		m.onUpdate()
+	}
+	if m.onResult != nil && (st.Text != "" || st.Error != "") {
+		m.onResult(childID, st)
+	}
+}
+
+func isTopLevelSummarizedAgent(c *Child) bool {
+	return c != nil && c.Kind == KindAgent && c.ParentID == "" && c.Status() == StatusRunning
+}
+
+func (m *summaryManager) chromeHintsFor(presetName string) []string {
+	if presetName == "" {
+		return nil
+	}
+	for _, p := range m.presets.Agents {
+		if p.Name == presetName {
+			return p.ChromeTrimHints
+		}
+	}
+	return nil
+}
+
+func buildSummarySnapshot(c *Child, maxChars int, chromeHints []string) string {
+	if maxChars <= 0 {
+		maxChars = 12000
+	}
+	grid := ""
+	if em := c.Emulator(); em != nil {
+		if txt, err := em.PlainText(); err == nil {
+			grid = compactSummaryText(applyChromeTrim(txt, chromeHints))
+		}
+	}
+	tailBytes := max(maxChars*4, maxChars)
+	b := c.tailBytes(tailBytes)
+	history := compactSummaryText(applyChromeTrim(string(stripANSIBytes(nil, b)), chromeHints))
+	history = tailString(history, maxChars)
+	var out strings.Builder
+	if history != "" {
+		out.WriteString("Recent rendered history:\n")
+		out.WriteString(history)
+		out.WriteString("\n\n")
+	}
+	if grid != "" && !strings.Contains(history, grid) {
+		out.WriteString("Current visible grid:\n")
+		out.WriteString(grid)
+	}
+	return tailString(out.String(), maxChars)
+}
+
+func compactSummaryText(in string) string {
+	in = string(stripANSIBytes(nil, []byte(in)))
+	in = strings.ReplaceAll(in, "\r\n", "\n")
+	in = strings.ReplaceAll(in, "\r", "\n")
+	lines := strings.Split(in, "\n")
+	out := make([]string, 0, len(lines))
+	blank := false
+	for _, line := range lines {
+		line = strings.TrimRightFunc(line, unicode.IsSpace)
+		line = strings.Map(func(r rune) rune {
+			if r == '\t' || r == '\n' {
+				return r
+			}
+			if r < 0x20 || r == 0x7f {
+				return -1
+			}
+			return r
+		}, line)
+		line = truncateSummaryLine(line, summaryMaxLineCells)
+		if strings.TrimSpace(line) == "" {
+			if blank {
+				continue
+			}
+			blank = true
+			out = append(out, "")
+			continue
+		}
+		blank = false
+		out = append(out, line)
+	}
+	return strings.TrimSpace(strings.Join(out, "\n"))
+}
+
+func truncateSummaryLine(s string, limit int) string {
+	if limit <= 0 || visibleLen(s) <= limit {
+		return s
+	}
+	return clipRunes(s, limit-1) + "…"
+}
+
+func tailString(s string, limit int) string {
+	rs := []rune(s)
+	if len(rs) <= limit {
+		return s
+	}
+	return string(rs[len(rs)-limit:])
+}
+
+func runSummarizer(ctx context.Context, cfg autoSummarySettings, projectDir, snapshot string) (summarizerResponse, error) {
+	prompt := summaryPrompt(snapshot)
+	out, err := runSummarizerCommand(ctx, cfg, projectDir, prompt)
+	if err != nil {
+		return summarizerResponse{}, err
+	}
+	resp, err := parseSummarizerResponse(out)
+	if err != nil {
+		return summarizerResponse{}, err
+	}
+	if summaryIdleState(resp.State) == StateUnknown {
+		return summarizerResponse{}, fmt.Errorf("invalid summary state %q", resp.State)
+	}
+	return resp, nil
+}
+
+func runSummarizerHealth(ctx context.Context, cfg autoSummarySettings, projectDir string) error {
+	out, err := runSummarizerCommand(ctx, cfg, projectDir, "Reply with exactly: patterm okay")
+	if err != nil {
+		return err
+	}
+	if strings.TrimSpace(out) != "patterm okay" {
+		return fmt.Errorf("health check did not return patterm okay")
+	}
+	return nil
+}
+
+func runSummarizerCommand(ctx context.Context, cfg autoSummarySettings, projectDir, prompt string) (string, error) {
+	provider := cfg.Provider
+	model := cfg.modelFor(provider)
+	var cmd *exec.Cmd
+	switch provider {
+	case "opencode":
+		cmd = exec.CommandContext(ctx, "opencode", "run", "--model", model, "--dir", projectDir, prompt)
+	case "claude":
+		cmd = exec.CommandContext(ctx, "claude", "--print", "--model", model, prompt)
+	default:
+		cmd = exec.CommandContext(ctx, "codex", "exec", "--ephemeral", "--skip-git-repo-check", "--sandbox", "read-only", "--ask-for-approval", "never", "--model", model, "-")
+		cmd.Stdin = strings.NewReader(prompt)
+	}
+	cmd.Dir = projectDir
+	var stderr bytes.Buffer
+	cmd.Stderr = &stderr
+	out, err := cmd.Output()
+	if err != nil {
+		msg := strings.TrimSpace(stderr.String())
+		if msg == "" {
+			msg = err.Error()
+		}
+		return "", fmt.Errorf("%s summarizer: %s", provider, msg)
+	}
+	return string(out), nil
+}
+
+func summaryPrompt(snapshot string) string {
+	return "Summarize this terminal/agent snapshot for a compact UI catch-up aid.\n" +
+		"Return only JSON with keys summary and state. State must be one of IDLE, PERMISSION, THINKING, WORKING, ERROR.\n" +
+		"Keep summary under 180 characters, concrete, and avoid mentioning that you are summarizing.\n\n" +
+		snapshot
+}
+
+func parseSummarizerResponse(out string) (summarizerResponse, error) {
+	var resp summarizerResponse
+	if err := json.Unmarshal([]byte(strings.TrimSpace(out)), &resp); err == nil {
+		return resp, nil
+	}
+	for _, line := range strings.Split(out, "\n") {
+		line = strings.TrimSpace(line)
+		if !strings.HasPrefix(line, "{") || !strings.HasSuffix(line, "}") {
+			continue
+		}
+		if err := json.Unmarshal([]byte(line), &resp); err == nil {
+			return resp, nil
+		}
+	}
+	return resp, fmt.Errorf("summary output was not JSON")
+}
+
+func summaryIdleState(s string) IdleState {
+	switch strings.ToUpper(strings.TrimSpace(s)) {
+	case "IDLE":
+		return StateIdle
+	case "PERMISSION":
+		return StatePermission
+	case "THINKING":
+		return StateThinking
+	case "WORKING":
+		return StateWorking
+	case "ERROR":
+		return StateError
+	default:
+		return StateUnknown
+	}
+}
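parseSummarizerResponse tolerates CLI output that wraps the JSON object in log lines: it tries the whole output first, then falls back to scanning line by line. A minimal standalone sketch of that two-pass approach; `response` and `parseLoose` are hypothetical names, and unlike the diff the fallback decodes into a fresh value per line:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// response is a hypothetical stand-in for summarizerResponse.
type response struct {
	Summary string `json:"summary"`
	State   string `json:"state"`
}

// parseLoose tries the whole output as JSON, then each line that looks
// like a single JSON object.
func parseLoose(out string) (response, error) {
	var resp response
	if err := json.Unmarshal([]byte(strings.TrimSpace(out)), &resp); err == nil {
		return resp, nil
	}
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "{") || !strings.HasSuffix(line, "}") {
			continue
		}
		var candidate response
		if err := json.Unmarshal([]byte(line), &candidate); err == nil {
			return candidate, nil
		}
	}
	return response{}, fmt.Errorf("summary output was not JSON")
}

func main() {
	out := "starting model...\n{\"summary\":\"Waiting for tests\",\"state\":\"WORKING\"}\n"
	resp, err := parseLoose(out)
	fmt.Println(resp.Summary, resp.State, err) // Waiting for tests WORKING <nil>
}
```

The line-based fallback only recovers an object that sits on a single line; pretty-printed JSON surrounded by log noise would still be rejected.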
--- a/internal/app/summarizer_test.go
+++ b/internal/app/summarizer_test.go
@@ -0,0 +1,85 @@
+package app
+
+import (
+	"strings"
+	"testing"
+
+	"github.com/hjbdev/patterm/internal/preset"
+)
+
+func TestParseSummarizerResponseAllowsWrappedJSON(t *testing.T) {
+	resp, err := parseSummarizerResponse("log\n{\"summary\":\"Waiting for tests\",\"state\":\"WORKING\"}\n")
+	if err != nil {
+		t.Fatalf("parseSummarizerResponse: %v", err)
+	}
+	if resp.Summary != "Waiting for tests" || summaryIdleState(resp.State) != StateWorking {
+		t.Fatalf("response = %+v", resp)
+	}
+}
+
+func TestCompactSummaryTextDropsControlAndRedundantWhitespace(t *testing.T) {
+	got := compactSummaryText("hello\x00 world  \n\n\n\x1b[31mred\x1b[0m\n")
+	if strings.ContainsRune(got, '\x00') {
+		t.Fatalf("control byte survived: %q", got)
+	}
+	if strings.Contains(got, "\n\n\n") {
+		t.Fatalf("redundant blanks survived: %q", got)
+	}
+	if strings.Contains(got, "\x1b") {
+		t.Fatalf("ansi survived: %q", got)
+	}
+}
+
+func TestWrapSidebarSummaryKeepsWordBoundaries(t *testing.T) {
+	got := wrapSidebarSummary("alpha beta gamma delta", 12)
+	want := []string{"alpha beta", "gamma delta"}
+	if len(got) != len(want) {
+		t.Fatalf("lines = %#v", got)
+	}
+	for i := range want {
+		if got[i] != want[i] {
+			t.Fatalf("line %d = %q want %q", i, got[i], want[i])
+		}
+	}
+	long := wrapSidebarSummary("supercalifragilistic short", 8)
+	if len(long) == 0 || !strings.HasSuffix(long[0], "…") {
+		t.Fatalf("long word should clip with ellipsis: %#v", long)
+	}
+}
+
+func TestSummaryManagerArmsOnlyTrackedTopLevelAgents(t *testing.T) {
+	sess := NewSession(t.TempDir(), "test")
+	c := newChildEntry("a1", "agent", KindAgent, []string{"fake"}, nil, "", "", "")
+	running := StatusRunning
+	c.status.Store(&running)
+	sess.children[c.ID] = c
+	sess.order = append(sess.order, c.ID)
+	cfg := defaultSettings().AutoSummary
+	m := newSummaryManager(sess, t.TempDir(), preset.Set{}, func() autoSummarySettings {
+		return cfg.clone()
+	}, nil, nil)
+	m.ObserveHumanInput(c.ID, []byte("please summarize"))
+	if got := m.Summary(c.ID); got.Text != "" {
+		t.Fatalf("untracked agent should not update summary state: %+v", got)
+	}
+	m.RegisterChild(c)
+	m.ObserveHumanInput(c.ID, []byte("please summarize"))
+	m.ObserveOutput(c.ID)
+	m.mu.Lock()
+	e := m.entries[c.ID]
+	m.mu.Unlock()
+	if e == nil || !e.armed || !e.dirty {
+		t.Fatalf("tracked top-level agent not armed/dirty: %+v", e)
+	}
+
+	sub := newChildEntry("a2", "sub", KindAgent, []string{"fake"}, nil, c.ID, "", "")
+	sub.status.Store(&running)
+	m.RegisterChild(sub)
+	m.ObserveHumanInput(sub.ID, []byte("please summarize"))
+	m.mu.Lock()
+	_, ok := m.entries[sub.ID]
+	m.mu.Unlock()
+	if ok {
+		t.Fatal("sub-agent should not get a summary entry")
+	}
+}
--- a/internal/app/tabbar.go
+++ b/internal/app/tabbar.go
@@ -4,12 +4,13 @@ import (
 	"fmt"
 	"os"
 	"strings"
+	"time"
 	"unicode/utf8"
 )

-// Two-row tab bar: labels row, underline row. The PTY viewport's top
+// Three-row tab bar: labels, active-thread summary, underline. The PTY viewport's top
 // row is therefore mainTop == tabBarRows + 1.
-const tabBarRows = 2
+const tabBarRows = 3

 // drawTabBar renders the top tab strip across the full host width.
 // Tabs share the available width with a flex layout — each visible
@@ -17,9 +18,17 @@ const tabBarRows = 2
 // to the leftmost tabs so the strip fills the screen edge-to-edge.
 // A trailing "+ new" hint sits in the rightmost reserved slot.
 func (st *uiState) drawTabBar() {
+	var entry time.Time
+	if st.metrics != nil {
+		entry = time.Now()
+	}
 	st.mu.Lock()
 	palOpen := st.palette != nil
-	focus := st.focusedID
+	// Highlight the top-level agent tab even when focus has stepped
+	// into a sub-agent (or a Processes pane entry). activeAgentID walks
+	// the parent chain to the root, so the user always sees which tab
+	// their current thread belongs to.
+	focus := st.activeAgentID
 	st.mu.Unlock()
 	if palOpen {
 		return
@@ -130,7 +139,8 @@ func (st *uiState) drawTabBar() {
 	}

 	var b strings.Builder
-	// Clear both rows so a stale label from the previous frame can't
+	// Clear all tab-bar rows so stale labels or summaries from the
+	// previous frame can't
 	// bleed through. Use ECH clamped to `width` (= childCols) instead of
 	// `\x1b[2K`: 2K wipes the entire line including the sidebar columns,
 	// and if drawSidebar's chrome cache is fresh it won't repaint to
@@ -138,6 +148,7 @@ func (st *uiState) drawTabBar() {
 	// and content should be.
 	fmt.Fprintf(&b, "\x1b[1;1H\x1b[%dX", width)
 	fmt.Fprintf(&b, "\x1b[2;1H\x1b[%dX", width)
+	fmt.Fprintf(&b, "\x1b[3;1H\x1b[%dX", width)

 	for _, t := range tabs {
 		// Row 1: centre-ish label inside the tab cell.
@@ -161,9 +172,9 @@ func (st *uiState) drawTabBar() {
 		b.WriteString(strings.Repeat(" ", rightPad))
 		b.WriteString(styleReset)

-		// Row 2: underline. Thick accent for the active tab, faint
+		// Row 3: underline. Thick accent for the active tab, faint
 		// border for the rest.
-		fmt.Fprintf(&b, "\x1b[2;%dH", t.startCol)
+		fmt.Fprintf(&b, "\x1b[3;%dH", t.startCol)
 		if t.active {
 			b.WriteString(styleAccent)
 			b.WriteString(strings.Repeat("━", t.width))
@@ -180,18 +191,28 @@ func (st *uiState) drawTabBar() {
 		fmt.Fprintf(&b, "\x1b[1;%dH %s%s%s ", hintCol, styleDim, newHint, styleReset)
 		// Underline continues faintly under the hint so the strip
 		// reads as one bar.
-		fmt.Fprintf(&b, "\x1b[2;%dH%s%s%s",
+		fmt.Fprintf(&b, "\x1b[3;%dH%s%s%s",
 			hintCol, styleBorder, strings.Repeat("─", newHintW), styleReset)
 	}

+	if summary := st.activeSummaryText(width - 2); summary != "" {
+		fmt.Fprintf(&b, "\x1b[2;1H %s%s%s", styleDim, summary, styleReset)
+	}
+
 	frame := b.String()
 	st.chromeCacheMu.Lock()
 	if frame == st.tabBarCache {
 		st.chromeCacheMu.Unlock()
+		if st.metrics != nil {
+			st.metrics.recordTabbar(time.Since(entry), true)
+		}
 		return
 	}
 	st.tabBarCache = frame
 	st.chromeCacheMu.Unlock()
+	if st.metrics != nil {
+		defer func() { st.metrics.recordTabbar(time.Since(entry), false) }()
+	}

 	st.outMu.Lock()
 	defer st.outMu.Unlock()
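
The three per-row clears in the tabbar.go hunks above follow one pattern: move to column 1 of the row, then erase `width` cells with ECH so the sidebar columns are left alone. A minimal sketch of that pattern as a loop, assuming a hypothetical helper name `clearTabBarRows` that is not part of the patch:

```go
package main

import (
	"fmt"
	"strings"
)

// tabBarRows mirrors the constant changed in the diff (labels, summary, underline).
const tabBarRows = 3

// clearTabBarRows is a hypothetical helper, not part of the patch. For each
// tab-bar row it emits CUP to column 1 followed by ECH for `width` cells,
// which erases only the tab strip and never touches columns past `width`.
func clearTabBarRows(width int) string {
	var b strings.Builder
	for row := 1; row <= tabBarRows; row++ {
		fmt.Fprintf(&b, "\x1b[%d;1H\x1b[%dX", row, width)
	}
	return b.String()
}

func main() {
	// Print the escape sequences in quoted form so they stay readable.
	fmt.Printf("%q\n", clearTabBarRows(91))
}
```

A loop keyed on `tabBarRows` would keep the clear in step with the constant if the row count changes again; the patch itself spells the rows out explicitly.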
--- a/internal/app/tree.go
+++ b/internal/app/tree.go
@@ -96,17 +96,24 @@ func firstRunningAgentID(children []*Child) string {
 }

 // processList returns every top-level command/terminal entry in spawn
-// order, regardless of running state. The Processes sidebar section
-// keeps showing exited entries so the user can see what just died (and
-// because Session retains KindCommand entries for restart).
+// order. Exited KindCommand entries remain visible so the user can see
+// what just died and reach restart_process; exited KindTerminal entries
+// are filtered out because terminals are ephemeral and have no restart
+// path (Session also drops them in reapChild — this filter is defensive
+// for any window between exit and deletion).
 func processList(children []*Child) []*Child {
 	out := make([]*Child, 0, len(children))
 	for _, c := range children {
 		if c.ParentID != "" {
 			continue
 		}
-		if c.Kind == KindCommand || c.Kind == KindTerminal {
+		switch c.Kind {
+		case KindCommand:
 			out = append(out, c)
+		case KindTerminal:
+			if c.Status() == StatusRunning {
+				out = append(out, c)
+			}
 		}
 	}
 	return out
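
The filtering rule in the tree.go hunk above can be exercised in isolation. The sketch below is an assumption-laden stand-in: `Child` here is a simplified struct with a plain `State` field, whereas the real type stores status atomically and carries more fields.

```go
package main

import "fmt"

type Kind int

const (
	KindAgent Kind = iota
	KindCommand
	KindTerminal
)

type Status int

const (
	StatusRunning Status = iota
	StatusExited
)

// Child is a simplified stand-in for the real type (assumption).
type Child struct {
	ID       string
	ParentID string
	Kind     Kind
	State    Status
}

func (c *Child) Status() Status { return c.State }

// processList keeps every top-level command, but only running terminals,
// matching the switch introduced in the diff.
func processList(children []*Child) []*Child {
	out := make([]*Child, 0, len(children))
	for _, c := range children {
		if c.ParentID != "" {
			continue // sub-entries never appear in the Processes section
		}
		switch c.Kind {
		case KindCommand:
			out = append(out, c)
		case KindTerminal:
			if c.Status() == StatusRunning {
				out = append(out, c)
			}
		}
	}
	return out
}

func main() {
	children := []*Child{
		{ID: "cmd-exited", Kind: KindCommand, State: StatusExited},
		{ID: "term-exited", Kind: KindTerminal, State: StatusExited},
		{ID: "term-running", Kind: KindTerminal, State: StatusRunning},
		{ID: "nested-cmd", ParentID: "a1", Kind: KindCommand},
		{ID: "agent", Kind: KindAgent},
	}
	for _, c := range processList(children) {
		fmt.Println(c.ID)
	}
}
```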
--- a/internal/app/viewport_renderer.go
+++ b/internal/app/viewport_renderer.go
@@ -33,6 +33,14 @@ type viewportRenderer struct {
 	// cache so the next drawSidebar repaints over the clobber.
 	scrolled bool

+	// childOnAlt tracks whether the focused child has entered its
+	// alternate screen (via ?47 / ?1047 / ?1049). Used to gate mouse-
+	// tracking-mode forwarding to the host: filter on primary so
+	// patterm's wheel-scrollback stays armed, forward on alt so codex
+	// (which disables mouse) lets the user select text and vim (which
+	// enables it) still gets mouse events.
+	childOnAlt bool
+
 	// skipUTF8 is set when the current multi-byte UTF-8 character started
 	// past the viewport's right edge. The starter byte was dropped, so
 	// the remaining continuation bytes must be dropped too instead of
@@ -65,6 +73,16 @@ func newViewportRenderer(l terminalLayout) *viewportRenderer {
 	return vr
 }

+// SetChildOnAlt seeds the renderer's view of the focused child's screen
+// side. Used when a new renderer is constructed for an already-running
+// child whose alt-screen transition we missed, so subsequent mouse-mode
+// toggles are filtered/forwarded according to the right side.
+func (vr *viewportRenderer) SetChildOnAlt(onAlt bool) {
+	vr.mu.Lock()
+	defer vr.mu.Unlock()
+	vr.childOnAlt = onAlt
+}
+
 func (vr *viewportRenderer) SetLayout(l terminalLayout) {
 	vr.mu.Lock()
 	defer vr.mu.Unlock()
@@ -236,15 +254,36 @@ func (vr *viewportRenderer) emitCSI() {
 			return
 		}
 		if isAltScreenMode(params) {
+			// Track the child's screen side so we know whether to filter
+			// or forward subsequent mouse-mode toggles. Entering alt
+			// disables host mouse reporting by default so codex (and
+			// any other alt-screen TUI that doesn't request mouse)
+			// allows the user to click-drag to select text. Alt-screen
+			// TUIs that want mouse (vim, less with -X) re-enable it
+			// via ?1000h after switching to alt — the forwarder below
+			// passes that through. Leaving alt re-arms host mouse for
+			// primary-screen wheel-scrollback.
+			wasAlt := vr.childOnAlt
+			vr.childOnAlt = final == 'h'
+			if !wasAlt && vr.childOnAlt {
+				vr.pending.WriteString("\x1b[?1000l\x1b[?1006l")
+			}
+			if wasAlt && !vr.childOnAlt {
+				vr.pending.WriteString("\x1b[?1000h\x1b[?1006h")
+			}
 			return
 		}
 		if isMouseTrackingMode(params) {
-			// Patterm owns mouse reporting on the host so wheel events keep
-			// flowing for scroll-viewport. The child's own emulator still
-			// observes the mode set/reset (it processes the same bytes we
-			// hand to ghostty_terminal_vt_write), so we know whether the
-			// child wants mouse input — we just don't let it disarm our
-			// host listener.
+			// On the child's primary screen patterm owns mouse reporting so
+			// wheel events keep flowing for in-pane scrollback — drop the
+			// child's toggle. On the alt screen the child should be free
+			// to enable mouse (vim, less) or disable it (codex); we forward
+			// the toggle to the host so click-and-drag selection works for
+			// alt-screen TUIs that don't want mouse, and mouse-aware ones
+			// still see the events they need.
+			if vr.childOnAlt {
+				vr.pending.Write(vr.buf)
+			}
 			return
 		}
 	}
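
The alt-screen gating in the viewport_renderer.go hunks above amounts to a two-state machine. The sketch below models only that decision logic; `mouseGate` and its method names are hypothetical, and the real renderer does this inside `emitCSI` while holding its own lock and buffers.

```go
package main

import "fmt"

const (
	hostMouseOff = "\x1b[?1000l\x1b[?1006l" // written to the host on entering alt
	hostMouseOn  = "\x1b[?1000h\x1b[?1006h" // written to the host on leaving alt
)

// mouseGate is a hypothetical stand-in for the childOnAlt bookkeeping.
type mouseGate struct {
	onAlt bool
}

// altScreen records an alt-screen set (enter=true) or reset (enter=false)
// and returns the bytes to send to the host. The toggle itself is never
// forwarded; only the mouse-mode re-sync is emitted on a real transition.
func (g *mouseGate) altScreen(enter bool) string {
	was := g.onAlt
	g.onAlt = enter
	switch {
	case !was && enter:
		return hostMouseOff
	case was && !enter:
		return hostMouseOn
	}
	return ""
}

// mouseToggle decides what happens to a child's mouse-tracking toggle:
// dropped on the primary screen, forwarded unchanged on the alt screen.
func (g *mouseGate) mouseToggle(seq string) string {
	if g.onAlt {
		return seq
	}
	return ""
}

func main() {
	g := &mouseGate{}
	fmt.Printf("%q\n", g.mouseToggle("\x1b[?1000l")) // primary screen
	fmt.Printf("%q\n", g.altScreen(true))
	fmt.Printf("%q\n", g.mouseToggle("\x1b[?1000h")) // alt screen
	fmt.Printf("%q\n", g.altScreen(false))
}
```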
--- a/internal/app/viewport_renderer_test.go
+++ b/internal/app/viewport_renderer_test.go
@@ -16,7 +16,7 @@ func bytesRepeat(b byte, n int) []byte {
 func TestViewportRendererShiftsCursor(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
 	got := string(vr.Render([]byte("\x1b[H")))
-	if got != "\x1b[3;1H" {
+	if got != "\x1b[4;1H" {
 		t.Fatalf("CUP home: got %q", got)
 	}
 }
@@ -24,8 +24,36 @@ func TestViewportRendererShiftsCursor(t *testing.T) {
 func TestViewportRendererSwallowsAltScreenToggles(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
 	got := string(vr.Render([]byte("a\x1b[?1049hb\x1b[?1049lc")))
+	// The ?1049h/l toggles themselves must not reach the host (patterm
+	// owns its own alt screen). On the transition we re-sync host mouse
+	// reporting so codex (which doesn't request mouse) lets the user
+	// drag-select; leaving alt re-arms it for primary-screen wheel
+	// scrollback.
+	want := "a\x1b[?1000l\x1b[?1006lb\x1b[?1000h\x1b[?1006hc"
+	if got != want {
+		t.Fatalf("alt-screen toggles: got %q want %q", got, want)
+	}
+}
+
+func TestViewportRendererMouseTrackingFilteredOnPrimary(t *testing.T) {
+	vr := newViewportRenderer(newTerminalLayout(120, 40))
+	got := string(vr.Render([]byte("a\x1b[?1000lb\x1b[?1000hc")))
 	if got != "abc" {
-		t.Fatalf("alt-screen toggles: got %q", got)
+		t.Fatalf("mouse mode on primary should be filtered: got %q", got)
+	}
+}
+
+func TestViewportRendererMouseTrackingForwardedOnAlt(t *testing.T) {
+	vr := newViewportRenderer(newTerminalLayout(120, 40))
+	// Enter alt; subsequent mouse-mode toggles should reach the host so
+	// alt-screen TUIs (vim, less) can run with mouse on, and selection-
+	// using ones (codex) stay with mouse off.
+	got := string(vr.Render([]byte("\x1b[?1049h\x1b[?1000lx\x1b[?1000hy")))
+	if !strings.Contains(got, "\x1b[?1000l") {
+		t.Fatalf("alt-screen mouse disable should reach host: %q", got)
+	}
+	if !strings.Contains(got, "\x1b[?1000h") {
+		t.Fatalf("alt-screen mouse enable should reach host: %q", got)
 	}
 }

@@ -38,7 +66,7 @@ func TestViewportRendererSwallowsOriginModeToggles(t *testing.T) {
 	if !strings.Contains(got, "a") || !strings.Contains(got, "b") || !strings.Contains(got, "c") {
 		t.Fatalf("origin-mode toggles should not drop surrounding text: got %q", got)
 	}
-	if strings.Count(got, "\x1b[3;1H") != 2 {
+	if strings.Count(got, "\x1b[4;1H") != 2 {
 		t.Fatalf("origin-mode set/reset should home inside the viewport twice: got %q", got)
 	}
 }
@@ -60,23 +88,23 @@ func TestViewportRendererOriginModeCUPUsesScrollTop(t *testing.T) {
 	if strings.Contains(got, "\x1b[?6h") {
 		t.Fatalf("origin-mode set leaked to host: %q", got)
 	}
-	if !strings.Contains(got, "\x1b[7;1H") {
-		t.Fatalf("CUP row 1 in origin mode should land at scrollTop row 5 shifted to host row 7: got %q", got)
+	if !strings.Contains(got, "\x1b[8;1H") {
+		t.Fatalf("CUP row 1 in origin mode should land at scrollTop row 5 shifted to host row 8: got %q", got)
 	}
 }

 func TestViewportRendererClearScreenIsViewportOnly(t *testing.T) {
-	// hostRows=7 leaves four viewport rows after the 2-row tab bar and
+	// hostRows=7 leaves three viewport rows after the 3-row tab bar and
 	// 1-row status reservation.
 	vr := newViewportRenderer(newTerminalLayout(20, 7))
 	got := string(vr.Render([]byte("\x1b[2J")))
 	if strings.Contains(got, "\x1b[2J") {
 		t.Fatalf("host clear-screen leaked through: %q", got)
 	}
-	if strings.Count(got, "\x1b[20X") != 4 {
+	if strings.Count(got, "\x1b[20X") != 3 {
 		t.Fatalf("clear rows: got %q", got)
 	}
-	if !strings.Contains(got, "\x1b[3;1H") || !strings.Contains(got, "\x1b[6;1H") {
+	if !strings.Contains(got, "\x1b[4;1H") || !strings.Contains(got, "\x1b[6;1H") {
 		t.Fatalf("clear did not target viewport rows: %q", got)
 	}
 }
@@ -112,13 +140,12 @@ func TestViewportRendererClearToEndIsViewportOnly(t *testing.T) {
 		t.Fatalf("host clear-to-end leaked through: %q", got)
 	}
 	// childCols == 19 (40 cols - 28 sidebar - 1 gap - 0-index fudge).
-	// Each of the 4 viewport rows should get a 19-cell erase.
 	// childCols == 11 with hostCols=40 (28 sidebar + 1 gap reserved).
-	// 4 viewport rows, but the cursor row uses ECH at cursor (col 1),
-	// so we expect 4 erases of 11 cells each.
+	// 3 viewport rows, but the cursor row uses ECH at cursor (col 1),
+	// so we expect 3 erases of 11 cells each.
 	count := strings.Count(got, "\x1b[11X")
-	if count != 4 {
-		t.Fatalf("expected 4 ECH-11 sequences, got %d in %q", count, got)
+	if count != 3 {
+		t.Fatalf("expected 3 ECH-11 sequences, got %d in %q", count, got)
 	}
 }

@@ -154,7 +181,7 @@ func TestViewportRendererClampsCUPColumn(t *testing.T) {
 	// column so the host cursor never lands in the sidebar.
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
 	got := string(vr.Render([]byte("\x1b[5;95H")))
-	if !strings.Contains(got, "\x1b[7;91H") {
+	if !strings.Contains(got, "\x1b[8;91H") {
 		t.Fatalf("CUP col 95 should clamp to 91 (childCols): got %q", got)
 	}
 }
@@ -249,7 +276,7 @@ func TestViewportRendererFlagsScrollVerbs(t *testing.T) {

 func TestViewportRendererFlagsLineFeedAtViewportBottomAsScrolling(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
-	_ = vr.Render([]byte("\x1b[37;1H\n"))
+	_ = vr.Render([]byte("\x1b[36;1H\n"))
 	if !vr.TookScrollAction() {
 		t.Fatalf("LF at viewport bottom should flag scroll")
 	}
@@ -257,7 +284,7 @@ func TestViewportRendererFlagsLineFeedAtViewportBottomAsScrolling(t *testing.T)

 func TestViewportRendererDoesNotFlagLineFeedBeforeViewportBottom(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
-	_ = vr.Render([]byte("\x1b[36;1H\n"))
+	_ = vr.Render([]byte("\x1b[35;1H\n"))
 	if vr.TookScrollAction() {
 		t.Fatalf("LF before viewport bottom should not flag scroll")
 	}
@@ -284,7 +311,7 @@ func TestViewportRendererClampsCUUAtViewportTop(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
 	// CUP to viewport row 1 then CUU by 50.
 	got := string(vr.Render([]byte("\x1b[1;1H\x1b[50ACLOBBER")))
-	if !strings.Contains(got, "\x1b[3;1H") {
+	if !strings.Contains(got, "\x1b[4;1H") {
 		t.Fatalf("expected CUP shifted to mainTop: got %q", got)
 	}
 	// The CUU should have been swallowed (n clamped to 0 from row 1).
@@ -311,10 +338,10 @@ func TestViewportRendererClampsCUUPartial(t *testing.T) {
 }

 func TestViewportRendererClampsCUDAtViewportBottom(t *testing.T) {
-	// childRows=37 for layout(120, 40). Park cursor at row 37, ask for
+	// childRows=36 for layout(120, 40). Park cursor at row 36, ask for
 	// 10 down → safe step is 0.
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
-	got := string(vr.Render([]byte("\x1b[37;1H\x1b[10B")))
+	got := string(vr.Render([]byte("\x1b[36;1H\x1b[10B")))
 	if strings.Contains(got, "\x1b[10B") {
 		t.Fatalf("CUD past viewport bottom should be dropped: got %q", got)
 	}
@@ -335,10 +362,10 @@ func TestViewportRendererClampsCPLAndHomesColumn(t *testing.T) {

 func TestViewportRendererClampsCNL(t *testing.T) {
 	vr := newViewportRenderer(newTerminalLayout(120, 40))
-	// CUP to row 35 then CNL by 50 → safe step is 2 (childRows-35).
-	got := string(vr.Render([]byte("\x1b[35;10H\x1b[50E")))
+	// CUP to row 34 then CNL by 50 → safe step is 2 (childRows-34).
+	got := string(vr.Render([]byte("\x1b[34;10H\x1b[50E")))
 	if !strings.Contains(got, "\x1b[2E") {
-		t.Fatalf("CNL 50 from row 35 should clamp to 2: got %q", got)
+		t.Fatalf("CNL 50 from row 34 should clamp to 2: got %q", got)
 	}
 }
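
Every expectation changed in viewport_renderer_test.go above follows from the tab bar growing from two rows to three. The arithmetic can be sketched as below; the function names and `statusRows` are assumptions inferred from the test comments ("3-row tab bar and 1-row status reservation"), not identifiers taken from the layout code.

```go
package main

import "fmt"

const (
	tabBarRows = 3 // labels, summary, underline (from the diff)
	statusRows = 1 // assumed name for the 1-row status reservation
)

// mainTop is the first host row of the PTY viewport.
func mainTop() int { return tabBarRows + 1 }

// childRows is the number of rows left for the child after chrome.
func childRows(hostRows int) int { return hostRows - tabBarRows - statusRows }

// hostRow shifts a 1-based child row down past the tab bar.
func hostRow(childRow int) int { return childRow + tabBarRows }

func main() {
	fmt.Println(mainTop())     // CUP home lands here
	fmt.Println(childRows(40)) // layout(120, 40)
	fmt.Println(childRows(7))  // layout(20, 7)
	fmt.Println(hostRow(5))    // child CUP row 5
}
```

Under these assumptions a 40-row host gives a 36-row child, which is why the scroll-bottom tests now park the cursor at row 36 rather than 37.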

--- a/internal/harness/scenarios/error_flash_preserves_focused_pane.json
+++ b/internal/harness/scenarios/error_flash_preserves_focused_pane.json
@@ -0,0 +1,37 @@
+{
+  "name": "error_flash_preserves_focused_pane",
+  "presets": {
+    "processes": [
+      {
+        "name": "steady",
+        "argv": ["sh", "-lc", "printf 'STEADY READY\\n'; sleep 5"]
+      }
+    ]
+  },
+  "trust": ["steady"],
+  "steps": [
+    {
+      "type": "mcp_call",
+      "method": "spawn_process",
+      "params": {"kind": "command", "preset": "steady", "name": "steady"},
+      "save_as": "proc"
+    },
+    { "type": "wait_text", "contains": "STEADY READY", "timeout_ms": 5000 },
+    { "type": "send_chord", "chord": "ctrl-k" },
+    { "type": "send_text", "text": "Open Settings" },
+    { "type": "send_chord", "chord": "enter" },
+    { "type": "send_chord", "chord": "enter" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "ctrl-n" },
+    { "type": "send_chord", "chord": "enter" },
+    { "type": "wait_text", "contains": "no active top-level agent to summarize", "timeout_ms": 5000 },
+    { "type": "wait_text", "contains": "STEADY READY", "timeout_ms": 5000 },
+    { "type": "assert_contains", "contains": "STEADY READY" },
+    { "type": "assert_not_contains", "contains": "Press Ctrl-K to spawn an agent or process" }
+  ]
+}
--- a/internal/harness/scenarios/rename_process_via_palette.json
+++ b/internal/harness/scenarios/rename_process_via_palette.json
@@ -16,7 +16,7 @@
    { "type": "send_chord", "chord": "ctrl-k" },
    { "type": "send_text", "text": "Rename process" },
    { "type": "send_chord", "chord": "enter" },
-    { "type": "wait_text", "contains": "Rename process", "timeout_ms": 3000 },
+    { "type": "wait_text", "contains": "process: original", "timeout_ms": 3000 },
    { "type": "send_chord", "chord": "ctrl-u" },
    { "type": "send_text", "text": "renamed-pane" },
    { "type": "send_chord", "chord": "enter" },
--- a/internal/harness/scenarios/sidebar_survives_linefeed_scroll.json
+++ b/internal/harness/scenarios/sidebar_survives_linefeed_scroll.json
@@ -5,7 +5,7 @@
  "scripts": [
    {
      "name": "linefeed-scroll",
-      "body": "#!/bin/sh\n# Plain LF at the bottom of the child viewport scrolls the host's\n# DECSTBM region. Because that region spans every column, enough LFs\n# drag the sidebar border and section labels out of the visible region\n# unless patterm invalidates and repaints the sidebar cache.\ni=0\nwhile [ $i -lt 12 ]; do\n  printf 'warmup %02d\\n' \"$i\"\n  i=$((i + 1))\n  sleep 0.05\ndone\nprintf 'LINEFEED READY\\n'\nIFS= read -r _\nprintf '\\033[1;37r'\nprintf '\\033[37;1H'\ni=0\nwhile [ $i -lt 45 ]; do\n  printf 'scroll line %02d\\n' \"$i\"\n  i=$((i + 1))\ndone\nprintf 'LINEFEED DONE\\n'\nsleep 5\n"
+      "body": "#!/bin/sh\n# Plain LF at the bottom of the child viewport scrolls the host's\n# DECSTBM region. Because that region spans every column, enough LFs\n# drag the sidebar border and section labels out of the visible region\n# unless patterm invalidates and repaints the sidebar cache.\ni=0\nwhile [ $i -lt 12 ]; do\n  printf 'warmup %02d\\n' \"$i\"\n  i=$((i + 1))\n  sleep 0.05\ndone\nprintf 'LINEFEED READY\\n'\nIFS= read -r _\nprintf '\\033[1;36r'\nprintf '\\033[36;1H'\ni=0\nwhile [ $i -lt 45 ]; do\n  printf 'scroll line %02d\\n' \"$i\"\n  i=$((i + 1))\ndone\nprintf 'LINEFEED DONE\\n'\nsleep 5\n"
    }
  ],
  "steps": [
@@ -19,13 +19,13 @@
    { "type": "mark_raw", "save_as": "before_scroll" },
    { "type": "send_chord", "chord": "enter" },
    { "type": "wait_text", "contains": "LINEFEED DONE", "timeout_ms": 5000 },
+    { "type": "wait_stable", "timeout_ms": 2000 },
    {
      "type": "assert_raw_since_regex",
      "from": "before_scroll",
-      "regex": "Agent Tree",
+      "regex": "LINEFEED DONE",
      "timeout_ms": 2000
    },
-    { "type": "wait_stable", "timeout_ms": 2000 },
    { "type": "assert_contains", "contains": "Processes" },
    { "type": "assert_contains", "contains": "Agent Tree" },
    { "type": "assert_contains", "contains": "Scratchpads" },
--- a/internal/mcp/protocol.go
+++ b/internal/mcp/protocol.go
@@ -27,6 +27,24 @@ var serverInfo = map[string]any{
 	"version": "0.1.0",
 }

+// serverInstructions is returned in the MCP `initialize` response. MCP
+// clients show this to the underlying LLM as context for how to use
+// the server. Failure modes we've seen and want to head off:
+//   - The agent assumes patterm is something it has to launch (running
+//     `patterm` or `patterm mcp-stdio` from its own shell). It's
+//     already attached — it just calls the tools.
+//   - The agent reaches for shell tools (perl / nc / socat / curl) to
+//     poke patterm's Unix socket directly. That socket connection
+//     carries no caller identity, so any sub-agent the agent spawns
+//     that way ends up as a stray top-level tab instead of a child
+//     under the spawning agent. Always go through the MCP tools.
+//   - The agent shells out to `claude` / `codex` / `opencode` to start
+//     a peer instead of calling `spawn_agent`. Those peers won't show
+//     up as sub-agents and won't be tied into the patterm lifecycle.
+//
+// Keep this short — clients vary in how much they surface to the LLM.
+const serverInstructions = "You are already running INSIDE patterm; the `patterm` MCP server is connected over the same stdio MCP transport you use for any other MCP server. Use the MCP tools you see in tools/list — do NOT (a) try to launch `patterm` or `patterm mcp-stdio` yourself, (b) poke the Unix socket through perl / nc / socat / curl, or (c) shell out to `claude` / `codex` / `opencode` to start a peer. Any of those bypasses caller-identity and the new agent will land as a stray top-level tab instead of a child under you. Start with `whoami` for your role and the full tool list, then `help('topics')` for orientation. `spawn_agent` is the only correct way to start a sub-agent; `spawn_process` is for non-LLM commands; `list_processes` / `get_process_output` inspect them; `send_input` / `send_message` drive them. Whatever you spawn is yours to `close_process` when done. When you `send_message` a sub-agent, its reply comes back into YOUR pane as `[sub-agent:<name>] …`, not into the sub-agent's output — to wait for it, use `timer_fire_when_idle_any([sub_agent])` and then read your own pane; do NOT `wait_for_pattern` on the sub-agent, that will deadlock until timeout."
+
 // toolDescriptor is the shape returned by `tools/list`. inputSchema is
 // a JSON Schema object — we provide a minimal `{type: "object"}` schema
 // for each tool, which lets MCP clients accept arbitrary arguments and
@@ -88,7 +106,7 @@ func toolCatalog() []toolDescriptor {
 	return []toolDescriptor{
 		{
 			Name:        "spawn_agent",
-			Description: "Spawn a sub-agent from an agent preset and optionally seed it with initial instructions. Caller owns lifecycle: when the sub-agent's work is done (it reports back via send_message, or you no longer need it), call close_process on its process_id to free the pane and tear down the PTY. See help('lifecycle').",
+			Description: "Spawn a sub-agent from an agent preset and optionally seed it with initial instructions. This is the ONLY correct way to start a sub-agent under you — do not shell out to `claude` / `codex` / `opencode` and do not poke patterm's Unix socket via perl / nc / socat. Either bypasses caller identity and the new agent lands as a stray top-level tab instead of your child. Caller owns lifecycle: when the sub-agent's work is done (it reports back via send_message, or you no longer need it), call close_process on its process_id to free the pane and tear down the PTY. See help('spawning') and help('lifecycle').",
 			InputSchema: objectSchema(map[string]any{
 				"agent":              stringProp("Preset name (e.g. \"claude\", \"codex\")."),
 				"agent_instructions": stringProp("Initial prompt typed into the agent after it's ready."),
@@ -201,7 +219,7 @@ func toolCatalog() []toolDescriptor {
 		},
 		{
 			Name:        "wait_for_pattern",
-			Description: "Block until pattern appears in process output or timeout elapses.",
+			Description: "Block until pattern appears in the TARGET process's own output, or timeout elapses. Use this for waiting on text the target itself will emit (a shell prompt, a build's \"tests passed\" line, etc.). Anti-pattern: do NOT use this to wait for a sub-agent's reply to send_message — replies are routed into the CALLER's pane tagged `[sub-agent:<name>]`, not into the sub-agent's output, so this call will spin to timeout. For sub-agent coordination use `timer_fire_when_idle_any` and then read your own pane.",
 			InputSchema: objectSchema(map[string]any{
 				"process_id":      stringProp("Target process id."),
 				"pattern":         stringProp("Regex pattern."),
@@ -231,7 +249,7 @@ func toolCatalog() []toolDescriptor {
 		},
 		{
 			Name:        "send_message",
-			Description: "Deliver a text message to another process as orchestrator-owned input.",
+			Description: "Deliver a text message to another process as orchestrator-owned input. Fire-and-forget: returns immediately, without waiting for the recipient to read or act. If the recipient replies via send_message, that reply arrives in YOUR pane tagged `[sub-agent:<name>]` (child→parent) or `[orchestrator]` (parent→child) — NOT in the recipient's output. To wait for a sub-agent's reply, schedule `timer_fire_when_idle_any([sub_agent_id], body=…)` and then read your own pane when the timer fires. Do not `wait_for_pattern` on the recipient for a reply; it will deadlock.",
 			InputSchema: objectSchema(map[string]any{
 				"target_process_id": stringProp("Recipient process id."),
 				"message":           stringProp("Message body."),
@@ -265,7 +283,7 @@ func toolCatalog() []toolDescriptor {
 		},
 		{
 			Name:        "timer_fire_when_idle_any",
-			Description: "Schedule a timer that fires when any watched process enters idle (already-idle entries excluded), or when max_wait_seconds elapses.",
+			Description: "Canonical way to wait for a sub-agent to finish working: send_message the sub-agent, then schedule this with watched=[sub_agent_id]; when it fires, the reply is already sitting in your own pane tagged `[sub-agent:<name>]`. Schedules a timer that fires when any watched process enters idle (already-idle entries excluded), or when max_wait_seconds elapses.",
 			InputSchema: objectSchema(map[string]any{
 				"watched":          arrayOfStringsProp("Process ids to watch."),
 				"body":             stringProp("Message delivered verbatim to the owning agent when the timer fires."),
@@ -276,7 +294,7 @@ func toolCatalog() []toolDescriptor {
 		},
 		{
 			Name:        "timer_fire_when_idle_all",
-			Description: "Schedule a timer that fires when all watched processes are idle (already-idle entries count as satisfied), or when max_wait_seconds elapses.",
+			Description: "Canonical way to wait for several sub-agents to finish working in parallel: send_message each one, then schedule this with watched=[…ids]; when it fires, each reply is in your own pane tagged `[sub-agent:<name>]`. Schedules a timer that fires when all watched processes are idle (already-idle entries count as satisfied), or when max_wait_seconds elapses.",
 			InputSchema: objectSchema(map[string]any{
 				"watched":          arrayOfStringsProp("Process ids to watch."),
 				"body":             stringProp("Message delivered verbatim to the owning agent when the timer fires."),
@@ -377,7 +395,8 @@ func (s *Server) handleProtocolMethod(callerID, method string, params json.RawMe
 			"capabilities": map[string]any{
 				"tools": map[string]any{"listChanged": false},
 			},
-			"serverInfo": serverInfo,
+			"serverInfo":   serverInfo,
+			"instructions": serverInstructions,
 		}
 		return result, true, 0, "", nil

--- a/internal/mcp/protocol_test.go
+++ b/internal/mcp/protocol_test.go
@@ -36,6 +36,13 @@ func TestInitializeReturnsCapabilities(t *testing.T) {
 	if caps["tools"] == nil {
 		t.Fatalf("tools capability missing: %+v", caps)
 	}
+	// patterm-specific orientation: clients show this to the underlying
+	// LLM, so it's our primary hook for steering vendor TUIs (codex in
+	// particular) toward the MCP tool surface instead of shell-ing out.
+	instructions, ok := parsed.Result["instructions"].(string)
+	if !ok || instructions == "" {
+		t.Fatalf("instructions missing or wrong type: %+v", parsed.Result)
+	}
 }

 func TestInitializedNotificationSuppressesResponse(t *testing.T) {
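
The `instructions` check added to the test above can be reproduced outside the server. This is a sketch only: `initializeResult`, `marshalInitialize`, and `instructionsFrom` are hypothetical helpers, the instruction text is abbreviated, and only the result fields visible in the diff are included.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Abbreviated placeholder; the real constant in protocol.go is much longer.
const serverInstructions = "You are already running INSIDE patterm; use the MCP tools you see in tools/list."

// initializeResult builds the subset of the initialize result shown in the diff.
func initializeResult() map[string]any {
	return map[string]any{
		"capabilities": map[string]any{
			"tools": map[string]any{"listChanged": false},
		},
		"serverInfo":   map[string]any{"version": "0.1.0"},
		"instructions": serverInstructions,
	}
}

// marshalInitialize encodes the result the way a JSON-RPC layer would.
func marshalInitialize() []byte {
	raw, err := json.Marshal(initializeResult())
	if err != nil {
		panic(err)
	}
	return raw
}

// instructionsFrom decodes a result and applies the same type assertion
// the test uses: the field must be present and must be a string.
func instructionsFrom(raw []byte) (string, bool) {
	var parsed map[string]any
	if err := json.Unmarshal(raw, &parsed); err != nil {
		return "", false
	}
	s, ok := parsed["instructions"].(string)
	return s, ok
}

func main() {
	s, ok := instructionsFrom(marshalInitialize())
	fmt.Println(ok, len(s) > 0)
}
```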
--- a/internal/preset/preset.go
+++ b/internal/preset/preset.go
@@ -300,14 +300,6 @@ func ensureDefaults(base string) error {
    "^\\s*>_"
  ]
 }
-`,
-		},
-		{
-			"presets/processes/shell.json",
-			`{
-  "name": "shell",
-  "argv": ["__SHELL__"]
-}
 `,
 		},
 	}
@@ -319,15 +311,7 @@ func ensureDefaults(base string) error {
 		if err := os.MkdirAll(filepath.Dir(full), 0o700); err != nil {
 			return err
 		}
-		body := d.body
-		if strings.Contains(body, "__SHELL__") {
-			shell := os.Getenv("SHELL")
-			if shell == "" {
-				shell = "/bin/sh"
-			}
-			body = strings.ReplaceAll(body, "__SHELL__", shell)
-		}
-		if err := os.WriteFile(full, []byte(body), 0o600); err != nil {
+		if err := os.WriteFile(full, []byte(d.body), 0o600); err != nil {
 			return err
 		}
 	}
Author	SHA1	Message	Date
Harry Bayliss	c1ecba0624	Use mise to install zig + go in release CI; cut 0.0.4. [CI: all checks successful; release / build-linux-amd64 (push) succeeded in 13m7s] `mlugg/setup-zig` was chasing mirrors for ~4 minutes on every run (see v0.0.1 / v0.0.2 logs) and `actions/setup-go` was spending another ~4 minutes downloading Go before patterm started building. mise already manages the project's zig pin; adding `go = "1.26.3"` to `.mise.toml` (matching go.mod) lets `jdx/mise-action@v2` install both with one cached step. Subsequent runs reuse the mise cache instead of re-resolving mirror URLs and re-downloading toolchains. Also adds an `actions/cache@v4` step for `~/.cache/go-build` and `~/go/pkg/mod` keyed on `go.sum` so `go build` itself doesn't re-pull modules on every tag push.	2026-05-15 19:38:13 +01:00
Harry Bayliss	878e9370bc	Fix error flashes replacing focused pane	2026-05-15 19:27:42 +01:00
Harry Bayliss	fd9c19e5c2	Fix release CI: upgrade mlugg/setup-zig to v2 and cut 0.0.3. [CI: some checks failed; release / build-linux-amd64 (push) was cancelled] `mlugg/setup-zig@v1` is deprecated and only knows the pre-0.14 tarball name (`zig-linux-x86_64-<ver>.tar.xz`), so every mirror, including the official ziglang.org/builds, returned 404 for Zig 0.15.2 on both the v0.0.1 and v0.0.2 release runs. v2 uses the new `zig-x86_64-linux-<ver>.tar.xz` layout that Zig switched to in 0.14+. Also rolls the existing CHANGELOG `[Unreleased]` work into a dated `[0.0.3]` section and adds the CI fix to its Fixed list.	2026-05-15 19:14:21 +01:00
Harry Bayliss	6d90cd7185	Match Solo summary cadence options	2026-05-15 19:13:54 +01:00
Harry Bayliss	d648d5b775	Add auto-summary settings	2026-05-15 19:09:21 +01:00
harry	1bf51bb784	Merge pull request 'Overhaul command palette UX' (#4) from feat/palette-ux-overhaul into main. Reviewed-on: #4	2026-05-15 18:25:38 +01:00
Harry Bayliss	81bc77366f	Overhaul command palette UX. Six-phase sweep: section headers (Focused / Open / Spawn / Quit) with header-skip cursor; chip strip mirroring sw/sp/k macros, driven by Tab; unified Spawn verbs across agent / process / terminal / custom; dropped duplicate global Close list in favor of Ctrl-X inline close on a Switch row plus the [Close] chip; scored matching (prefix > word-boundary > substring > fuzzy) with matched-char highlighting; title bar surfaces focus subject; rename forms split long subject onto its own row; new Alt-1..9 quick-pick, Home/End, ? help overlay, and Ctrl-R relaunch toggle inside the spawn-process form. Scroll indicator and cursor/total counter round out the footer.	2026-05-15 16:41:44 +01:00
Harry Bayliss	0c960fa859	Clarify sub-agent reply routing in MCP tool descriptions. A sub-agent's reply to send_message lands in the caller's own pane tagged [sub-agent:<name>], not in the sub-agent's output. The descriptions for wait_for_pattern, send_message, both timer_fire_when_idle_*, and the server-instructions preamble now spell this out, along with the canonical send_message → timer_fire_when_idle_any → read-own-pane pattern. help('readiness') and help('coordination') updated to match. Previously agents reached for wait_for_pattern on the sub-agent and deadlocked until timeout because the reply had already been delivered to their own pane.	2026-05-15 16:08:07 +01:00
Harry Bayliss	b05065a601	Sync TODO.md perf-audit review pass. Removed low/marginal items from the original sweep; remaining items have measured or workflow evidence to justify action.	2026-05-15 16:07:58 +01:00
Harry Bayliss	08187aed77	Don't steal focus when an agent spawns a child via MCP	2026-05-15 15:53:50 +01:00
Harry Bayliss	24c8183832	Auto-snap child viewport to bottom when typing into scrollback. Typing into a focused child while its emulator viewport was scrolled up sent the keystroke to the PTY but left the input box invisible below the visible region, so it looked like typing did nothing. processStdin's flushForward now sets pendingViewportBottom whenever bytes are actually injected, so the existing post-loop handler snaps the viewport and repaints. Wheel events and Ctrl-B paths are untouched: both are intercepted before reaching forward, so wheel still scrolls into history and Ctrl-B is still the explicit escape hatch. Only bytes that would actually reach the child PTY trigger the snap.	2026-05-15 15:34:00 +01:00
Harry Bayliss	b5dfaf39c4	Marquee long sidebar names; truncate with ellipsis otherwise. Sidebar rows that overflow the rail width used to spill characters into the main viewport. They now truncate with a trailing "…" when unfocused (or when the focused name still fits). The focused row whose name overflows runs a pause-scroll-pause marquee: 1 s hold on the head, ~150 ms per cell scroll, 1 s hold on the tail, snap back. The row's geometry never moves while it animates, so nothing below shifts. A dedicated 150 ms goroutine flips sidebarDirty only while a row is actively animating; the chrome ticker does the actual repaint. Idle is a single cheap wakeup. focus / spawn / exit / restart all reset the marquee state so the new focused row starts from frame zero. When the row's budget is tight, the trailing timer indicator drops before the name is truncated, since the name is the only identifier the row carries. clampVisible() is a defensive net inside write(): even if a row's decoration size were mis-computed, it will not spill past the sidebar band into the PTY area.	2026-05-15 15:33:39 +01:00
Harry Bayliss	1fb919c22a	Keep parent tab highlighted when focus is on a sub-agent. The top tab bar compared against focusedID, so stepping into a sub-agent dropped the parent tab's highlight even though the user was still inside that thread. activeAgentID already walks the parent chain to the top-level root for the sidebar's agent tree; reuse it for the tab strip too.	2026-05-15 15:26:06 +01:00
Harry Bayliss	4b4e7543e8	Release v0.0.2 Bundles the in-flight work into the second tagged release. See CHANGELOG.md `[0.0.2] - 2026-05-15` for the full per-change list. Highlights: - libghostty-vt was building in zig's silent Debug default, capping the full pipeline at 34-63 fps. Makefile now defaults to ReleaseFast (.mise.toml pins zig 0.15.2 so the build is reproducible). End-to-end pipeline now runs at 930-2030 fps, 27-32× faster, with 7-16× headroom over a 120 fps target. - --debug[=DIR] and --profile[=DIR] flags capture full PTY logs, pprof data, and per-hot-path metrics (chunks/sec, mean/max latencies, cache hit rates) for offline analysis. Nothing pollutes stdout/stderr. - ASCII-video benchmark suite (8-colour / truecolor / Bad-Apple patterns at 30/60/120 fps) plus a renderer microbenchmark set for stable A/B comparisons across changes. - Click-and-drag text selection from alt-screen TUIs (codex) now works: host mouse mode follows the focused child's screen side instead of being permanently armed. - Long claude session resume + codex steady-state rendering pay less per chunk: drawSidebar deferred to the chrome ticker, emulator.Title CGO poll gated on a containsOSC scan. - Vendor-TUI orientation: MCP initialize.instructions, the spawn_agent tool description, and help('spawning') all spell out the anti-patterns (shell-out, perl-into-socket) that produced codex's stray top-level tabs.	2026-05-15 14:22:59 +01:00
Harry Bayliss	bda799a3c6	mise-pin zig 0.15.2; rebuild libghostty-vt ReleaseFast: 27-32x pipeline speedup Added .mise.toml pinning zig = "0.15.2" (the minimum the vendored Ghostty commit requires) and taught the Makefile to resolve zig through mise when available, falling back to PATH. Contributors run `mise install` once and `make deps` just works. Re-ran the pipeline benchmarks after rebuilding libghostty-vt with ReleaseFast (same hardware, AMD Ryzen 7 7800X3D). Results, Debug -> ReleaseFast (speedup): Pipeline 8-colour @120fps: 63 fps -> 2030 fps (32x); Pipeline truecolor @120fps: 34 fps -> 931 fps (27x); Emulator-only truecolor: 34 fps -> 2051 fps (60x). 7-16x headroom over 120 fps for the heaviest workload (truecolor full-screen redraws). Static library size 33 MiB -> 13 MiB. TODO.md baseline numbers updated to reflect post-fix throughput; the "Debug-mode lib" finding is folded into the result it produced rather than left as an open item.	2026-05-15 13:54:48 +01:00
Harry Bayliss	2f109a84fa	Stress-test ASCII video at 30/60/120 fps; fix libghostty-vt Debug build Added a full ASCII-video benchmark suite that hammers the renderer with 30 KiB / 70 KiB full-screen frames at 30, 60, and 120 fps targets — both renderer-only and full-pipeline (em.Write + renderer + stdout). Each stream benchmark reports µs/frame, fps_ceiling, and percent of the per-frame budget consumed. The pipeline benchmarks revealed we were missing 120 fps by a wide margin (190%-350% of budget at 120fps, 60-90 fps ceiling). Isolating em.Write confirmed libghostty-vt is the bottleneck — 16-29 ms per truecolor frame, library file at 33 MiB. Root cause: the Makefile invoked `zig build` with no -Doptimize, and Zig's standardOptimizeOption defaults to Debug. So the shipped libghostty-vt was unoptimised. Fixed by pinning ReleaseFast in the Makefile (override via GHOSTTY_VT_OPTIMIZE for debug builds of the upstream lib). Existing checkouts need `make clean-deps && make deps` to pick up the rebuild.	2026-05-15 13:43:31 +01:00
Harry Bayliss	1c590f8e32	Concrete perf metrics: live counters in --profile + benchmark suite Live metrics (--profile): - New metricsTracker instruments OnPTYOut, viewport renderer, stdout writes, libghostty-vt Write/Title CGO calls, sidebar / tabbar / status draws (with cache-hit accounting), snapshot replays, and the chrome ticker (so we can see ticker fires that did nothing). - Writes metrics.jsonl (one snapshot per second) and metrics.json + summary.txt on exit, alongside the existing pprof files. - All record* methods are nil-safe so disabled paths pay only a cheap nil check; counters are atomic so the per-PTY-chunk hot path stays lock-free. Benchmark suite (go test -bench=.): - Three workload fixtures — plain ASCII, SGR-styled lines, and a ratatui-style cursor-shuffling burst — plus a containsOSC microbenchmark. Reports ns/op, MB/s, allocs/op, B/op. - Initial baseline numbers added to TODO under the perf-audit section, alongside two new findings (renderer allocs ~1 per 4 bytes on styled chunks; styled throughput tops out near 90 MB/s) those benchmarks surfaced.	2026-05-15 13:31:37 +01:00
Harry Bayliss	442eed605c	Add auto-generated perf audit findings to TODO Codebase sweep for perf issues outside the per-PTY-chunk path that recent CHANGELOG work already covered. Ten findings under a new "Perf Audit (auto-generated)" section in TODO.md — anchored to file:line, classified MEDIUM/LOW, with a sketched fix per entry. None landed as code changes; review pending.	2026-05-15 12:46:42 +01:00
Harry Bayliss	c120342709	Clear TODO backlog: --debug/--profile, codex selection, MCP orientation, perf - Add --debug[=DIR] / --profile[=DIR] flags that write run artefacts (patterm.log, events.jsonl, per-child raw PTY captures, CPU + heap + goroutine pprof) to a dir without polluting stdout/stderr. - Strengthen vendor-TUI orientation in three places (MCP initialize.instructions, the spawn_agent tool description, and help('spawning')) to head off codex's habits of poking the Unix socket via perl and shelling out to launch peers — both bypass caller identity and produce orphaned top-level tabs. - Fix click-and-drag text selection from alt-screen TUIs. Host SGR mouse reporting now follows the focused child's screen side instead of being permanently armed; alt-screen TUIs that need mouse re-enable it themselves and the toggle is forwarded. - Move drawSidebar() off the per-PTY-chunk hot path. Long claude session resume was paying a full sidebar rebuild for every scrolled chunk; the chrome ticker now drains a dirty flag at 60 Hz. - Gate the per-chunk Title() CGO poll on a containsOSC scan so codex/ratatui's many SGR-only chunks no longer pay a CGO call each.	2026-05-15 12:41:47 +01:00
Harry Bayliss	01fc108086	Rename Kill to Close, add New Terminal palette entry, clean up exited terminals - Palette's per-child "Kill <name>" action is now labelled "Close <name>" (action kind unchanged; still SIGTERM). Matches the existing "Close agent: …" context entry and reads as less violent for a graceful termination. - New "New Terminal" palette entry spawns a bare interactive $SHELL pane via LaunchTerminal (kind=terminal). Replaces the default "shell" process preset that was seeded on first run. - Exited KindTerminal entries are now dropped from the session in reapChild: terminals have no restart path, so leaving them behind as greyed rows in the Processes sidebar was just clutter. processList also filters defensively.	2026-05-15 11:30:46 +01:00
harry	24696305d6	Merge pull request 'Add idle-state classifier and Solo-parity timer tools' (#3) from feat/idle-detection into main	2026-05-15 11:21:41 +01:00