This is a spec for a terminal project I have. I think we could probably use libghostty for the terminal emulation as it seems the Go ecosystem is quite sparse.

# patterm — v1 Spec

*Working title: **patterm**. Used throughout this document.*

## 1. Overview

A terminal-based agent orchestration shell. The user opens patterm in a project directory (e.g. `~/Dev/foo`). patterm presents a multi-tab TUI where each top tab is a **session** — a long-running PTY launched from a user-defined **preset**. Presets come in two flavours: **agent presets** (e.g. claude, codex, opencode — vendor LLM CLIs with patterm's MCP wired up) and **process presets** (e.g. `bun run dev`, `vitest --watch` — raw commands with no MCP). Each session has a sidebar of **children**: more presets spawned by that session, again either agents or processes, all PTY-backed. The right rail also surfaces project-scoped **scratchpads** (markdown files) for human readability.

An MCP server, in-process, exposes tools that let orchestrator agents spawn and drive children, run processes, set timers, message peers, and read/write scratchpads. The orchestrator is a real LLM CLI driving another LLM CLI as if it were a human user — keystroke injection in, rendered-grid scraping out. The orchestrator fully owns the content of what it sends; patterm only handles the plumbing.

**Goal:** Let one SOTA agent orchestrate other agents of different types (claude → codex, codex → opencode, …) without subagent APIs, while keeping the whole thing steerable and observable by a human at any moment.

**Non-goal:** Hosting any LLM. patterm only manages CLIs the user already has installed. patterm also doesn't ship hard-coded knowledge of any specific vendor CLI — agent presets are user-editable JSON; the three common ones (claude, codex, opencode) ship as defaults.

---

## 2. Architecture and lifecycle

**Single foreground process.
No daemon, no detach.** The tool is one Go process that owns: the TUI, all PTYs, vt-emulated grids, session state, child state, scratchpad files, and an in-process MCP server. Killing the process kills everything inside it. There is no attach/detach, no project-keyed singleton, no socket-based reattachment.

**Lifecycle:**

1. User runs `patterm` in a project directory.
2. The process starts the TUI as a **blank canvas** — no sessions, no children, no scratchpad preview. Just the empty frame with the palette hint in the status line. The in-process MCP server initializes (bound to a per-PID unix socket for spawned children — see §10) and scratchpad metadata is loaded from disk, but nothing is rendered until the user opens a preset.
3. The user opens the palette (`Ctrl-K`), selects a preset, and the first session/process is launched. Subsequent sessions and children are spawned the same way (or by orchestrators via MCP).
4. On exit (Ctrl-D, `:quit`, terminal window close, SIGTERM, SIGHUP): the process sends SIGTERM to every child PTY with a short grace window, then SIGKILL, then exits. Scratchpads on disk are the only thing that survives.

**Multiple invocations:** Running `patterm` twice in the same project starts two independent processes. They share scratchpad files on disk but nothing else. If this turns out to be a footgun in practice, a per-project lockfile can be added later — out of scope for v1.

**Implications:** Closing the terminal window (or SSH dropping) ends the session and tears down every child. This is the deliberate trade — no orphan daemons, no socket discovery, no stale-state recovery, no multi-client coordination. The user's terminal window *is* the lifetime boundary.

---

## 3. Project state layout

Scratchpads (user data) live under `$XDG_DATA_HOME`; presets and config live under `$XDG_CONFIG_HOME`.
```
$XDG_DATA_HOME/patterm/
└── projects/
    └── /
        ├── meta.json           # project path, last-opened, version
        └── scratchpads/
            ├── notes.md
            ├── todos.md
            └── .md

$XDG_CONFIG_HOME/patterm/
├── config.json                 # global settings (theme, default keymap, etc.)
└── presets/
    ├── agents/
    │   ├── claude.json         # ships as default
    │   ├── codex.json          # ships as default
    │   ├── opencode.json       # ships as default
    │   └── .json
    └── processes/
        ├── dev.json            # e.g. { "name": "bun run dev", "argv": ["bun", "run", "dev"] }
        ├── test.json
        └── .json
```

Both preset directories are scanned at startup; every file found becomes a palette entry ("Spawn agent: claude", "Run process: bun run dev", …). Presets are project-agnostic in v1 — the same set is available in every project. Per-project overrides can be added later.

Project key = `sha256(realpath(project_dir))[:16]`. Used only as a scratchpad directory name — there is no daemon to look up.

Internal MCP socket (for spawned children to talk to the running process): `$XDG_RUNTIME_DIR/patterm/.sock`, falling back to `/tmp/patterm-.sock` if `XDG_RUNTIME_DIR` is unset. Created on startup, removed on exit. Per-PID, not per-project — it is a private IPC channel, not a discovery point.

Scratchpads persist across runs. Sessions and child processes do not.

---

## 4. UI / Client

```
┌────────────────────────────────────────────────────────┬──────────────────┐
│ [codex-1] [codex-2] [claude-1] +                       │ Session tree     │
├────────────────────────────────────────────────────────┤ ──────           │
│                                                        │ ▶ codex-1        │
│                                                        │ │                │
│                                                        │ ├─ ◉ claude-2    │
│                                                        │ ├─ ◉ claude-3    │
│                  (focused pane's PTY)                  │ ├─ ◉ claude-4    │
│                                                        │ └─ ◉ bun-dev     │
│                                                        │                  │
│                                                        │ Scratchpads      │
│                                                        │ ──────           │
│                                                        │ todos.md         │
│                                                        │ notes.md         │
│                                                        │ api-plan.md      │
│                                                        │                  │
│                                                        │ ┌────────────┐   │
│                                                        │ │ todos.md   │   │
│                                                        │ │ preview…   │   │
│                                                        │ └────────────┘   │
├────────────────────────────────────────────────────────┴──────────────────┤
│ [orchestrator driving]                             Ctrl-K command palette │
└───────────────────────────────────────────────────────────────────────────┘
```

- **Top tab bar:** one per top-level session. `+` opens the palette pre-filtered to "Spawn…" entries.
- **Main area:** the focused pane's PTY, rendered identically to viewing it in a regular terminal. The focused pane is either the orchestrator (root of the active session's tree) or one of its children, whichever the user last selected from the sidebar.
- **Right rail, top half — session tree:** the active session's process hierarchy, drawn as an indented tree with box-drawing connectors (`├─`, `└─`). The orchestrator is the root (`▶`); each child appears one level deeper with a status glyph (`◉` running, `✓` exited cleanly, `✗` errored). Selecting an entry (palette, arrow keys, or click) makes it the focused pane. v1 only has two levels because of the §8 two-level-tree rule, but the renderer should be tree-shaped from day one so a future depth bump doesn't require UI surgery.
- **Right rail, bottom half:** scratchpad list and a preview of the selected scratchpad.
- **Status line:** input-ownership toast ("orchestrator driving" / "you have control") on the left, palette hint on the right.

**Empty state:** Until the user spawns their first preset, the top tab bar, main area, and sidebar all sit empty with a centred hint ("Press Ctrl-K to spawn an agent or process"). No "default session" is created.

**Switching:** Clicking a top tab (or selecting one via the palette) switches the active session — the sidebar tree swaps to that session's hierarchy. Clicking a sidebar entry switches the focused pane within the current session.

**Command palette (v1 input model):** Almost all application functions are driven through a single command palette opened with `Ctrl-K`. The palette is a fuzzy-searchable list of commands, scoped to whatever makes sense for the current focus. Two kinds of entries appear:

- **Built-in commands** — "Switch to session…", "Focus pane…", "Take input control", "Release control to orchestrator", "Open scratchpad…", "Kill child…", "Quit", etc.
- **Preset commands** — one entry per file under `$XDG_CONFIG_HOME/patterm/presets/`. Agent presets surface as "Spawn agent: codex" / "Spawn agent: claude" / …; process presets surface as "Run process: bun run dev" / "Run process: vitest" / …. The label comes from the preset's `name` field; the action is "launch this preset into a new pane."

Selecting a preset either launches it immediately (no required args) or opens a sub-palette for optional args — namely an **initial prompt** (agent presets only), which patterm injects into the spawned PTY's input after the agent is ready (§8). The orchestrator equivalent of this — `spawn_agent` / `spawn_process` MCP tools — uses the exact same machinery: pick a preset by name, optionally supply an initial prompt, patterm handles the rest.

Rationale: the keybinding surface for sessions + children + scratchpads + control transfer + spawning gets large fast. A palette lets us ship the full feature set without committing to a key map yet, and gives the user a discoverable index of every action.
Dedicated keybindings can be layered on top later for the few actions a user does often enough to memorize — they should be configured by binding to palette command IDs, not by re-implementing the action.

Only two keybindings are reserved at the application level in v1:

| Action | Binding |
|---|---|
| Open command palette | `Ctrl-K` |
| Pass-through prefix (everything else after this goes to the focused PTY untouched, e.g. for nested tmux/Ctrl-K-using TUIs) | `Ctrl-K Ctrl-K` |

Everything else — session switching, child cycling, control transfer, quitting — lives in the palette for v1.

---

## 5. PTY layer

One PTY per session orchestrator and one per child. For each PTY the tool maintains:

- The underlying process (pid, status, exit code on death).
- A raw byte ring buffer (default 1 MiB) for stream-mode reads.
- A vt-emulated character grid representing current visible state.
- Alt-screen flag (whether the process is in alternate-buffer mode, i.e. a TUI).
- Last-write timestamp (used for the idle heuristic).

**Terminal emulator:** Go has limited options. Start with `vt10x` or a maintained fork. Budget real time — this is the load-bearing component for grid mode `read_output`. The emulator must handle: SGR colours (then strip them on read), cursor movement, alt-screen entry/exit, scroll regions, basic mouse passthrough where needed.

**Resize:** On startup and on SIGWINCH, the tool reads its own terminal dimensions, computes per-pane winsize (accounting for tab bar, sidebar, status line), and `ioctl(TIOCSWINSZ)` each PTY. Children get SIGWINCH automatically. One process, one viewport — no multi-client resize negotiation.

---

## 6. Input ownership

Each pane has an owner flag: `user` or `orchestrator`. A toast / status-line glyph reflects current owner.

- When the orchestrator spawns a child, that child defaults to orchestrator-owned.
- When the user focuses a pane and presses any key, ownership flips to `user`. The orchestrator can still write — bytes interleave.
  A warning toast appears: "Orchestrator is also driving this pane."
- The user explicitly returns ownership with the "Release control to orchestrator" palette command.

No locking. The user's call if they collide. The visual indicator is the only protection.

---

## 7. MCP tool surface

The tool embeds an MCP server in-process. Each spawned agent gets an MCP config injected at spawn time (see §10) pointing at a stdio proxy subcommand of the same binary, which forwards JSON-RPC over the per-PID unix socket to the running process. Tool calls carry an implicit caller identity (which session / which child) derived from the connection.

### Tools available to orchestrators only

#### `spawn_agent`

- **Args:** `preset` (string — name of an agent preset under `$XDG_CONFIG_HOME/patterm/presets/agents/`), `initial_prompt` (string), `name?` (display name, defaults to `-`)
- **Behaviour:** Launches the agent preset in a new PTY as a child of the calling session. Wires MCP per the preset's injection strategy (§10). Waits for the preset's ready signal (default: 1s idle). Then types `initial_prompt` into the TUI input box and submits. patterm does not inject any other text — the caller's `initial_prompt` is the agent's first turn. If the caller wants the agent to know about the message-tag conventions (§8), tool availability, or its orchestrator role, the caller must say so in `initial_prompt`.
- **Returns:** `child_id`.
- **Error:** Returns an error if `preset` isn't a known agent preset. patterm has no built-in knowledge of vendor CLIs — everything is preset-driven.

#### `send_message_to`

- **Args:** `target` (child_id), `message` (string)
- **Behaviour:** Types `[orchestrator] \n` into the target child's PTY.
- **Returns:** `ok`.

#### `request_human_attention`

- **Args:** `child_id`, `reason` (string)
- **Behaviour:** Surfaces a notification in the TUI, blinks the sidebar entry for the child, optionally auto-focuses if the user setting allows it. Used by the orchestrator when it wants to punt a decision (e.g.
  ambiguous permission prompt) to the human.
- **Returns:** `ok`.

### Tools available to all agents

#### `spawn_process`

- **Args:** One of:
  - `preset` (string — name of a process preset under `$XDG_CONFIG_HOME/patterm/presets/processes/`), plus optional `working_dir?` / `env?` overrides; **or**
  - `argv` (array of strings — freeform launch), with optional `working_dir?`, `env?`, and `shell?` (default `false`; when `true`, `argv` is interpreted as `["sh", "-lc", argv[0]]`-style).
- **Behaviour:** Launches the command in a new PTY, attached as a child of the calling agent's session. Presets are the preferred path; freeform `argv` is the escape hatch for one-offs the user hasn't pre-configured. No MCP injection (process children aren't agents).
- **Returns:** `child_id`.

#### `read_output`

- **Args:** `child_id`, `mode` (`grid` | `stream`), `since_offset?` (stream mode only)
- **Behaviour:**
  - `grid` mode: returns the current rendered visible grid as plain text, ANSI stripped, with best-effort trimming of detectable vendor chrome (top banner, bottom input box, status line) per agent-type heuristics. Use for TUI children.
  - `stream` mode: returns raw byte content from `since_offset` to current write head, ANSI stripped. Use for line-mode processes.
- **Returns:** `{ content: string, new_offset: int, mode: "grid" | "stream" }`.
- **Note in tool description (visible to the calling agent):** "The grid result is the entire visible pane. You are responsible for locating the response to your last prompt within it."

#### `send_input`

- **Args:** `child_id`, `input` (string), `append_newline?` (default `true`)
- **Behaviour:** Writes bytes to the child PTY's stdin. Used both for free-form input and for single-key confirmations (`y`, `n`).
- **Returns:** `ok`.

#### `kill`

- **Args:** `child_id`, `signal?` (default `SIGTERM`)
- **Returns:** `ok`.

#### `wait_for_pattern`

- **Args:** `child_id`, `pattern` (regex), `timeout_seconds`
- **Behaviour:** Blocks the calling agent until the rendered grid matches the regex or the timeout expires. Polls the grid at ~50ms intervals.
- **Returns:** `{ matched: bool, snippet?: string }`.

#### `timer_wait`

- **Args:** `seconds`, `label?` (default auto-generated)
- **Behaviour:** Returns immediately with a `timer_id`. After `seconds`, the tool injects `[system] Your timer [