commit 69ef09aac4b5ac7d67b023d5765d3e60e5b4b619
Author: Harry Bayliss
Date:   Thu May 14 13:37:20 2026 +0100

    Initial patterm project

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..ce72d8e
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,8 @@
+/third_party/libghostty-vt/source/
+/third_party/libghostty-vt/install/
+*.bytes
+*.crash
+spike-report-*.txt
+/.zig-cache/
+/bin/
+/spike
diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..799570a
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,42 @@
+SHELL := /bin/bash
+
+ROOT := $(abspath .)
+VENDOR := $(ROOT)/third_party/libghostty-vt
+SOURCE := $(VENDOR)/source
+INSTALL := $(VENDOR)/install
+COMMIT := $(shell cat $(VENDOR)/COMMIT)
+
+.PHONY: deps deps-fetch deps-build clean-deps spike patterm test
+
+# `make deps` fetches and builds libghostty-vt at the pinned commit.
+# Re-runs are idempotent on success; touch $(VENDOR)/COMMIT to force a rebuild.
+deps: $(INSTALL)/lib/libghostty-vt.a
+
+$(SOURCE)/.git/HEAD: $(VENDOR)/COMMIT
+	@echo ">> cloning ghostty-org/ghostty @ $(COMMIT)"
+	@rm -rf $(SOURCE)
+	@git clone --filter=blob:none https://github.com/ghostty-org/ghostty.git $(SOURCE)
+	@cd $(SOURCE) && git checkout --detach $(COMMIT)
+
+deps-fetch: $(SOURCE)/.git/HEAD
+
+$(INSTALL)/lib/libghostty-vt.a: $(SOURCE)/.git/HEAD
+	@command -v zig >/dev/null || { echo "ERROR: zig not on PATH (need >=0.15.2 to build libghostty-vt)"; exit 1; }
+	@echo ">> building libghostty-vt with zig"
+	@cd $(SOURCE) && zig build -Demit-lib-vt --prefix $(INSTALL)
+	@test -f $(INSTALL)/lib/libghostty-vt.a || { echo "ERROR: expected static lib at $(INSTALL)/lib/libghostty-vt.a"; exit 1; }
+	@echo ">> libghostty-vt installed under $(INSTALL)"
+
+deps-build: $(INSTALL)/lib/libghostty-vt.a
+
+clean-deps:
+	rm -rf $(SOURCE) $(INSTALL)
+
+spike: deps
+	go build -o ./bin/spike ./cmd/spike
+
+patterm: deps
+	go build -o ./bin/patterm ./cmd/patterm
+
+test: deps
+	go test ./...
diff --git a/SPEC.md b/SPEC.md
new file mode 100644
index 0000000..fbf7ca3
--- /dev/null
+++ b/SPEC.md
@@ -0,0 +1,542 @@
+This is a spec for a terminal project I have.
+
+I think we could probably use libghostty for the terminal emulation as it seems the Go ecosystem is quite sparse.
+
+# patterm — v1 Spec
+
+*Working title: **patterm**. Used throughout this document.*
+
+## 1. Overview
+
+A terminal-based agent orchestration shell. The user opens patterm in a project directory (e.g. `~/Dev/foo`). patterm presents a multi-tab TUI where each top tab is a **session** — a long-running PTY launched from a user-defined **preset**. Presets come in two flavours: **agent presets** (e.g. claude, codex, opencode — vendor LLM CLIs with patterm's MCP wired up) and **process presets** (e.g. `bun run dev`, `vitest --watch` — raw commands with no MCP). Each session has a sidebar of **children**: more presets spawned by that session, again either agents or processes, all PTY-backed. The right rail also surfaces project-scoped **scratchpads** (markdown files) for human readability.
+
+An MCP server, in-process, exposes tools that let orchestrator agents spawn and drive children, run processes, set timers, message peers, and read/write scratchpads. The orchestrator is a real LLM CLI driving another LLM CLI as if it were a human user — keystroke injection in, rendered-grid scraping out. The orchestrator fully owns the content of what it sends; patterm only handles the plumbing.
+
+**Goal:** Let one SOTA agent orchestrate other agents of different types (claude → codex, codex → opencode, …) without subagent APIs, while keeping the whole thing steerable and observable by a human at any moment.
+
+**Non-goal:** Hosting any LLM. patterm only manages CLIs the user already has installed. patterm also doesn't ship hard-coded knowledge of any specific vendor CLI — agent presets are user-editable JSON; the three common ones (claude, codex, opencode) ship as defaults.
+
+---
+
+## 2. Architecture and lifecycle
+
+**Single foreground process. No daemon, no detach.**
+
+The tool is one Go process that owns: the TUI, all PTYs, vt-emulated grids, session state, child state, scratchpad files, and an in-process MCP server. Killing the process kills everything inside it. There is no attach/detach, no project-keyed singleton, no socket-based reattachment.
+
+**Lifecycle:**
+
+1. User runs `patterm` in a project directory.
+2. The process starts the TUI as a **blank canvas** — no sessions, no children, no scratchpad preview. Just the empty frame with the palette hint in the status line. The in-process MCP server initializes (bound to a per-PID unix socket for spawned children — see §10) and scratchpad metadata is loaded from disk, but nothing is rendered until the user opens a preset.
+3. The user opens the palette (`Ctrl-K`), selects a preset, and the first session/process is launched. Subsequent sessions and children are spawned the same way (or by orchestrators via MCP).
+4. On exit (Ctrl-D, `:quit`, terminal window close, SIGTERM, SIGHUP): the process sends SIGTERM to every child PTY with a short grace window, then SIGKILL, then exits. Scratchpads on disk are the only thing that survives.
+
+**Multiple invocations:** Running `patterm` twice in the same project starts two independent processes. They share scratchpad files on disk but nothing else. If this turns out to be a footgun in practice, a per-project lockfile can be added later — out of scope for v1.
+
+**Implications:** Closing the terminal window (or SSH dropping) ends the session and tears down every child. This is the deliberate trade — no orphan daemons, no socket discovery, no stale-state recovery, no multi-client coordination. The user's terminal window *is* the lifetime boundary.
+
+---
+
+## 3. Project state layout
+
+Scratchpads (user data) live under `$XDG_DATA_HOME`; presets and config live under `$XDG_CONFIG_HOME`.
+
+```
+$XDG_DATA_HOME/patterm/
+└── projects/
+    └── /
+        ├── meta.json          # project path, last-opened, version
+        └── scratchpads/
+            ├── notes.md
+            ├── todos.md
+            └── .md
+
+$XDG_CONFIG_HOME/patterm/
+├── config.json                # global settings (theme, default keymap, etc.)
+└── presets/
+    ├── agents/
+    │   ├── claude.json        # ships as default
+    │   ├── codex.json         # ships as default
+    │   ├── opencode.json      # ships as default
+    │   └── .json
+    └── processes/
+        ├── dev.json           # e.g. { "name": "bun run dev", "argv": ["bun", "run", "dev"] }
+        ├── test.json
+        └── .json
+```
+
+Both preset directories are scanned at startup; every file found becomes a palette entry ("Spawn agent: claude", "Run process: bun run dev", …). Presets are project-agnostic in v1 — the same set is available in every project. Per-project overrides can be added later.
+
+Project key = `sha256(realpath(project_dir))[:16]`. Used only as a scratchpad directory name — there is no daemon to look up.
+
+Internal MCP socket (for spawned children to talk to the running process): `$XDG_RUNTIME_DIR/patterm/.sock`, falling back to `/tmp/patterm-.sock` if `XDG_RUNTIME_DIR` is unset. Created on startup, removed on exit. Per-PID, not per-project — it is a private IPC channel, not a discovery point.
+
+Scratchpads persist across runs. Sessions and child processes do not.
+
+---
+
+## 4. UI / Client
+
+```
+┌────────────────────────────────────────────────────────┬──────────────────┐
+│ [codex-1] [codex-2] [claude-1] +                       │ Session tree     │
+├────────────────────────────────────────────────────────┤ ──────           │
+│                                                        │ ▶ codex-1        │
+│                                                        │ │                │
+│                                                        │ ├─ ◉ claude-2    │
+│                                                        │ ├─ ◉ claude-3    │
+│                  (focused pane's PTY)                  │ ├─ ◉ claude-4    │
+│                                                        │ └─ ◉ bun-dev     │
+│                                                        │                  │
+│                                                        │ Scratchpads      │
+│                                                        │ ──────           │
+│                                                        │ todos.md         │
+│                                                        │ notes.md         │
+│                                                        │ api-plan.md      │
+│                                                        │                  │
+│                                                        │ ┌────────────┐   │
+│                                                        │ │ todos.md   │   │
+│                                                        │ │ preview…   │   │
+│                                                        │ └────────────┘   │
+├────────────────────────────────────────────────────────┴──────────────────┤
+│ [orchestrator driving]                             Ctrl-K command palette │
+└───────────────────────────────────────────────────────────────────────────┘
+```
+
+- **Top tab bar:** one per top-level session. `+` opens the palette pre-filtered to "Spawn…" entries.
+- **Main area:** the focused pane's PTY, rendered identically to viewing it in a regular terminal. The focused pane is either the orchestrator (root of the active session's tree) or one of its children, whichever the user last selected from the sidebar.
+- **Right rail, top half — session tree:** the active session's process hierarchy, drawn as an indented tree with box-drawing connectors (`├─`, `└─`). The orchestrator is the root (`▶`); each child appears one level deeper with a status glyph (`◉` running, `✓` exited cleanly, `✗` errored). Selecting an entry (palette, arrow keys, or click) makes it the focused pane. v1 only has two levels because of the §8 two-level-tree rule, but the renderer should be tree-shaped from day one so a future depth bump doesn't require UI surgery.
+- **Right rail, bottom half:** scratchpad list and a preview of the selected scratchpad.
+- **Status line:** input-ownership toast ("orchestrator driving" / "you have control") on the left, palette hint on the right.
+
+**Empty state:** Until the user spawns their first preset, the top tab bar, main area, and sidebar all sit empty with a centred hint ("Press Ctrl-K to spawn an agent or process"). No "default session" is created.
+
+**Switching:** Clicking a top tab (or selecting one via the palette) switches the active session — the sidebar tree swaps to that session's hierarchy. Clicking a sidebar entry switches the focused pane within the current session.
+
+**Command palette (v1 input model):**
+
+Almost all application functions are driven through a single command palette opened with `Ctrl-K`. The palette is a fuzzy-searchable list of commands, scoped to whatever makes sense for the current focus. Two kinds of entries appear:
+
+- **Built-in commands** — "Switch to session…", "Focus pane…", "Take input control", "Release control to orchestrator", "Open scratchpad…", "Kill child…", "Quit", etc.
+- **Preset commands** — one entry per file under `$XDG_CONFIG_HOME/patterm/presets/`. Agent presets surface as "Spawn agent: codex" / "Spawn agent: claude" / …; process presets surface as "Run process: bun run dev" / "Run process: vitest" / …. The label comes from the preset's `name` field; the action is "launch this preset into a new pane."
+
+Selecting a preset either launches it immediately (no required args) or opens a sub-palette for optional args — namely an **initial prompt** (agent presets only), which patterm injects into the spawned PTY's input after the agent is ready (§8). The orchestrator equivalent of this — `spawn_agent` / `spawn_process` MCP tools — uses the exact same machinery: pick a preset by name, optionally supply an initial prompt, patterm handles the rest.
+
+Rationale: the keybinding surface for sessions + children + scratchpads + control transfer + spawning gets large fast. A palette lets us ship the full feature set without committing to a key map yet, and gives the user a discoverable index of every action. Dedicated keybindings can be layered on top later for the few actions a user does often enough to memorize — they should be configured by binding to palette command IDs, not by re-implementing the action.
+
+Only two keybindings are reserved at the application level in v1:
+
+| Action | Binding |
+|---|---|
+| Open command palette | `Ctrl-K` |
+| Pass-through prefix (everything else after this goes to the focused PTY untouched, e.g. for nested tmux/Ctrl-K-using TUIs) | `Ctrl-K Ctrl-K` |
+
+Everything else — session switching, child cycling, control transfer, quitting — lives in the palette for v1.
+
+---
+
+## 5. PTY layer
+
+One PTY per session orchestrator and one per child. For each PTY the tool maintains:
+
+- The underlying process (pid, status, exit code on death).
+- A raw byte ring buffer (default 1 MiB) for stream-mode reads.
+- A vt-emulated character grid representing current visible state.
+- Alt-screen flag (whether the process is in alternate-buffer mode, i.e. a TUI).
+- Last-write timestamp (used for the idle heuristic).
+
+**Terminal emulator:** Go has limited options. Start with `vt10x` or a maintained fork. Budget real time — this is the load-bearing component for grid mode `read_output`. The emulator must handle: SGR colours (then strip them on read), cursor movement, alt-screen entry/exit, scroll regions, basic mouse passthrough where needed.
+
+**Resize:** On startup and on SIGWINCH, the tool reads its own terminal dimensions, computes per-pane winsize (accounting for tab bar, sidebar, status line), and `ioctl(TIOCSWINSZ)` each PTY. Children get SIGWINCH automatically. One process, one viewport — no multi-client resize negotiation.
+
+---
+
+## 6. Input ownership
+
+Each pane has an owner flag: `user` or `orchestrator`. A toast / status-line glyph reflects current owner.
+
+- When the orchestrator spawns a child, that child defaults to orchestrator-owned.
+- When the user focuses a pane and presses any key, ownership flips to `user`. The orchestrator can still write — bytes interleave. A warning toast appears: "Orchestrator is also driving this pane."
+- The user explicitly returns ownership with the "Release control to orchestrator" palette command (§4).
+
+No locking. The user's call if they collide. The visual indicator is the only protection.
+
+---
+
+## 7. MCP tool surface
+
+The tool embeds an MCP server in-process. Each spawned agent gets an MCP config injected at spawn time (see §10) pointing at a stdio proxy subcommand of the same binary, which forwards JSON-RPC over the per-PID unix socket to the running process. Tool calls carry an implicit caller identity (which session / which child) derived from the connection.
+
+### Tools available to orchestrators only
+
+#### `spawn_agent`
+- **Args:** `preset` (string — name of an agent preset under `$XDG_CONFIG_HOME/patterm/presets/agents/`), `initial_prompt` (string), `name?` (display name, defaults to `-`)
+- **Behaviour:** Launches the agent preset in a new PTY as a child of the calling session. Wires MCP per the preset's injection strategy (§10). Waits for the preset's ready signal (default: 1s idle). Then types `initial_prompt` into the TUI input box and submits. patterm does not inject any other text — the caller's `initial_prompt` is the agent's first turn. If the caller wants the agent to know about the message-tag conventions (§8), tool availability, or its orchestrator role, the caller must say so in `initial_prompt`.
+- **Returns:** `child_id`.
+- **Error:** Returns an error if `preset` isn't a known agent preset. patterm has no built-in knowledge of vendor CLIs — everything is preset-driven.
+
+#### `send_message_to`
+- **Args:** `target` (child_id), `message` (string)
+- **Behaviour:** Types `[orchestrator] \n` into the target child's PTY.
+- **Returns:** `ok`.
+
+#### `request_human_attention`
+- **Args:** `child_id`, `reason` (string)
+- **Behaviour:** Surfaces a notification in the TUI, blinks the sidebar entry for the child, optionally auto-focuses if the user setting allows it. Used by orchestrator when it wants to punt a decision (e.g. ambiguous permission prompt) to the human.
+- **Returns:** `ok`.
+
+### Tools available to all agents
+
+#### `spawn_process`
+- **Args:** One of:
+  - `preset` (string — name of a process preset under `$XDG_CONFIG_HOME/patterm/presets/processes/`), plus optional `working_dir?` / `env?` overrides; **or**
+  - `argv` (array of strings — freeform launch), with optional `working_dir?`, `env?`, and `shell?` (default `false`; when `true`, `argv` is interpreted as `["sh", "-lc", argv[0]]`-style).
+- **Behaviour:** Launches the command in a new PTY, attached as a child of the calling agent's session. Presets are the preferred path; freeform `argv` is the escape hatch for one-offs the user hasn't pre-configured. No MCP injection (process children aren't agents).
+- **Returns:** `child_id`.
+
+#### `read_output`
+- **Args:** `child_id`, `mode` (`grid` | `stream`), `since_offset?` (stream mode only)
+- **Behaviour:**
+  - `grid` mode: returns the current rendered visible grid as plain text, ANSI stripped, with best-effort trimming of detectable vendor chrome (top banner, bottom input box, status line) per agent-type heuristics. Use for TUI children.
+  - `stream` mode: returns raw byte content from `since_offset` to current write head, ANSI stripped. Use for line-mode processes.
+- **Returns:** `{ content: string, new_offset: int, mode: "grid" | "stream" }`.
+- **Note in tool description (visible to the calling agent):** "The grid result is the entire visible pane. You are responsible for locating the response to your last prompt within it."
+
+#### `send_input`
+- **Args:** `child_id`, `input` (string), `append_newline?` (default `true`)
+- **Behaviour:** Writes bytes to the child PTY's stdin. Used both for free-form input and for single-key confirmations (`y`, `n`).
+- **Returns:** `ok`.
+
+#### `kill`
+- **Args:** `child_id`, `signal?` (default `SIGTERM`)
+- **Returns:** `ok`.
+
+#### `wait_for_pattern`
+- **Args:** `child_id`, `pattern` (regex), `timeout_seconds`
+- **Behaviour:** Blocks the calling agent until the rendered grid matches the regex or the timeout expires. Polls the grid at ~50ms intervals.
+- **Returns:** `{ matched: bool, snippet?: string }`.
+
+#### `timer_wait`
+- **Args:** `seconds`, `label?` (default auto-generated)
+- **Behaviour:** Returns immediately with a `timer_id`. After `seconds`, the tool injects `[system] Your timer [