Rework §7 MCP tool surface
Replaces the original child_id-keyed tool set with a soloterm-inspired process-entry model: opaque process_ids, three kinds (agent/terminal/command), session-persistent command entries with disk-persisted trust grants, and a single bidirectional send_message in place of the send_message_to / report_to_parent split. Adds whoami, help, get_project_status, rename_process, select_process, close_process, search_output, get_process_ports, and a richer send_input with key/paste support and optional wait_ms tail. Updates §3 (trust.json), §8 (self-discovery via whoami/help), §9 (renamed tools in the permissions loop), §11 (idle exposed via list_processes), §14 (resolves the scratchpad-revision and trust-persistence questions, opens content-hashed trust), and §15 (build-order tool names).
217
SPEC.md
@@ -46,6 +46,7 @@ $XDG_DATA_HOME/patterm/
 └── projects/
     └── <project-key>/
         ├── meta.json # project path, last-opened, version
+        ├── trust.json # persisted command-preset trust grants (§7)
         └── scratchpads/
             ├── notes.md
             ├── todos.md
@@ -71,7 +72,7 @@ Project key = `sha256(realpath(project_dir))[:16]`. Used only as a scratchpad di

 Internal MCP socket (for spawned children to talk to the running process): `$XDG_RUNTIME_DIR/patterm/<pid>.sock`, falling back to `/tmp/patterm-<pid>.sock` if `XDG_RUNTIME_DIR` is unset. Created on startup, removed on exit. Per-PID, not per-project — it is a private IPC channel, not a discovery point.

-Scratchpads persist across runs. Sessions and child processes do not.
+Scratchpads and command-preset trust grants persist across runs. Sessions and child processes do not — every patterm run starts with an empty process tree.

 ---

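Review note: the two path rules visible in this hunk (the project key in the hunk header context and the socket fallback) can be sketched as below. This is a minimal illustration, not patterm's code. It assumes the key is the first 16 hex characters of the SHA-256 digest of the UTF-8 encoded real path; the spec only writes `sha256(realpath(project_dir))[:16]`, so the encoding and the hex form are assumptions, and the function names `project_key` and `socket_path` are hypothetical.

```python
import hashlib
import os


def project_key(project_dir: str) -> str:
    """First 16 hex chars of sha256(realpath(project_dir)).

    Assumption: hex digest of the UTF-8 encoded path; the spec does not say.
    """
    real = os.path.realpath(project_dir)
    return hashlib.sha256(real.encode("utf-8")).hexdigest()[:16]


def socket_path(pid: int, env: dict) -> str:
    """Per-PID socket path, falling back to /tmp when XDG_RUNTIME_DIR is unset."""
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return os.path.join(runtime_dir, "patterm", f"{pid}.sock")
    return f"/tmp/patterm-{pid}.sock"
```

Because the key is derived from the real path, two symlinked views of the same directory resolve to the same scratchpad and trust directory.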
@@ -165,80 +166,163 @@ No locking. The user's call if they collide. The visual indicator is the only pr

 ## 7. MCP tool surface

-The tool embeds an MCP server in-process. Each spawned agent gets an MCP config injected at spawn time (see §10) pointing at a stdio proxy subcommand of the same binary, which forwards JSON-RPC over the per-PID unix socket to the running process. Tool calls carry an implicit caller identity (which session / which child) derived from the connection.
+The tool embeds an MCP server in-process. Each spawned agent gets an MCP config injected at spawn time (see §10) pointing at a stdio proxy subcommand of the same binary, which forwards JSON-RPC over the per-PID unix socket to the running process. Tool calls carry an implicit caller identity (which session / which process) derived from the connection.

-### Tools available to orchestrators only
+### Concepts shared by all tools

-#### `spawn_agent`
-- **Args:** `preset` (string — name of an agent preset under `$XDG_CONFIG_HOME/patterm/presets/agents/`), `initial_prompt` (string), `name?` (display name, defaults to `<preset>-<n>`)
-- **Behaviour:** Launches the agent preset in a new PTY as a child of the calling session. Wires MCP per the preset's injection strategy (§10). Waits for the preset's ready signal (default: 1s idle). Then types `initial_prompt` into the TUI input box and submits. patterm does not inject any other text — the caller's `initial_prompt` is the agent's first turn. If the caller wants the agent to know about the message-tag conventions (§8), tool availability, or its orchestrator role, the caller must say so in `initial_prompt`.
-- **Returns:** `child_id`.
-- **Error:** Returns an error if `preset` isn't a known agent preset. patterm has no built-in knowledge of vendor CLIs — everything is preset-driven.
+- **Process IDs.** Every spawnable thing — agents, terminals, commands — is addressed by an opaque short token (e.g. `p_a1b2c3`), not by OS PID. IDs are stable for the lifetime of the entry: they survive stop/restart for stored command entries; they are released when an agent or terminal exits and is `close_process`'d. Each entry also has a human-readable display name (default `<kind>-<n>`, settable via `rename_process` or the `name` arg on spawn).
+- **Process kinds.**
+  - `agent` — a vendor LLM CLI launched from an agent preset (§10). MCP-wired. Ephemeral: lost when the underlying PTY exits.
+  - `terminal` — a bare interactive shell. Defaults to `$SHELL -i`. Ephemeral.
+  - `command` — a process preset (e.g. `bun run dev`, `vitest --watch`) or freeform argv. **Session-persistent**: a command entry survives PTY exit so it can be `restart_process`'d, and is removed only when `close_process` is called or patterm exits.
+- **Trust gating.** Command entries that were authored as presets are *not* trusted by default. The first time an agent attempts to `spawn_process(kind: 'command', preset: …)`, `start_process`, or `restart_process` against an untrusted command preset, the call returns a `needs_trust` error and patterm surfaces a UI confirmation in the focused tab. Once the user confirms, the trust grant is **persisted to disk** (`$XDG_DATA_HOME/patterm/projects/<key>/trust.json`, see §3), so the user only confirms each preset once per project — not once per patterm run. Trust is keyed by `(project, preset name)` in v1; content-hashed trust (re-confirming on edit) is a v2 question (§14). Freeform-argv command entries are trusted implicitly at spawn time because the agent already had to compose the argv, and they are not written to the trust file.
+- **Caller role.** Every connection has a role: `orchestrator` (root of a session tree), `sub-agent` (an agent spawned by an orchestrator), or `process` (commands/terminals — these don't talk MCP, but they appear as targets). Role gates which tools the caller may invoke. Calls disallowed by role return a structured error explaining why, so the agent can adapt rather than silently fail.
+- **Idle / readiness.** Every PTY-backed entry tracks `idle_ms` (ms since last write to its master). Tools that read state surface this so callers can decide when a target is "done" without polling raw bytes themselves (§11).

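Review note: the trust-gating rule above might be implemented roughly as follows. This is a sketch under stated assumptions, not the spec's file format: the diff never shows the contents of `trust.json`, so the `trusted_presets` key, the `TrustStore` class and the `check_spawn` helper are all hypothetical. It illustrates only the v1 behaviour described above: trust is keyed by preset name within one project's file, and freeform argv bypasses the check.

```python
import json
import os
import tempfile


class TrustStore:
    """Persisted set of trusted command-preset names for one project.

    Assumption: trust.json holds {"trusted_presets": [...]}; the real
    layout is not given in the spec.
    """

    def __init__(self, path: str):
        self.path = path
        self.trusted = set()
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.trusted = set(json.load(fh).get("trusted_presets", []))

    def grant(self, preset: str) -> None:
        """Record a user confirmation and write it to disk atomically."""
        self.trusted.add(preset)
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"trusted_presets": sorted(self.trusted)}, fh)
        os.replace(tmp, self.path)  # atomic rename over the old file


def check_spawn(store: TrustStore, preset=None, argv=None) -> dict:
    """Untrusted presets yield needs_trust; freeform argv is trusted implicitly."""
    if preset is not None and preset not in store.trusted:
        return {"error": "needs_trust", "preset": preset}
    return {"ok": True}
```

Reloading the store from the same path after `grant` models a second patterm run: the preset stays trusted without another confirmation.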
-#### `send_message_to`
-- **Args:** `target` (child_id), `message` (string)
-- **Behaviour:** Types `[orchestrator] <message>\n` into the target child's PTY.
-- **Returns:** `ok`.
+### Lifecycle and spawning

-#### `request_human_attention`
-- **Args:** `child_id`, `reason` (string)
-- **Behaviour:** Surfaces a notification in the TUI, blinks the sidebar entry for the child, optionally auto-focuses if the user setting allows it. Used by orchestrator when it wants to punt a decision (e.g. ambiguous permission prompt) to the human.
-- **Returns:** `ok`.
-
-### Tools available to all agents
+#### `spawn_agent` — orchestrator-only
+- **Args:** `agent` (preset name under `presets/agents/`), `agent_instructions` (string — the first turn typed into the agent's TUI after ready), `name?` (display name).
+- **Behaviour:** Launches the agent preset in a new PTY as a child of the calling session. Wires MCP per the preset's injection strategy (§10). Waits for the preset's `ready_signal` (default: 1s idle), then types `agent_instructions` into the input box and submits. patterm injects nothing else — the spawned agent learns its role and conventions either from `agent_instructions` or by calling `whoami` / `help` itself.
+- **Returns:** `{ process_id, name }`.
+- **Errors:** `unknown_agent` if the preset is missing; `role_forbidden` if a sub-agent calls it (with a message pointing the caller at its parent or at vendor-native subagent tooling).

 #### `spawn_process`
-- **Args:** One of:
-  - `preset` (string — name of a process preset under `$XDG_CONFIG_HOME/patterm/presets/processes/`), plus optional `working_dir?` / `env?` overrides; **or**
-  - `argv` (array of strings — freeform launch), with optional `working_dir?`, `env?`, and `shell?` (default `false`; when `true`, `argv` is interpreted as `["sh", "-lc", argv[0]]`-style).
-- **Behaviour:** Launches the command in a new PTY, attached as a child of the calling agent's session. Presets are the preferred path; freeform `argv` is the escape hatch for one-offs the user hasn't pre-configured. No MCP injection (process children aren't agents).
-- **Returns:** `child_id`.
+- **Args:** `kind` (`terminal` | `command`), one of `preset` (name under `presets/processes/`) or `argv` (string array), `name?`, `working_dir?` (default project root), `env?`, `shell?` (only valid with `argv`; default `false`).
+  - For `kind: terminal`, `argv` is optional — defaults to `$SHELL -i`.
+  - For `kind: command`, exactly one of `preset` or `argv` must be supplied.
+- **Behaviour:** Creates a process entry, attached as a child of the calling agent's session, and starts it. No MCP injection (these aren't agents). `command` entries are persisted to the session for later `restart_process` / `start_process`; `terminal` entries are ephemeral.
+- **Returns:** `{ process_id, name }`.
+- **Errors:** `needs_trust` if `kind: command` references an untrusted preset.

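Review note: the argument rules for the new `spawn_process` are easy to get subtly wrong, so here is one possible reading of them as a validator. It is a sketch, not the spec's behaviour: the function name and the error strings are made up (the spec names only `needs_trust` for this tool), and it assumes a `terminal` may be given a `preset` since the spec does not forbid it.

```python
def validate_spawn_process(args: dict):
    """Return None when args satisfy the rules above, else an error string.

    Error strings are illustrative; the spec does not define them.
    """
    kind = args.get("kind")
    has_preset = "preset" in args
    has_argv = "argv" in args
    if kind not in ("terminal", "command"):
        return "kind must be 'terminal' or 'command'"
    if has_preset and has_argv:
        return "supply preset or argv, not both"
    # A terminal may omit both (defaults to `$SHELL -i`); a command may not.
    if kind == "command" and not (has_preset or has_argv):
        return "command requires exactly one of preset or argv"
    if "shell" in args and not has_argv:
        return "shell is only valid with argv"
    return None
```

For example, `validate_spawn_process({"kind": "terminal"})` passes, while a `command` with both `preset` and `argv` is rejected.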
-#### `read_output`
-- **Args:** `child_id`, `mode` (`grid` | `stream`), `since_offset?` (stream mode only)
-- **Behaviour:**
-  - `grid` mode: returns the current rendered visible grid as plain text, ANSI stripped, with best-effort trimming of detectable vendor chrome (top banner, bottom input box, status line) per agent-type heuristics. Use for TUI children.
-  - `stream` mode: returns raw byte content from `since_offset` to current write head, ANSI stripped. Use for line-mode processes.
-- **Returns:** `{ content: string, new_offset: int, mode: "grid" | "stream" }`.
-- **Note in tool description (visible to the calling agent):** "The grid result is the entire visible pane. You are responsible for locating the response to your last prompt within it."
+#### `start_process`
+- **Args:** `process_id`.
+- **Behaviour:** Starts a stored `command` entry that is currently in `stopped` or `exited` state. No-op on a running entry (returns the existing state).
+- **Returns:** `{ process_id, status }`.
+- **Errors:** `not_found`, `wrong_kind` (only command entries are start-able post-creation), `needs_trust`.

-#### `send_input`
-- **Args:** `child_id`, `input` (string), `append_newline?` (default `true`)
-- **Behaviour:** Writes bytes to the child PTY's stdin. Used both for free-form input and for single-key confirmations (`y`, `n`).
+#### `restart_process`
+- **Args:** `process_id`, `signal?` (default `SIGTERM` for the stop phase).
+- **Behaviour:** Stops the entry if running (grace window then SIGKILL), then starts it again with the same argv/env/working_dir. Valid for `command` entries; valid for `agent` and `terminal` entries only while their PTY is still live (since they have no stored definition to rehydrate from).
+- **Returns:** `{ process_id, status }`.
+- **Errors:** `not_found`, `needs_trust` (command presets), `wrong_kind` (trying to restart an exited agent/terminal).
+
+#### `stop_process`
+- **Args:** `process_id`, `signal?` (default `SIGTERM`).
+- **Behaviour:** Sends the signal to the entry's PTY, with the standard grace window before SIGKILL.
+- **Returns:** `{ process_id, status }`.
+
+#### `close_process`
+- **Args:** `process_id`.
+- **Behaviour:** Removes the entry from the session entirely. If still running, stops it first. Used to clear stored command entries the orchestrator no longer needs, and to clean up exited agent/terminal ghosts from the sidebar.
+- **Returns:** `ok`.

-#### `kill`
-- **Args:** `child_id`, `signal?` (default `SIGTERM`)
+#### `rename_process`
+- **Args:** `process_id`, `name`.
+- **Returns:** `ok`. Updates the display name in the sidebar and tab bar.
+
+#### `select_process`
+- **Args:** `process_id`.
+- **Behaviour:** Asks the UI to focus the given pane (switches session tab if needed). Non-blocking, advisory — distinct from `request_human_attention`, which raises a notification and expects a human decision.
+- **Returns:** `ok`.
+
+### Inspection
+
+#### `list_processes`
+- **Args:** `kind?` (filter by `agent` | `terminal` | `command`).
+- **Returns:** Array of `{ process_id, name, kind, status, parent_process_id, exit_code?, idle_ms?, trusted? }` for the caller's session. `status ∈ { starting, running, stopped, exited, errored }`.
+
+#### `get_process_status`
+- **Args:** `process_id`.
+- **Returns:** `{ process_id, name, kind, status, parent_process_id, working_dir, argv?, exit_code?, started_at, idle_ms, active_screen: "main" | "alternate", rows, cols, cursor: { x, y }, trusted?, screen_version }`.
+
+#### `get_project_status`
+- **Args:** none.
+- **Returns:** `{ project: { path, key }, caller: { process_id, role, name, parent_process_id?, available_tools: [string] }, processes: [<list_processes entry>], scratchpads: [{ name, size, modified_at, revision }] }`. Everything an agent needs to orient itself in one call.
+
+#### `get_process_output`
+- **Args:** `process_id`, `mode` (`grid` | `stream`), `since_offset?` (stream mode only).
+- **Behaviour:** `grid` returns the current visible pane as plain text, ANSI stripped, with best-effort vendor-chrome trim per preset hints (§10). `stream` returns ANSI-stripped bytes from `since_offset` to the current write head.
+- **Returns:** `{ content, mode, new_offset?, active_screen, rows, cols, cursor, idle_ms, status, screen_version }`.
+- **Tool-description note (shown to the calling agent):** "The grid result is the entire visible pane. You are responsible for locating the response to your last prompt within it. Use `search_output` if you have a specific marker to find."
+
+#### `get_process_raw_output`
+- **Args:** `process_id`, `since_offset?`.
+- **Behaviour:** Returns raw bytes from `since_offset`, escape sequences preserved. Used when the agent needs to inspect control codes (rare).
+- **Returns:** `{ content, new_offset, status }`.
+
+#### `search_output`
+- **Args:** `process_id`, `pattern` (regex), `kind` (`rendered` | `raw`), `limit?` (default 20).
+- **Returns:** `{ matches: [{ line_no, text }], truncated: bool }`. Searches scrollback (not just the visible grid).

 #### `wait_for_pattern`
-- **Args:** `child_id`, `pattern` (regex), `timeout_seconds`
-- **Behaviour:** Blocks the calling agent until the rendered grid matches the regex or the timeout expires. Polls the grid at ~50ms intervals.
-- **Returns:** `{ matched: bool, snippet?: string }`.
+- **Args:** `process_id`, `pattern` (regex), `timeout_seconds`, `scope?` (`grid` | `scrollback`, default `grid`).
+- **Behaviour:** Blocks the calling agent until the chosen surface matches the regex, or the timeout expires. Polls at ~50ms.
+- **Returns:** `{ matched: bool, snippet?: string }`. Used in the §9 permissions-prompt-clear flow.

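Review note: the polling behaviour described for `wait_for_pattern` could look roughly like this. It is a minimal sketch; `read_surface` is a hypothetical callable standing in for "render the grid" or "render the scrollback" depending on `scope`, and returning the matched text as `snippet` is an assumption, since the spec does not say what the snippet contains.

```python
import re
import time


def wait_for_pattern(read_surface, pattern, timeout_seconds, poll_interval=0.05):
    """Poll read_surface() until pattern matches or the timeout expires.

    read_surface: hypothetical callable returning the current text of the
    chosen surface (grid or scrollback).
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout_seconds
    while True:
        match = regex.search(read_surface())
        if match:
            return {"matched": True, "snippet": match.group(0)}
        if time.monotonic() >= deadline:
            return {"matched": False}
        time.sleep(poll_interval)
```

Using a monotonic clock for the deadline keeps the timeout stable if the wall clock is adjusted while the call is blocked.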
+#### `get_process_ports`
+- **Args:** `process_id`.
+- **Returns:** `{ ports: [{ port, url?, first_seen_at }] }`. Best-effort: patterm watches the stream for `:NNNN` and `http://…` patterns and reports what it has seen. No probing.
+
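Review note: a best-effort port watcher in the spirit of `get_process_ports` might be sketched as below. The regular expressions are assumptions, not the spec's patterns: bare ports are restricted to four or five digits to cut down on false positives such as timestamps, and `scan_ports` is a hypothetical name.

```python
import re

# Assumed patterns; the spec only says ":NNNN" and "http://…".
URL_RE = re.compile(r"https?://[^\s:/]+:(\d{2,5})\S*")
PORT_RE = re.compile(r":(\d{4,5})\b")


def scan_ports(chunk: str, seen: dict, now: float) -> None:
    """Record ports mentioned in an output chunk, keeping the first sighting."""
    for m in URL_RE.finditer(chunk):
        port = int(m.group(1))
        if 0 < port <= 65535:
            seen.setdefault(port, {"port": port, "url": m.group(0), "first_seen_at": now})
    for m in PORT_RE.finditer(chunk):
        port = int(m.group(1))
        if 0 < port <= 65535:
            seen.setdefault(port, {"port": port, "first_seen_at": now})
```

URLs are scanned first so that a port seen inside a URL keeps its `url` field; `setdefault` preserves the earliest `first_seen_at`. A match split across two chunks would be missed, which is in keeping with the best-effort framing above.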
+### I/O
+
+#### `send_input`
+- **Args:** `process_id`, `kind` (`text` | `paste` | `key`), and:
+  - For `text`: `text` (string), `submit?` (default `true` — appends Enter).
+  - For `paste`: `text` (string). Sent via bracketed paste (`\e[200~ … \e[201~`) when the target's emulator state indicates support; otherwise falls back to chunked text writes without trailing newline.
+  - For `key`: `key` (one of `enter`, `tab`, `escape`, `backspace`, `ctrl-c`, `ctrl-d`, `up`, `down`, `left`, `right`, `home`, `end`, `page-up`, `page-down`, `f1`…`f12`). Encoded via the emulator's key-encoding (Kitty keyboard protocol where negotiated, legacy escapes otherwise).
+- **Optional tail:** `wait_ms?` (default `0`), `tail_mode?` (`none` | `stream` | `grid`, default `stream` when `wait_ms > 0`). When `wait_ms > 0`, the call blocks for that many milliseconds after sending and then returns the tail in the chosen mode.
+- **Returns:** `{ ok: true, tail?: { content, mode, new_offset?, active_screen, idle_ms, screen_version } }`.
+
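Review note: for readers unfamiliar with the byte-level encodings behind `send_input`, here is a sketch of the legacy (non-Kitty) path only. It assumes normal cursor-key mode (application mode would send `ESC O A` style arrows instead), sends Enter as carriage return, omits `f1`…`f12`, and does not chunk the non-bracketed paste fallback. `encode_input` and `LEGACY_KEYS` are illustrative names, not part of the spec.

```python
# Legacy xterm-style encodings in normal cursor-key mode (assumption).
LEGACY_KEYS = {
    "enter": b"\r",
    "tab": b"\t",
    "escape": b"\x1b",
    "backspace": b"\x7f",
    "ctrl-c": b"\x03",
    "ctrl-d": b"\x04",
    "up": b"\x1b[A",
    "down": b"\x1b[B",
    "right": b"\x1b[C",
    "left": b"\x1b[D",
    "home": b"\x1b[H",
    "end": b"\x1b[F",
    "page-up": b"\x1b[5~",
    "page-down": b"\x1b[6~",
}


def encode_input(kind, text="", key="", submit=True, bracketed_paste=False):
    """Return the bytes to write to the PTY master for one send_input call."""
    if kind == "text":
        return text.encode("utf-8") + (b"\r" if submit else b"")
    if kind == "paste":
        data = text.encode("utf-8")
        if bracketed_paste:
            # Wrap so the target treats embedded newlines as pasted text.
            return b"\x1b[200~" + data + b"\x1b[201~"
        return data  # fallback: plain bytes, no trailing newline
    if kind == "key":
        return LEGACY_KEYS[key]
    raise ValueError(f"unknown kind: {kind}")
```

Note that a bare `y` is not in the `key` list, so single-character confirmations go through `kind="text"`.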
+### Coordination
+
+#### `send_message`
+- **Args:** `target_process_id`, `message` (string).
+- **Behaviour:** Delivers a tagged message into the target's PTY. Direction is inferred from the relationship between caller and target:
+  - parent → child: prepended with `[orchestrator] `.
+  - child → parent: prepended with `[sub-agent:<caller_name>] `.
+- **Returns:** `ok`.
+- **Errors:** `not_related` if the target is neither the caller's parent nor a child of the caller (siblings must route through the parent in v1).
+
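Review note: the direction inference in `send_message` reduces to two parent-pointer comparisons. The sketch below assumes each entry is a dict with the `process_id`, `name` and `parent_process_id` fields that `list_processes` returns; the function name and the use of `LookupError` to signal `not_related` are illustrative choices.

```python
def tag_message(caller: dict, target: dict, message: str) -> str:
    """Prefix message according to the caller/target relationship."""
    if target.get("parent_process_id") == caller["process_id"]:
        return f"[orchestrator] {message}"          # parent -> child
    if caller.get("parent_process_id") == target["process_id"]:
        return f"[sub-agent:{caller['name']}] {message}"  # child -> parent
    raise LookupError("not_related")                # e.g. siblings in v1
```

With a two-level tree, a sibling-to-sibling call falls through both checks and is rejected, which matches the rule that siblings route through the parent.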
+#### `request_human_attention`
+- **Args:** `process_id`, `reason` (string).
+- **Behaviour:** Notification in the TUI, blinks the sidebar entry, optionally auto-focuses per user setting. The escape hatch when the orchestrator can't safely decide.
+- **Returns:** `ok`.
+
 #### `timer_wait`
-- **Args:** `seconds`, `label?` (default auto-generated)
-- **Behaviour:** Returns immediately with a `timer_id`. After `seconds`, the tool injects `[system] Your timer [<label>] has completed.\n` into the calling agent's pane.
-- **Returns:** `{ timer_id: string }`.
+- **Args:** `seconds`, `label?`.
+- **Behaviour:** Returns a `timer_id` immediately. After `seconds`, injects `[system] Your timer [<label>] has completed.\n` into the caller's pane.
+- **Returns:** `{ timer_id }`.

-#### `list_children`
-- **Args:** none
-- **Returns:** Array of `{ child_id, name, type, status, exit_code? }` for the calling agent's session.
+### Scratchpads
+
+All scratchpad reads return a `revision` token (an opaque short hash of the file contents at read time). Writes may optionally supply `expected_revision` for last-write-wins-with-detection; mismatches return `{ ok: false, current_revision }` without writing, so the caller can re-read and merge.

 #### `scratchpad_list`
-- **Returns:** Array of `{ name, size, modified_at }`.
+- **Returns:** `[{ name, size, modified_at, revision }]`.

 #### `scratchpad_read`
-- **Args:** `name`
-- **Returns:** `{ content: string }`.
+- **Args:** `name`.
+- **Returns:** `{ content, revision }`.

 #### `scratchpad_write`
-- **Args:** `name`, `content` (full replacement)
-- **Returns:** `ok`.
+- **Args:** `name`, `content`, `expected_revision?`.
+- **Returns:** `{ ok: true, revision } | { ok: false, current_revision }`.

 #### `scratchpad_append`
-- **Args:** `name`, `content`
-- **Returns:** `ok`.
+- **Args:** `name`, `content`.
+- **Returns:** `{ ok: true, revision }`. Appends are unconditional — concurrent appends interleave at write time but never lose data.
+
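Review note: the revision protocol for scratchpads can be modelled in a few lines. This in-memory sketch assumes the revision is a truncated SHA-256 of the contents (the spec says only "an opaque short hash"); the `Scratchpad` class is hypothetical and ignores file I/O and locking.

```python
import hashlib


class Scratchpad:
    """In-memory model of one scratchpad with revision-checked writes."""

    def __init__(self, content: str = ""):
        self.content = content

    @property
    def revision(self) -> str:
        # Assumption: short content hash; callers must treat it as opaque.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()[:12]

    def read(self) -> dict:
        return {"content": self.content, "revision": self.revision}

    def write(self, content: str, expected_revision=None) -> dict:
        """Full replacement; refuses if the caller's revision is stale."""
        if expected_revision is not None and expected_revision != self.revision:
            return {"ok": False, "current_revision": self.revision}
        self.content = content
        return {"ok": True, "revision": self.revision}

    def append(self, content: str) -> dict:
        """Unconditional append; never checks a revision."""
        self.content += content
        return {"ok": True, "revision": self.revision}
```

A caller that gets `ok: false` re-reads, merges, and retries with the returned `current_revision`. Omitting `expected_revision` keeps the plain last-write-wins behaviour.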
+### Meta
+
+#### `whoami`
+- **Args:** none.
+- **Returns:** `{ process_id, name, role, parent_process_id?, project: { path, key }, available_tools: [string] }`. The `available_tools` field is the authoritative answer to "what can I call from here" — agents should consult it rather than guessing from their training distribution.
+
+#### `help`
+- **Args:** `topic?` (string).
+- **Behaviour:** Returns topic-specific guidance for the caller's role. With no argument, returns the list of topics plus a one-line orientation. Topics in v1: `spawning`, `inspection`, `io`, `coordination`, `scratchpads`, `timers`, `readiness`, `permissions`, `conventions`, `topics`. The `conventions` topic documents the `[orchestrator]` / `[sub-agent:<name>]` / `[system]` tag protocol so a sub-agent that wasn't briefed by its parent can still learn it.
+- **Returns:** `{ topic, content, related_tools: [string] }`.

 ---

@@ -248,14 +332,16 @@ patterm does **not** inject any framing or system-prompt text into spawned agent

 That said, when patterm relays messages programmatically between agents or surfaces lifecycle events, it tags them so the receiving agent can distinguish sources. These tags are the patterm convention; agents will encounter them in their input and are expected to recognize them from context (or because their parent explained them in the initial prompt).

-- `[orchestrator] <msg>` — prepended when `send_message_to` delivers a message from a parent to a child.
-- `[sub-agent:<name>] <msg>` — prepended when `report_to_parent` delivers a message from a child to its parent.
+- `[orchestrator] <msg>` — prepended when `send_message` delivers a message from a parent to a child.
+- `[sub-agent:<name>] <msg>` — prepended when `send_message` delivers a message from a child to its parent.
 - `[system] <msg>` — patterm itself (timer fires, child exited, etc.).
 - Direct user typing is **not** prefixed. The user sees the pane and types normally; the agent receives the keystrokes as-is.

-No "ready" handshake. patterm treats the agent as ready once its PTY hits the preset's `ready_signal` (default: 1s idle after launch — see §10). The very first thing the agent receives after that point is whatever the caller passed as `initial_prompt`.
+Agents that weren't briefed by their parent can self-discover their role, parent, project, and the tag conventions by calling `whoami` and `help('conventions')` (§7). This is the supported substitute for the SPEC having no system-prompt injection — the conventions live in the tool surface, not in an injected preamble.

-Two-level tree only. Sub-agents cannot call `spawn_agent`.
+No "ready" handshake. patterm treats the agent as ready once its PTY hits the preset's `ready_signal` (default: 1s idle after launch — see §10). The very first thing the agent receives after that point is whatever the caller passed as `agent_instructions`.
+
+Two-level tree only. Sub-agents cannot call `spawn_agent` — the call returns a `role_forbidden` error that explains the rule and points at vendor-native subagent tooling.

 ---

@@ -265,11 +351,11 @@ Sub-agents are launched with vendor permissions **on** — the orchestrator driv

 Loop:

-1. Orchestrator sends a message to a sub-agent via `send_message_to`.
+1. Orchestrator sends a message to a sub-agent via `send_message`.
 2. Sub-agent runs, eventually hits a tool-use confirmation in its TUI ("Allow Bash(rm -rf foo)? [y/N]").
-3. Sub-agent goes idle (cursor stops animating, no byte writes for 1s).
-4. Orchestrator's loop calls `read_output(child_id, mode="grid")`, sees the prompt, decides, and calls `send_input(child_id, "y")` or `"n"`.
-5. If the orchestrator can't safely decide, it calls `request_human_attention(child_id, "Sub-agent wants to run X, looks destructive, need your call")`. The orchestrator then waits (using `wait_for_pattern` or repeated reads) until the prompt is no longer on screen.
+3. Sub-agent goes idle (cursor stops animating, no byte writes for 1s — exposed as `idle_ms` on `get_process_status` / `list_processes`).
+4. Orchestrator's loop calls `get_process_output(process_id, mode="grid")`, sees the prompt, decides, and calls `send_input(process_id, kind="key", key="y")` or `"n"` (or `kind="text"` with `text="y"`, `submit=true`).
+5. If the orchestrator can't safely decide, it calls `request_human_attention(process_id, "Sub-agent wants to run X, looks destructive, need your call")`. The orchestrator then waits (using `wait_for_pattern` or repeated reads) until the prompt is no longer on screen.

 Risks acknowledged: the orchestrator's reading of the prompt is a vision/parsing problem on rendered text. We trust a SOTA model to handle this correctly. The `request_human_attention` punt is the safety valve.

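Review note: steps 4 and 5 of the loop can be sketched as one pass over an idle sub-agent. Everything here is illustrative. In the spec the decision is made by the orchestrator model reading the grid, not by a keyword list; `PROMPT_RE`, `DESTRUCTIVE`, `decide` and `handle_idle` are hypothetical, and `client` stands for any object exposing the three renamed tools as methods. The sketch answers with `kind="text"` because `y` does not appear in the `key` list in §7.

```python
import re

# Assumed prompt shape, taken from the example in step 2.
PROMPT_RE = re.compile(r"Allow (\w+)\((.*)\)\? \[y/N\]")
# Toy policy for the sketch only; a real orchestrator reasons about the prompt.
DESTRUCTIVE = ("rm -rf", "git push --force", "drop table")


def decide(grid: str):
    """Return (action, command) where action is none, approve or escalate."""
    match = PROMPT_RE.search(grid)
    if not match:
        return ("none", "")
    command = match.group(2)
    if any(marker in command.lower() for marker in DESTRUCTIVE):
        return ("escalate", command)
    return ("approve", command)


def handle_idle(client, process_id: str) -> str:
    """One loop pass: read the grid, then answer the prompt or punt to the human."""
    grid = client.get_process_output(process_id, mode="grid")["content"]
    action, command = decide(grid)
    if action == "approve":
        client.send_input(process_id, kind="text", text="y", submit=True)
    elif action == "escalate":
        client.request_human_attention(
            process_id, f"Sub-agent wants to run {command}, need your call"
        )
    return action
```

After an escalation the orchestrator would block on `wait_for_pattern` or repeated reads until the prompt leaves the screen, as step 5 describes.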
@@ -323,7 +409,7 @@ Caveats and mitigations:

 - LLM provider hiccups can cause >1s gaps mid-stream. Per-agent tuning of the idle threshold is allowed in the preset.
 - Orchestrators should treat idle as a signal to *read*, not as a guarantee of completion. If the read returns something ambiguous, they can `wait_for_pattern` with a known terminal marker (e.g. the agent's input prompt) for stronger evidence.
-- The tool exposes idle state via `list_children` so orchestrators don't need to poll byte streams directly.
+- The tool exposes idle state via `list_processes` / `get_process_status` so orchestrators don't need to poll byte streams directly.

 ---

@@ -331,7 +417,7 @@ Caveats and mitigations:

 | Failure | Behaviour |
 |---|---|
-| Sub-agent process exits unexpectedly | Sidebar marks child as exited, exit code preserved. Orchestrator's next `read_output` returns final grid + exit metadata. |
+| Sub-agent process exits unexpectedly | Sidebar marks child as exited, exit code preserved. Orchestrator's next `get_process_output` returns final grid + exit metadata. |
 | Vendor CLI hangs without exiting | Looks idle. Orchestrator must use `wait_for_pattern` or `request_human_attention` to escape. |
 | Tool process crashes | All PTYs are children of the tool's process group; OS cleans them up (process-group SIGHUP on terminal close, PTY master close, parent-death signal on Linux). On macOS treat cleanup as best-effort; scratchpads on disk survive. |
 | User closes the terminal window / SSH drops | Process receives SIGHUP, cascades SIGTERM → SIGKILL to every child, exits. Everything inside the tool dies with it. This is the intended model. |
@@ -361,7 +447,8 @@ Caveats and mitigations:

 - **Vt emulator library.** Resolved in the closing note — `libghostty-vt` is the bet, with `vt10x` / `charmbracelet/x/vt` as fallback only.
 - **MCP transport.** Resolved — in-process MCP core with a `mcp-stdio` proxy subcommand for spawned children (see §7 and §10). Streamable HTTP can be added later.
-- **Scratchpad concurrency.** Two agents writing the same scratchpad: last-write-wins with a revision token (see addendum item 7 in the closing note). Agents are expected to coordinate.
+- **Scratchpad concurrency.** Resolved — `scratchpad_read` / `scratchpad_write` carry an opaque `revision` token; writes may supply `expected_revision` for optimistic last-write-wins (see §7). Appends are unconditional.
+- **Cross-restart trust persistence for command presets.** Resolved — trust state is persisted to disk (see §3) so the user doesn't re-confirm every patterm run. Open: whether trust should be tied to the preset *contents* (hash) so editing a trusted preset re-triggers confirmation. v1 keys trust by preset name; v2 may upgrade to content-hashed trust.
 - **Default presets that ship in the box.** claude / codex / opencode is the working set; trimming to two for the first cut is fine since presets are user-editable anyway.
 - **Per-project preset overrides.** v1 has a single global preset directory. Whether `./.patterm/presets/` should override per-project is a v2 question.

@@ -372,9 +459,9 @@ Caveats and mitigations:
 1. Single-process skeleton: TUI bootstraps, owns the terminal, handles SIGWINCH / SIGHUP / SIGTERM, exits cleanly.
 2. Single PTY per session + vt emulator + tab bar UI + basic input/render.
 3. Multi-session, multi-child (sidebar) with raw process spawning, process groups, kill cascade on exit (no MCP yet).
-4. In-process MCP server + `mcp-stdio` proxy subcommand + per-PID unix socket + `spawn_process` / `read_output` / `send_input` / `kill` / `wait_for_pattern`.
-5. `spawn_agent` preset for one agent (probably claude), conversation tag conventions, `initial_prompt` injection (typed into the TUI input after ready).
-6. Scratchpads, `timer_wait`, `request_human_attention`, `send_message_to`, `report_to_parent`.
+4. In-process MCP server + `mcp-stdio` proxy subcommand + per-PID unix socket + `spawn_process` / `get_process_output` / `send_input` / `stop_process` / `wait_for_pattern` / `list_processes` / `whoami` / `help`.
+5. `spawn_agent` for one agent (probably claude), conversation tag conventions, `agent_instructions` injection (typed into the TUI input after ready).
+6. Scratchpads (with revisions), `timer_wait`, `request_human_attention`, `send_message`.
 7. Second and third agent presets, chrome-trim heuristics.
 8. Polish: command palette, status indicators, error UX.