docs: add daemon client implementation plan

2026-05-27 13:25:59 +01:00
parent ec0c148164
commit 08c7405c79

docs/daemon-client-plan.md Normal file

@@ -0,0 +1,266 @@
# patterm: persistent daemon + thin networked client — implementation plan
Status: proposed (for peer review). Branch: `feat/daemon-client-split`.
## Goal
Turn patterm from a single foreground process into a persistent background
**daemon** that owns all process/project state, plus a thin **client** that
renders and forwards input. A client on another LAN device can attach,
navigate projects via the command palette, detach, and reconnect — with child
processes surviving across client disconnects.
## Locked decisions
1. **Scope:** build all phases; land as one PR off this branch.
2. **Remote access:** human UI clients only. MCP for agents stays local
(per-daemon unix socket); no remote MCP transport in this work.
3. **Multi-client = per-client independent view.** The daemon holds pure
process/project state. Each client connection owns a `ClientView`
(selected project, focused pane/pad, scroll offset, palette state,
terminal size). Two clients may sit on different projects at once.
4. **Daemon lifecycle:** auto-start on demand (tmux/docker model). `patterm`
starts the daemon if absent and attaches; `patterm daemon stop|ls` manage it.
5. **Durability:** "persistent" = survive client disconnect while the daemon
process lives. Daemon restart only rehydrates today's persist model
(top-level commands, fresh IDs). No attempt to resurrect live PTYs/agents
after daemon death.
6. **Auth (trusted-network stance):** Harry runs this on a trusted LAN and is
fine with LAN exposure. Keep it lightweight: localhost default, opt-in LAN
bind (`--listen`), a simple pairing/bearer token to prevent accidental
drive-by access. TLS/cert-pinning is NOT required now but the transport must
stay pluggable so TLS can be layered in later.
7. **Detach gesture:** explicit detach via a palette command and/or a dedicated
host chord. Ctrl-D stays as PTY input (shell EOF), as today. Quit-project and
stop-daemon are explicit actions.
## Current architecture (baseline facts — verify before editing)
- `app.Run` (`internal/app/app.go:49`) wires the entire process: presets,
settings, scratchpad/trust/persist stores, in-process MCP server, ONE
`Session`, the `uiState` TUI, classifier, SIGWINCH, 60Hz chrome ticker,
blocking `stdinLoop`.
- **The seam:** `ChildEventListener` (`internal/app/session.go:83`) —
`OnChildSpawned`/`OnChildExited`/`OnPTYOut`/`OnChildStateChanged`/
`OnChildClosed`. Today `uiState` is the only real listener (subscribed at
`app.go:198`). A remote client = a serialized listener + reverse command
channel.
- One `Session` (`session.go:28`) holds a flat `children map[string]*Child` +
`order`. Tabs are derived: `KindAgent` children with `ParentID==""`
(`tree.go` `runningTopLevels`). The whole tree is reconstructed from
`Child.ParentID`.
- `Child` (`child.go:72`) owns `*pty.PTY`, `*vt.GhosttyEmulator`, raw ring,
status/owner atomics. Lifecycle: `Session.Spawn` (`session.go:222`) →
`startPTY` → `pumpChild` (`session.go:423`, PTY→emulator→ring→`emitPTYOut`)
+ `reapChild` (`session.go:488`, exit→`killDescendantsOf`).
- Stores already keyed by projectKey on `Open`
(`scratchpad`/`trust`/`persist`); `projectkey.Key(dir)` =
`sha256(realpath)[:16]`.
- `SerializeChild` (`session.go:687`) already yields a full VT snapshot for
stateless repaint.
- Rendering writes ANSI to `os.Stdout` under `outMu`; `viewportRenderer`
(`internal/app/viewport_renderer.go`) is a stateful ANSI rewriter confining
child output to the viewport. Input: raw `os.Stdin` via `stdinLoop`
(`app.go:1433`)/`processStdin`.
- MCP: in-process `Server` (`internal/mcp/mcp.go:26`), newline-JSON over a
per-PID unix socket `$XDG_RUNTIME_DIR/patterm/<pid>.sock`. Agents launch
`patterm mcp-stdio --socket S --identity T`. Identity → `callerID` via
`host.ResolveCallerIdentity` → `Session.FindChildByIdentity`.
- **No TCP/TLS anywhere today.** All `net.Listen`/`net.Dial` are unix sockets.
- **Must-fix:** `pty.Start` (`internal/pty/pty.go:26`) does not set `cmd.Dir`;
today the process `os.Chdir`s once. A daemon can't chdir globally, so
`SpawnSpec.WorkDir` must propagate to `exec.Cmd.Dir`.
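
The must-fix above could look roughly like the sketch below. It assumes a trimmed, hypothetical `SpawnSpec` (only `WorkDir` is named in this plan; `Argv` and `buildCmd` are illustrative) and shows just the working-directory part, not the real `pty.Start` signature.

```go
package main

import (
	"errors"
	"os/exec"
)

// SpawnSpec is a trimmed, hypothetical stand-in for the real spawn spec.
// Only WorkDir is named in the plan; Argv is an assumption for illustration.
type SpawnSpec struct {
	Argv    []string
	WorkDir string // project directory the child should start in
}

// buildCmd prepares the child command without touching the daemon's own cwd.
func buildCmd(spec SpawnSpec) (*exec.Cmd, error) {
	if len(spec.Argv) == 0 {
		return nil, errors.New("spawn spec has no command")
	}
	cmd := exec.Command(spec.Argv[0], spec.Argv[1:]...)
	// A per-child working directory replaces the one-time global os.Chdir.
	cmd.Dir = spec.WorkDir
	return cmd, nil
}
```

With this shape, two projects in different directories can spawn children concurrently, since no process-wide state is changed.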
## Target component model
| Component | Owns |
|---|---|
| `internal/daemon` (`pattermd`) | Project registry (N `Session`s), all PTYs, emulators, MCP server, per-project stores, classifier, timers. No TTY. |
| `internal/client` (`patterm`) | Real terminal: raw mode, alt-screen, SIGWINCH, stdin/stdout; `uiState`, `viewportRenderer`, chrome draws, palette, input. Holds `ClientView`. |
| `internal/transport` | `Transport` interface + framing; loopback, unix, TCP impls (TLS layered in later); auth handshake. |
| `internal/protocol` | Wire message types shared by daemon + client. |
### `Transport` interface (migration linchpin)
```go
type Transport interface {
	Send(Frame) error // client→daemon command, or daemon→client push
	Recv() (Frame, error)
	Close() error
}
```
- **Loopback impl:** in-process channels, zero serialization. Default
`patterm` = client + loopback daemon in one process → today's UX preserved
exactly, single binary.
- **Net impl:** framed JSON-per-line over `net.Conn`, reusing the
`mcp.go:handleConn` pattern; unix socket first, then TCP (TLS later, per decision #6).
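
A minimal sketch of the loopback implementation, under stated assumptions: `Frame` is a placeholder (the plan leaves the real type to `internal/protocol`), and `NewLoopbackPair`, `loopback`, and `ErrClosed` are hypothetical names. Frames pass by value over channels, so nothing is serialized.

```go
package main

import (
	"errors"
	"sync"
)

// Frame is a placeholder for the real wire frame type.
type Frame struct {
	Type    string
	Payload []byte
}

// Transport repeats the interface from the plan so the sketch compiles alone.
type Transport interface {
	Send(Frame) error
	Recv() (Frame, error)
	Close() error
}

var ErrClosed = errors.New("transport closed")

// loopback is one end of an in-process pair.
type loopback struct {
	in   <-chan Frame
	out  chan<- Frame
	done chan struct{} // shared by both ends, closed exactly once
	once *sync.Once
}

var _ Transport = (*loopback)(nil)

// NewLoopbackPair returns two connected ends: what one sends, the other receives.
func NewLoopbackPair(buf int) (client, daemon *loopback) {
	a, b := make(chan Frame, buf), make(chan Frame, buf)
	done, once := make(chan struct{}), new(sync.Once)
	client = &loopback{in: b, out: a, done: done, once: once}
	daemon = &loopback{in: a, out: b, done: done, once: once}
	return client, daemon
}

func (l *loopback) Send(f Frame) error {
	select {
	case <-l.done: // check first so Send after Close fails deterministically
		return ErrClosed
	default:
	}
	select {
	case l.out <- f:
		return nil
	case <-l.done:
		return ErrClosed
	}
}

func (l *loopback) Recv() (Frame, error) {
	select {
	case f := <-l.in:
		return f, nil
	case <-l.done:
		return Frame{}, ErrClosed
	}
}

// Close shuts down both ends; frames still buffered may be lost.
func (l *loopback) Close() error {
	l.once.Do(func() { close(l.done) })
	return nil
}
```

Because both ends satisfy the same `Transport` interface as a network implementation would, the client and daemon code paths need not know which one they were handed.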
### Per-client state vs daemon state
```go
// daemon-side, pure process/project state
type Registry struct {
	projects map[string]*Project // key = projectKey
}

type Project struct {
	Key, Dir, Name string
	Session        *Session
	Pads           *scratchpad.Store
	Trust          *trust.Store
	Persist        *persist.Store
	Launcher       *Launcher
	Host           *ToolHost
}

// per-connection, client-owned view state (lives client-side; daemon tracks
// only what it must to size emulators + route subscriptions)
type ClientView struct {
	ID         string
	ProjectKey string // which project this client is looking at
	FocusedID  string // pane (Child) or pad
	ScrollOff  int
	Cols, Rows uint16
	// palette state is fully client-local
}
```
Project switch = re-point this client's subscription to another `Project`'s
Session + send `chrome` + `pane_snapshot`. No process teardown.
### Wire protocol (control + UI channel)
Bidirectional framed JSON-per-line.
Daemon → client:
- `hello` / `auth_challenge` / `auth_ok` — handshake.
- `project_list` → `[{key, path, name, last_active, tab_count}]` for the
palette switcher.
- `chrome` — semantic model for the client's current project+view: tab list
(`runningTopLevels`), sidebar tree (`sidebarNav`), status/owner, toasts,
scratchpad list + selected preview. Client draws chrome locally
(reuses `tabbar.go`/`sidebar.go`).
- `pane_snapshot{paneID, vtBytes}` — full repaint on focus/attach/switch via
`SerializeChild`.
- `pane_chunk{paneID, bytes}` — live focused-pane PTY output (serialized
`OnPTYOut`).
- `lifecycle{spawned|exited|closed|stateChanged,...}` — serialized listener.
- `attention` / `trust_prompt` — human-facing surfaces; render on the client
whose view owns the relevant project.
Client → daemon:
- `attach{token, term_size, project_key?}` / `detach`.
- `input{paneID, bytes}` (the `InjectAsUser` path).
- `focus{paneID|pad}`, `switch_project{key}`, `open_project{path}`.
- `palette_command{...}` (spawn/kill/rename/quit-project), `trust_response`,
`resize{cols,rows}`.
**Encoding decision:** ship raw focused-pane PTY bytes + periodic
`SerializeChild` snapshots; client runs its own `viewportRenderer`. No
daemon-side pre-render (keeps daemon size-agnostic), no grid diffs in v1.
Requires in-order delivery only (TCP gives it). Diffs are a later optimization.
### Emulator sizing with per-client views
Each `Child` emulator has one size. Rules:
- A pane is sized by the client(s) viewing it. If exactly one client focuses a
pane, that client's cols/rows drive `ResizeAll` for that pane.
- If two clients focus the **same** pane, one is the **display owner** (first
to focus, or explicit take-control); the owner's size drives the emulator;
the other letterboxes/clips. Surface a toast.
- Because clients are usually on different projects/panes, contention is rare.
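
The sizing rules above can be sketched as a small per-pane tracker. Everything here (`paneViewers`, `Size`, the method names) is hypothetical; the sketch only models "first to focus owns the size, explicit take-control overrides", not the actual resize call or the letterbox rendering.

```go
package main

// Size is a terminal size in cells.
type Size struct{ Cols, Rows uint16 }

// paneViewers tracks which clients focus one pane, oldest focus first.
// order[0], when present, is the display owner.
type paneViewers struct {
	order []string
	sizes map[string]Size
}

func newPaneViewers() *paneViewers {
	return &paneViewers{sizes: map[string]Size{}}
}

// Focus registers or refreshes a viewer and returns the size the emulator
// should use, which is always the display owner's size.
func (p *paneViewers) Focus(clientID string, s Size) Size {
	if _, ok := p.sizes[clientID]; !ok {
		p.order = append(p.order, clientID)
	}
	p.sizes[clientID] = s
	return p.EmulatorSize()
}

// Blur removes a viewer; ownership falls to the next-oldest viewer.
func (p *paneViewers) Blur(clientID string) {
	delete(p.sizes, clientID)
	for i, id := range p.order {
		if id == clientID {
			p.order = append(p.order[:i], p.order[i+1:]...)
			return
		}
	}
}

// TakeControl makes an existing viewer the display owner.
func (p *paneViewers) TakeControl(clientID string) {
	s, ok := p.sizes[clientID]
	if !ok {
		return
	}
	p.Blur(clientID)
	p.order = append([]string{clientID}, p.order...)
	p.sizes[clientID] = s
}

// Owner reports the current display owner, if any.
func (p *paneViewers) Owner() (string, bool) {
	if len(p.order) == 0 {
		return "", false
	}
	return p.order[0], true
}

// EmulatorSize is the owner's size, or the zero Size with no viewers.
func (p *paneViewers) EmulatorSize() Size {
	if len(p.order) == 0 {
		return Size{}
	}
	return p.sizes[p.order[0]]
}
```

A non-owner client would compare its own size with `EmulatorSize()` and letterbox or clip locally; the daemon only needs to resize the emulator when the returned size changes.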
### Security (human clients, LAN — trusted-network stance)
Harry runs this on a trusted LAN (decision #6). Keep it lightweight but not
wide open:
- localhost-only by default. LAN bind (`--listen 0.0.0.0:PORT`) is explicit
opt-in, never default.
- A simple pairing/bearer token gates network attach so a stray host on the LAN
can't drive-by-attach. Daemon prints the token on `--listen`; client presents
it in `attach`; store a per-client token after first pairing.
- Local unix-socket clients keep `0600` perms (sufficient for same-user).
- Keep the transport pluggable so TLS + cert pinning can be layered in later
without reworking the protocol. Not building TLS now.
- Trust prompts may now be approved from another device — deliberate; route to
the client whose view owns the project.
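
A minimal sketch of the pairing-token check, assuming a random hex token and a constant-time comparison. `NewPairingToken` and `TokenOK` are hypothetical names; how the token is printed, stored per client, or rotated is left open here, as in the plan.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
)

// NewPairingToken returns 32 random bytes, hex-encoded (64 characters).
func NewPairingToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// TokenOK reports whether the presented token matches the expected one.
// The comparison is constant-time for equal-length inputs, and an unset
// expected token never authorizes anything.
func TokenOK(presented, expected string) bool {
	if expected == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(presented), []byte(expected)) == 1
}
```

The daemon would call `TokenOK` on the `token` field of `attach` for network transports and could skip it for the `0600` unix socket, matching the bullets above.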
### Daemon lifecycle (auto-start)
- Well-known local socket `$XDG_RUNTIME_DIR/patterm/daemon.sock` +
pidfile/lockfile (single daemon per user).
- `patterm [dir]`: dial the socket; if absent, fork-exec the daemon, wait for
readiness, attach. `--project`/dir selects the initial project for the view.
- `patterm daemon` (foreground), `patterm daemon stop`, `patterm daemon ls`.
- **Detach = explicit** palette command and/or a dedicated host chord; PTYs keep
running. Ctrl-D stays as PTY input (shell EOF). Quitting a project / killing
the daemon are explicit palette/CLI actions.
- Idle-shutdown policy: configurable; default keep alive until explicit stop.
## Package-by-package changes
- **`cmd/patterm`** (`main.go`): add `daemon` subcommand (headless core);
default invocation becomes client (auto-start/attach); `mcp-stdio` dials the
shared daemon socket (not per-PID); `debug-harness` drives a daemon (or
loopback).
- **`internal/app` split:**
- new **`internal/daemon`**: headless half — move `session.go`, `child.go`,
`host.go`, `tree.go`, `launch.go`, classifier, timers, `Shutdown`,
kill-cascade. Add `Registry`/`Project`.
- **`internal/client`**: TTY half — `uiState`, `viewport_renderer.go`,
`screen_renderer.go`, `tabbar.go`, `sidebar.go`, status, `palette.go`,
`stdinLoop`/`processStdin`, SIGWINCH/chrome ticker, markdown/marquee/toast.
Consumes events + chrome over `Transport` instead of `sess.Subscribe`.
- **new `internal/transport` + `internal/protocol`**: messages, framing,
loopback/unix/TCP impls (TLS deferred), auth handshake.
- **`internal/mcp`**: `SocketPath` per-daemon (not per-PID);
`ResolveCallerIdentity` becomes daemon-wide across projects (token already
carries `PATTERM_PROJECT_KEY` via `ChildEnv`).
- **`internal/pty`**: set `cmd.Dir` from `SpawnSpec.WorkDir`; add process-group
handling for reliable tree teardown.
- **`internal/vt`**: unchanged grid source of truth; enforce per-child
serialization around emulator access (interface isn't concurrency-safe) since
clients + MCP + pump all snapshot.
- **`internal/{scratchpad,trust,persist}`**: per-`Project` instances in the
registry (already keyed by projectKey).
- **`internal/preset`**: project-agnostic; daemon loads once, shares.
- **`internal/projectkey`**: doc update (key is now load-bearing for routing).
- **`internal/harness`**: add daemon/loopback mode; assert child survives client
disconnect/reconnect, project-switch preserves each project's tree, two
clients on different projects, unauth TCP rejected.
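
Since the project key becomes the routing key for the registry, stores, and MCP identity, it may help to pin down the documented rule `sha256(realpath)[:16]`. The sketch below is one reading of it, not the existing `projectkey` code: it assumes the slice is taken over the hex digest (16 hex characters) rather than over raw bytes, and that "realpath" means absolute path with symlinks resolved.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"path/filepath"
)

// Key derives a stable project key from a directory.
// Assumption: the first 16 hex characters of the SHA-256 of the resolved path.
func Key(dir string) (string, error) {
	abs, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	// Resolve symlinks so two spellings of one directory share a key.
	resolved, err := filepath.EvalSymlinks(abs)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256([]byte(resolved))
	return hex.EncodeToString(sum[:])[:16], nil
}
```

Under this reading, `patterm .` and `patterm /abs/path/to/project` from different clients resolve to the same `Project` in the registry.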
## Backpressure
`pumpChild`'s listener calls are synchronous (`session.go:149`). A slow network
client must not block the PTY pump. Introduce a per-client event bus with a
bounded buffer that coalesces events or drops to a snapshot under pressure, decoupled
from `pumpChild`.
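
A sketch of such a bus, under stated assumptions: `clientBus`, `Event`, and the `NeedSnapshot` marker are hypothetical, and the overflow policy shown (replace the whole queue with one marker per affected pane) is just one way to "drop to a snapshot".

```go
package main

import "sync"

// Event is a hypothetical bus item: a live chunk, or a marker telling the
// client to discard pane state and expect a fresh snapshot.
type Event struct {
	PaneID       string
	Bytes        []byte
	NeedSnapshot bool
}

// clientBus decouples the PTY pump from one client's connection.
type clientBus struct {
	mu    sync.Mutex
	queue []Event
	limit int
	wake  chan struct{} // capacity 1; nudges the writer goroutine
}

func newClientBus(limit int) *clientBus {
	return &clientBus{limit: limit, wake: make(chan struct{}, 1)}
}

// Publish never blocks the caller. When the queue is full, queued events are
// collapsed into one NeedSnapshot marker per pane.
func (b *clientBus) Publish(ev Event) {
	b.mu.Lock()
	if len(b.queue) >= b.limit {
		b.queue = collapse(append(b.queue, ev))
	} else {
		b.queue = append(b.queue, ev)
	}
	b.mu.Unlock()
	select {
	case b.wake <- struct{}{}:
	default: // writer already signalled
	}
}

// collapse keeps one snapshot marker for each pane seen in q.
func collapse(q []Event) []Event {
	seen := map[string]bool{}
	var out []Event
	for _, ev := range q {
		if !seen[ev.PaneID] {
			seen[ev.PaneID] = true
			out = append(out, Event{PaneID: ev.PaneID, NeedSnapshot: true})
		}
	}
	return out
}

// Wake is signalled whenever there may be events to drain.
func (b *clientBus) Wake() <-chan struct{} { return b.wake }

// Drain hands the writer goroutine everything queued so far.
func (b *clientBus) Drain() []Event {
	b.mu.Lock()
	defer b.mu.Unlock()
	q := b.queue
	b.queue = nil
	return q
}
```

The per-client writer goroutine would wait on `Wake()`, call `Drain()`, and turn each marker into a `pane_snapshot` via `SerializeChild`. A real implementation would also need to order that snapshot against chunks queued after the marker so output is not applied twice.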
## Phased roadmap (all phases land on this branch)
0. **Extract headless core behind loopback transport.** `daemon.Core` +
`client` over in-process `Transport`. Zero behavior change; harness green.
1. **Multi-project registry + per-client view scaffolding.** Registry,
per-project stores, `ClientView`, palette "Switch/Open project…", project tier
in chrome. Still single local process.
2. **Out-of-process daemon over unix socket.** Auto-start/attach; PTYs survive
client exit; reconnect + snapshot-on-attach; explicit detach (palette command or host chord; Ctrl-D stays PTY input); pidfile/lock.
3. **TCP + auth.** localhost TCP, then opt-in LAN bind; pairing token;
remote trust-prompt routing. TLS and cert pinning are deferred (decision #6); the transport stays pluggable so they can be layered in later.
4. **Per-client view fully realized + emulator sizing/display-owner.**
Independent focus/scroll/palette per client; multi-client on same/different
projects; resize negotiation + letterbox.
5. **Hardening.** systemd/launchd autostart, `daemon stop|ls`, idle-shutdown,
backpressure, security review, CHANGELOG.
## Risks / open questions for review
- Heterogeneous client sizes vs one-PTY-one-size (display-owner + letterbox is
the v1 answer — is it sufficient?).
- Security escalation: a network client spawns processes / runs shell / injects
input. Auth/TLS scope adequate?
- Detach gesture (palette command / host chord, with Ctrl-D left as PTY input per decision #7): acceptable UX?
- Backpressure design — bounded bus + snapshot-on-pressure correct?
- MCP identity uniqueness across projects after per-PID socket removal.
- Is per-client view (decision #3) worth doing from Phase 1, or staged after a
shared-focus interim that's faster to ship?
- Splitting `uiState` (focus/palette/render caches/trust prompt/dims/outMu) out
of the daemon is the largest refactor — sequencing concerns?