patterm: persistent daemon + thin networked client — implementation plan
Status: implemented — Phases 0–4 landed on this branch. Branch: feat/daemon-client-split.
Implemented: pty workdir/process-group + protocol/Transport/loopback foundation; multi-project `ProjectRegistry`; out-of-process unix-socket daemon with auto-start, `daemon stop`/`ls`, detach (Ctrl-]) + reconnect; opt-in LAN TCP listener with a lightweight bearer token + `patterm connect`; per-pane display-owner sizing for multi-client viewing.
Deferred (not built): TLS (transport kept pluggable), remote MCP, durable restore of live PTYs across daemon restart.
Goal
Turn patterm from a single foreground process into a persistent background daemon that owns all process/project state, plus a thin client that renders and forwards input. A client on another LAN device can attach, navigate projects via the command palette, detach, and reconnect — with child processes surviving across client disconnects.
Locked decisions
- Scope: build all phases; land as one PR off this branch.
- Remote access: human UI clients only. MCP for agents stays local (per-daemon unix socket); no remote MCP transport in this work.
- Multi-client = per-client independent view. The daemon holds pure process/project state. Each client connection owns a `ClientView` (selected project, focused pane/pad, scroll offset, palette state, terminal size). Two clients may sit on different projects at once.
- Daemon lifecycle: auto-start on demand (tmux/docker model). `patterm` starts the daemon if absent and attaches; `patterm daemon stop|ls` manage it.
- Durability: "persistent" = survive client disconnect while the daemon process lives. Daemon restart only rehydrates today's persist model (top-level commands, fresh IDs). No attempt to resurrect live PTYs/agents after daemon death.
- Auth (trusted-network stance): Harry runs this on a trusted LAN and is fine with LAN exposure. Keep it lightweight: localhost default, opt-in LAN bind (`--listen`), a simple pairing/bearer token to prevent accidental drive-by access. TLS/cert-pinning is NOT required now but the transport must stay pluggable so TLS can be layered in later.
- Detach gesture: explicit detach via a palette command and/or a dedicated host chord. Ctrl-D stays as PTY input (shell EOF), as today. Quit-project and stop-daemon are explicit actions.
Current architecture (baseline facts — verify before editing)
- `app.Run` (`internal/app/app.go:49`) wires the entire process: presets, settings, scratchpad/trust/persist stores, in-process MCP server, ONE `Session`, the `uiState` TUI, classifier, SIGWINCH, 60Hz chrome ticker, blocking `stdinLoop`.
- The seam: `ChildEventListener` (`internal/app/session.go:83`): `OnChildSpawned`/`OnChildExited`/`OnPTYOut`/`OnChildStateChanged`/`OnChildClosed`. Today `uiState` is the only real listener (subscribed at `app.go:198`). A remote client = a serialized listener + reverse command channel.
- One `Session` (`session.go:28`) holds a flat `children map[string]*Child` + `order`. Tabs are derived: `KindAgent` children with `ParentID==""` (`tree.go` `runningTopLevels`). The whole tree is reconstructed from `Child.ParentID`.
- `Child` (`child.go:72`) owns `*pty.PTY`, `*vt.GhosttyEmulator`, raw ring, status/owner atomics. Lifecycle: `Session.Spawn` (`session.go:222`) → `startPTY` → `pumpChild` (`session.go:423`, PTY→emulator→ring→emitPTYOut) → `reapChild` (`session.go:488`, exit→`killDescendantsOf`).
- Stores already keyed by projectKey on `Open` (scratchpad/trust/persist); `projectkey.Key(dir)` = `sha256(realpath)[:16]`.
- `SerializeChild` (`session.go:687`) already yields a full VT snapshot for stateless repaint.
- Rendering writes ANSI to `os.Stdout` under `outMu`; `viewportRenderer` (`internal/app/viewport_renderer.go`) is a stateful ANSI rewriter confining child output to the viewport. Input: raw `os.Stdin` via `stdinLoop` (`app.go:1433`)/`processStdin`.
- MCP: in-process `Server` (`internal/mcp/mcp.go:26`), newline-JSON over a per-PID unix socket `$XDG_RUNTIME_DIR/patterm/<pid>.sock`. Agents launch `patterm mcp-stdio --socket S --identity T`. Identity → `callerID` via `host.ResolveCallerIdentity` → `Session.FindChildByIdentity`.
- No TCP/TLS anywhere today. All `net.Listen`/`net.Dial` are unix sockets.
- Must-fix: `pty.Start` (`internal/pty/pty.go:26`) does not set `cmd.Dir`; today the process `os.Chdir`s once. A daemon can't chdir globally, so `SpawnSpec.WorkDir` must propagate to `exec.Cmd.Dir`.
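The must-fix item can be illustrated with a minimal sketch. It assumes a plain `exec.Cmd` rather than the real `pty.Start`, and the `Argv` field, `buildCmd`, and `killGroup` are hypothetical names invented here; only `WorkDir` and `exec.Cmd.Dir` come from the plan. It is Unix-only. If the real PTY layer already starts the child in a new session, the child is already a process-group leader and `Setpgid` should not be combined with that.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// SpawnSpec is a hypothetical stand-in; only WorkDir is named in the plan.
type SpawnSpec struct {
	Argv    []string
	WorkDir string
}

// buildCmd returns a command that runs in spec.WorkDir and leads its own
// process group, so the daemon never needs a process-wide os.Chdir.
func buildCmd(spec SpawnSpec) *exec.Cmd {
	cmd := exec.Command(spec.Argv[0], spec.Argv[1:]...)
	cmd.Dir = spec.WorkDir
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	return cmd
}

// killGroup signals every process in the group led by pid (negative pid
// targets the group).
func killGroup(pid int) error {
	return syscall.Kill(-pid, syscall.SIGTERM)
}

func main() {
	dir := os.TempDir()
	out, err := buildCmd(SpawnSpec{Argv: []string{"pwd"}, WorkDir: dir}).Output()
	if err != nil {
		fmt.Println("spawn failed:", err)
		return
	}
	fmt.Printf("child ran in %s", out)
}
```

Setting the directory per command keeps two projects with different roots independent inside one daemon process.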
Target component model
| Component | Owns |
|---|---|
| `internal/daemon` (`pattermd`) | Project registry (N `Session`s), all PTYs, emulators, MCP server, per-project stores, classifier, timers. No TTY. |
| `internal/client` (`patterm`) | Real terminal: raw mode, alt-screen, SIGWINCH, stdin/stdout; `uiState`, `viewportRenderer`, chrome draws, palette, input. Holds `ClientView`. |
| `internal/transport` | `Transport` interface + framing; loopback, unix, TCP/TLS impls; auth handshake. |
| `internal/protocol` | Wire message types shared by daemon + client. |
Transport interface (migration linchpin)
type Transport interface {
Send(Frame) error // client→daemon command, or daemon→client push
Recv() (Frame, error)
Close() error
}
- Loopback impl: in-process channels, zero serialization. Default `patterm` = client + loopback daemon in one process → today's UX preserved exactly, single binary.
- Net impl: framed JSON-per-line over `net.Conn`, reusing the `mcp.go:handleConn` pattern; unix socket first, then TCP/TLS.
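A loopback implementation of the interface above might look roughly like the following. This is a sketch under assumptions: the plan does not define `Frame`, so its fields here are placeholders, and `NewLoopbackPair` and `ErrClosed` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Frame is a placeholder; the plan does not specify its fields.
type Frame struct {
	Type string
	Body []byte
}

type Transport interface {
	Send(Frame) error
	Recv() (Frame, error)
	Close() error
}

var ErrClosed = errors.New("transport closed")

// loopback is one end of an in-process pair.
type loopback struct {
	in   <-chan Frame
	out  chan<- Frame
	done chan struct{} // shared by both ends, closed exactly once
	once *sync.Once
}

// NewLoopbackPair returns two connected ends, each direction buffered to buf.
func NewLoopbackPair(buf int) (Transport, Transport) {
	a2b, b2a := make(chan Frame, buf), make(chan Frame, buf)
	done, once := make(chan struct{}), &sync.Once{}
	return &loopback{in: b2a, out: a2b, done: done, once: once},
		&loopback{in: a2b, out: b2a, done: done, once: once}
}

func (l *loopback) Send(f Frame) error {
	select { // fail fast if either end has closed
	case <-l.done:
		return ErrClosed
	default:
	}
	select {
	case l.out <- f:
		return nil
	case <-l.done:
		return ErrClosed
	}
}

func (l *loopback) Recv() (Frame, error) {
	select {
	case f := <-l.in:
		return f, nil
	case <-l.done:
		return Frame{}, ErrClosed
	}
}

// Close tears down both directions; the data channels are never closed, so a
// concurrent Send cannot panic.
func (l *loopback) Close() error {
	l.once.Do(func() { close(l.done) })
	return nil
}

func main() {
	client, daemon := NewLoopbackPair(8)
	client.Send(Frame{Type: "attach"})
	f, _ := daemon.Recv()
	fmt.Println("daemon received:", f.Type)
	client.Close()
}
```

Frames pass by value through channels, so nothing is serialized; a net-backed implementation could satisfy the same interface without changes to callers.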
Per-client state vs daemon state
// daemon-side, pure process/project state
type Registry struct { projects map[string]*Project } // key = projectKey
type Project struct {
Key, Dir, Name string
Session *Session
Pads *scratchpad.Store
Trust *trust.Store
Persist *persist.Store
Launcher *Launcher
Host *ToolHost
}
// per-connection, client-owned view state (lives client-side; daemon tracks
// only what it must to size emulators + route subscriptions)
type ClientView struct {
ID string
ProjectKey string // which project this client is looking at
FocusedID string // pane (Child) or pad
ScrollOff int
Cols, Rows uint16
// palette state is fully client-local
}
Project switch = re-point this client's subscription to another Project's
Session + send chrome + pane_snapshot. No process teardown.
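As a rough illustration of project switching, the sketch below keeps a trimmed `Registry` and re-points a `ClientView` without touching any process. Several things are assumptions: `projectKey` takes the first 16 hex characters of the SHA-256 of the symlink-resolved absolute path (the plan only says `sha256(realpath)[:16]`), `Open` and `Switch` are hypothetical method names, and clearing the focus on switch is a guess.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
	"sync"
)

// projectKey is an assumed reading of sha256(realpath)[:16].
func projectKey(dir string) (string, error) {
	real, err := filepath.EvalSymlinks(dir)
	if err != nil {
		return "", err
	}
	abs, err := filepath.Abs(real)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256([]byte(abs))
	return hex.EncodeToString(sum[:])[:16], nil
}

// Project and ClientView are trimmed to the fields this sketch needs.
type Project struct{ Key, Dir string }

type ClientView struct{ ID, ProjectKey, FocusedID string }

type Registry struct {
	mu       sync.Mutex
	projects map[string]*Project // key = projectKey
}

func NewRegistry() *Registry { return &Registry{projects: map[string]*Project{}} }

// Open returns the existing project for dir or registers a new one.
func (r *Registry) Open(dir string) (*Project, error) {
	key, err := projectKey(dir)
	if err != nil {
		return nil, err
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if p, ok := r.projects[key]; ok {
		return p, nil
	}
	p := &Project{Key: key, Dir: dir}
	r.projects[key] = p
	return p, nil
}

// Switch re-points one client's view; other clients and all children are
// left alone.
func (r *Registry) Switch(v *ClientView, key string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.projects[key]; !ok {
		return fmt.Errorf("unknown project %q", key)
	}
	v.ProjectKey, v.FocusedID = key, ""
	return nil
}

func main() {
	reg := NewRegistry()
	p, err := reg.Open(".")
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	view := &ClientView{ID: "client-1"}
	fmt.Println("switch error:", reg.Switch(view, p.Key))
	fmt.Println("view now on:", view.ProjectKey)
}
```

In the real daemon the switch would also re-point the client's subscription and push `chrome` plus a `pane_snapshot`, which this sketch leaves out.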
Wire protocol (control + UI channel)
Bidirectional framed JSON-per-line.
Daemon → client:
- `hello`/`auth_challenge`/`auth_ok`: handshake.
- `project_list`: `[{key, path, name, last_active, tab_count}]` for the palette switcher.
- `chrome`: semantic model for the client's current project+view: tab list (`runningTopLevels`), sidebar tree (`sidebarNav`), status/owner, toasts, scratchpad list + selected preview. Client draws chrome locally (reuses `tabbar.go`/`sidebar.go`).
- `pane_snapshot{paneID, vtBytes}`: full repaint on focus/attach/switch via `SerializeChild`.
- `pane_chunk{paneID, bytes}`: live focused-pane PTY output (serialized `OnPTYOut`).
- `lifecycle{spawned|exited|closed|stateChanged,...}`: serialized listener.
- `attention`/`trust_prompt`: human-facing surfaces; render on the client whose view owns the relevant project.
Client → daemon:
- `attach{token, term_size, project_key?}`/`detach`.
- `input{paneID, bytes}` (the `InjectAsUser` path).
- `focus{paneID|pad}`, `switch_project{key}`, `open_project{path}`.
- `palette_command{...}` (spawn/kill/rename/quit-project), `trust_response`, `resize{cols,rows}`.
Encoding decision: ship raw focused-pane PTY bytes + periodic
SerializeChild snapshots; client runs its own viewportRenderer. No
daemon-side pre-render (keeps daemon size-agnostic), no grid diffs in v1.
Requires in-order delivery only (TCP gives it). Diffs are a later optimization.
Emulator sizing with per-client views
Each Child emulator has one size. Rules:
- A pane is sized by the client(s) viewing it. If exactly one client focuses a pane, that client's cols/rows drive `ResizeAll` for that pane.
- If two clients focus the same pane, one is the display owner (first to focus, or explicit take-control); the owner's size drives the emulator; the other letterboxes/clips. Surface a toast.
- Because clients are usually on different projects/panes, contention is rare.
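The display-owner rule can be sketched as a small per-pane tracker. All names here (`paneViewers`, `Focus`, `TakeControl`, and so on) are hypothetical, and handing ownership to the next-earliest viewer when the owner leaves is an assumption the plan does not state.

```go
package main

import "fmt"

type Size struct{ Cols, Rows uint16 }

// paneViewers tracks the clients focusing one pane, earliest first.
type paneViewers struct {
	order []string        // client IDs in focus order; order[0] is the owner
	sizes map[string]Size // last reported terminal size per client
}

func newPaneViewers() *paneViewers { return &paneViewers{sizes: map[string]Size{}} }

// Focus records (or updates) a viewer and returns the size the emulator
// should use afterwards.
func (p *paneViewers) Focus(clientID string, s Size) Size {
	if _, ok := p.sizes[clientID]; !ok {
		p.order = append(p.order, clientID)
	}
	p.sizes[clientID] = s
	return p.EmulatorSize()
}

// Unfocus drops a viewer; the next-earliest viewer becomes the owner.
func (p *paneViewers) Unfocus(clientID string) {
	delete(p.sizes, clientID)
	for i, id := range p.order {
		if id == clientID {
			p.order = append(p.order[:i], p.order[i+1:]...)
			break
		}
	}
}

// TakeControl moves an existing viewer to the front of the order.
func (p *paneViewers) TakeControl(clientID string) {
	s, ok := p.sizes[clientID]
	if !ok {
		return
	}
	p.Unfocus(clientID)
	p.order = append([]string{clientID}, p.order...)
	p.sizes[clientID] = s
}

// Owner reports the current display owner, if any.
func (p *paneViewers) Owner() (string, bool) {
	if len(p.order) == 0 {
		return "", false
	}
	return p.order[0], true
}

// EmulatorSize is the owner's size; non-owners letterbox or clip to it.
func (p *paneViewers) EmulatorSize() Size {
	if id, ok := p.Owner(); ok {
		return p.sizes[id]
	}
	return Size{}
}

func main() {
	v := newPaneViewers()
	v.Focus("laptop", Size{120, 40})
	v.Focus("tablet", Size{80, 24})
	fmt.Println("emulator size:", v.EmulatorSize())
	v.TakeControl("tablet")
	fmt.Println("after take-control:", v.EmulatorSize())
}
```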
Security (human clients, LAN — trusted-network stance)
Harry runs this on a trusted LAN (decision #6). Keep it lightweight but not wide open:
- localhost-only by default. LAN bind (`--listen 0.0.0.0:PORT`) is explicit opt-in, never default.
- A simple pairing/bearer token gates network attach so a stray host on the LAN can't drive-by-attach. Daemon prints the token on `--listen`; client presents it in `attach`; store a per-client token after first pairing.
- Local unix-socket clients keep `0600` perms (sufficient for same-user).
- Keep the transport pluggable so TLS + cert pinning can be layered in later without reworking the protocol. Not building TLS now.
- Trust prompts may now be approved from another device — deliberate; route to the client whose view owns the project.
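A minimal sketch of the bearer-token gate follows, assuming a 128-bit random token rendered as hex. `NewToken` and `CheckToken` are hypothetical names; the plan does not say how the token is generated or compared.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// NewToken returns a random 128-bit token as 32 hex characters, suitable for
// printing once when the daemon is started with --listen.
func NewToken() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// CheckToken compares the presented token with the expected one in constant
// time. An empty expected token never matches, so a misconfigured daemon
// fails closed.
func CheckToken(want, got string) bool {
	if want == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(want), []byte(got)) == 1
}

func main() {
	tok, err := NewToken()
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	fmt.Println("correct token accepted:", CheckToken(tok, tok))
	fmt.Println("wrong token accepted:", CheckToken(tok, "guess"))
}
```

Without TLS the token crosses the LAN in cleartext, which matches the trusted-network stance above but is worth keeping in mind when TLS is layered in later.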
Daemon lifecycle (auto-start)
- Well-known local socket `$XDG_RUNTIME_DIR/patterm/daemon.sock` + pidfile/lockfile (single daemon per user).
- `patterm [dir]`: dial the socket; if absent, fork-exec the daemon, wait for readiness, attach. `--project`/dir selects the initial project for the view.
- `patterm daemon` (foreground), `patterm daemon stop`, `patterm ls`.
- Detach = explicit palette command and/or a dedicated host chord; PTYs keep running. Ctrl-D stays as PTY input (shell EOF). Quitting a project / killing the daemon are explicit palette/CLI actions.
- Idle-shutdown policy: configurable; default keep alive until explicit stop.
Package-by-package changes
- `cmd/patterm` (`main.go`): add `daemon` subcommand (headless core); default invocation becomes client (auto-start/attach); `mcp-stdio` dials the shared daemon socket (not per-PID); `debug-harness` drives a daemon (or loopback).
- `internal/app` split:
  - new `internal/daemon`: headless half: move `session.go`, `child.go`, `host.go`, `tree.go`, `launch.go`, classifier, timers, `Shutdown`, kill-cascade. Add `Registry`/`Project`.
  - new `internal/client`: TTY half: `uiState`, `viewport_renderer.go`, `screen_renderer.go`, `tabbar.go`, `sidebar.go`, status, `palette.go`, `stdinLoop`/`processStdin`, SIGWINCH/chrome ticker, markdown/marquee/toast. Consumes events + chrome over `Transport` instead of `sess.Subscribe`.
- new `internal/transport` + `internal/protocol`: messages, framing, loopback/unix/TCP-TLS impls, auth handshake.
- `internal/mcp`: `SocketPath` per-daemon (not per-PID); `ResolveCallerIdentity` becomes daemon-wide across projects (token already carries `PATTERM_PROJECT_KEY` via `ChildEnv`).
- `internal/pty`: set `cmd.Dir` from `SpawnSpec.WorkDir`; add process-group handling for reliable tree teardown.
- `internal/vt`: unchanged grid source of truth; enforce per-child serialization around emulator access (interface isn't concurrency-safe) since clients + MCP + pump all snapshot.
- `internal/{scratchpad,trust,persist}`: per-`Project` instances in the registry (already keyed by projectKey).
- `internal/preset`: project-agnostic; daemon loads once, shares.
- `internal/projectkey`: doc update (key is now load-bearing for routing).
- `internal/harness`: add daemon/loopback mode; assert child survives client disconnect/reconnect, project-switch preserves each project's tree, two clients on different projects, unauth TCP rejected.
Backpressure
`pumpChild`'s listener calls are synchronous (`session.go:149`). A slow network
client must not block the PTY pump. Introduce a per-client event bus with a
bounded buffer that coalesces queued output or drops it in favor of a snapshot
under pressure, decoupled from `pumpChild`.
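One possible shape for such a bus is sketched below. It is not the implemented design: `Event`, `clientBus`, `Publish`, and `Drain` are hypothetical, and the policy (on overflow, drop a pane's queued chunks and queue a single snapshot marker) is just one reading of "drops to a snapshot". The queue is bounded by the limit plus one marker per pane. Ordering between taking the snapshot and chunks published during the drain is not addressed here.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is either a live chunk or a marker asking the sender to ship a full
// pane snapshot instead.
type Event struct {
	PaneID   string
	Chunk    []byte
	Snapshot bool
}

// clientBus is a bounded queue for one client. Publish never blocks, so the
// PTY pump cannot be stalled by a slow connection.
type clientBus struct {
	mu      sync.Mutex
	queue   []Event
	pending map[string]bool // panes that already have a snapshot marker queued
	limit   int
}

func newClientBus(limit int) *clientBus {
	return &clientBus{limit: limit, pending: map[string]bool{}}
}

// Publish enqueues ev, or collapses the pane's backlog into one snapshot
// marker when the queue is full.
func (b *clientBus) Publish(ev Event) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.pending[ev.PaneID] {
		return // the queued snapshot will already cover this output
	}
	if len(b.queue) < b.limit {
		b.queue = append(b.queue, ev)
		return
	}
	kept := b.queue[:0]
	for _, e := range b.queue {
		if e.PaneID != ev.PaneID {
			kept = append(kept, e)
		}
	}
	b.queue = append(kept, Event{PaneID: ev.PaneID, Snapshot: true})
	b.pending[ev.PaneID] = true
}

// Drain hands the backlog to the per-client sender goroutine.
func (b *clientBus) Drain() []Event {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.queue
	b.queue = nil
	b.pending = map[string]bool{}
	return out
}

func main() {
	bus := newClientBus(2)
	for i := 0; i < 5; i++ {
		bus.Publish(Event{PaneID: "p1", Chunk: []byte("x")})
	}
	for _, e := range bus.Drain() {
		fmt.Printf("pane=%s snapshot=%v\n", e.PaneID, e.Snapshot)
	}
}
```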
Phased roadmap (all phases land on this branch)
- Extract headless core behind loopback transport. `daemon.Core` + `client` over in-process `Transport`. Zero behavior change; harness green.
- Multi-project registry + per-client view scaffolding. Registry, per-project stores, `ClientView`, palette "Switch/Open project…", project tier in chrome. Still single local process.
- Out-of-process daemon over unix socket. Auto-start/attach; PTYs survive client exit; reconnect + snapshot-on-attach; Ctrl-D = detach; pidfile/lock.
- TCP + TLS + auth. localhost TCP, then opt-in LAN bind; pairing token / cert pinning; remote trust-prompt routing.
- Per-client view fully realized + emulator sizing/display-owner. Independent focus/scroll/palette per client; multi-client on same/different projects; resize negotiation + letterbox.
- Hardening. systemd/launchd autostart, `daemon stop|ls`, idle-shutdown, backpressure, security review, CHANGELOG.
Risks / open questions for review
- Heterogeneous client sizes vs one-PTY-one-size (display-owner + letterbox is the v1 answer — is it sufficient?).
- Security escalation: a network client spawns processes / runs shell / injects input. Auth/TLS scope adequate?
- Ctrl-D semantics flip — acceptable UX?
- Backpressure design — bounded bus + snapshot-on-pressure correct?
- MCP identity uniqueness across projects after per-PID socket removal.
- Is per-client view (decision #3) worth doing from Phase 1, or staged after a shared-focus interim that's faster to ship?
- Splitting `uiState` (focus/palette/render caches/trust prompt/dims/outMu) out of the daemon is the largest refactor; sequencing concerns?