
patterm: persistent daemon + thin networked client — implementation plan

Status: implemented. Phases 0-4 landed on this branch. Branch: feat/daemon-client-split.

Implemented: pty workdir/process-group + protocol/Transport/loopback foundation; multi-project ProjectRegistry; out-of-process unix-socket daemon with auto-start, daemon stop/ls, detach (Ctrl-]) + reconnect; opt-in LAN TCP listener with a lightweight bearer token + patterm connect; per-pane display-owner sizing for multi-client viewing. Deferred (not built): TLS (transport kept pluggable), remote MCP, durable restore of live PTYs across daemon restart.

Goal

Turn patterm from a single foreground process into a persistent background daemon that owns all process/project state, plus a thin client that renders and forwards input. A client on another LAN device can attach, navigate projects via the command palette, detach, and reconnect — with child processes surviving across client disconnects.

Locked decisions

  1. Scope: build all phases; land as one PR off this branch.
  2. Remote access: human UI clients only. MCP for agents stays local (per-daemon unix socket); no remote MCP transport in this work.
  3. Multi-client = per-client independent view. The daemon holds pure process/project state. Each client connection owns a ClientView (selected project, focused pane/pad, scroll offset, palette state, terminal size). Two clients may sit on different projects at once.
  4. Daemon lifecycle: auto-start on demand (tmux/docker model). patterm starts the daemon if absent and attaches; patterm daemon stop|ls manage it.
  5. Durability: "persistent" = survive client disconnect while the daemon process lives. Daemon restart only rehydrates today's persist model (top-level commands, fresh IDs). No attempt to resurrect live PTYs/agents after daemon death.
  6. Auth (trusted-network stance): Harry runs this on a trusted LAN and is fine with LAN exposure. Keep it lightweight: localhost default, opt-in LAN bind (--listen), a simple pairing/bearer token to prevent accidental drive-by access. TLS/cert-pinning is NOT required now but the transport must stay pluggable so TLS can be layered in later.
  7. Detach gesture: explicit detach via a palette command and/or a dedicated host chord. Ctrl-D stays as PTY input (shell EOF), as today. Quit-project and stop-daemon are explicit actions.

Current architecture (baseline facts — verify before editing)

  • app.Run (internal/app/app.go:49) wires the entire process: presets, settings, scratchpad/trust/persist stores, in-process MCP server, ONE Session, the uiState TUI, classifier, SIGWINCH, 60Hz chrome ticker, blocking stdinLoop.
  • The seam: ChildEventListener (internal/app/session.go:83), with OnChildSpawned/OnChildExited/OnPTYOut/OnChildStateChanged/OnChildClosed. Today uiState is the only real listener (subscribed at app.go:198). A remote client = a serialized listener + reverse command channel.
  • One Session (session.go:28) holds a flat children map[string]*Child + order. Tabs are derived: KindAgent children with ParentID=="" (tree.go runningTopLevels). The whole tree is reconstructed from Child.ParentID.
  • Child (child.go:72) owns *pty.PTY, *vt.GhosttyEmulator, raw ring, status/owner atomics. Lifecycle: Session.Spawn (session.go:222) → startPTY → pumpChild (session.go:423, PTY→emulator→ring→emitPTYOut) + reapChild (session.go:488, exit→killDescendantsOf).
  • Stores already keyed by projectKey on Open (scratchpad/trust/persist); projectkey.Key(dir) = sha256(realpath)[:16].
  • SerializeChild (session.go:687) already yields a full VT snapshot for stateless repaint.
  • Rendering writes ANSI to os.Stdout under outMu; viewportRenderer (internal/app/viewport_renderer.go) is a stateful ANSI rewriter confining child output to the viewport. Input: raw os.Stdin via stdinLoop (app.go:1433)/processStdin.
  • MCP: in-process Server (internal/mcp/mcp.go:26), newline-JSON over a per-PID unix socket $XDG_RUNTIME_DIR/patterm/<pid>.sock. Agents launch patterm mcp-stdio --socket S --identity T. Identity → callerID via host.ResolveCallerIdentity → Session.FindChildByIdentity.
  • No TCP/TLS anywhere today. All net.Listen/net.Dial are unix sockets.
  • Must-fix: pty.Start (internal/pty/pty.go:26) does not set cmd.Dir; today the process os.Chdirs once. A daemon can't chdir globally, so SpawnSpec.WorkDir must propagate to exec.Cmd.Dir.

Target component model

| Component | Owns |
| --- | --- |
| internal/daemon (pattermd) | Project registry (N Sessions), all PTYs, emulators, MCP server, per-project stores, classifier, timers. No TTY. |
| internal/client (patterm) | Real terminal: raw mode, alt-screen, SIGWINCH, stdin/stdout; uiState, viewportRenderer, chrome draws, palette, input. Holds ClientView. |
| internal/transport | Transport interface + framing; loopback, unix, TCP/TLS impls; auth handshake. |
| internal/protocol | Wire message types shared by daemon + client. |

Transport interface (migration linchpin)

```go
// Transport moves protocol Frames between a client and the daemon.
type Transport interface {
    Send(Frame) error // client→daemon command, or daemon→client push
    Recv() (Frame, error)
    Close() error
}
```
  • Loopback impl: in-process channels, zero serialization. Default patterm = client + loopback daemon in one process → today's UX preserved exactly, single binary.
  • Net impl: framed JSON-per-line over net.Conn, reusing the mcp.go:handleConn pattern; unix socket first, then TCP/TLS.
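A loopback implementation along these lines might look like the sketch below. It assumes a placeholder Frame type (the real one would live in internal/protocol), and NewLoopbackPair and ErrClosed are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Frame is a placeholder; the real type would come from internal/protocol.
type Frame struct {
	Type string
	Body []byte
}

type Transport interface {
	Send(Frame) error
	Recv() (Frame, error)
	Close() error
}

// ErrClosed is a hypothetical sentinel returned once either end has closed.
var ErrClosed = errors.New("transport closed")

// loopback is one end of an in-process pair. Frames are passed by value
// over channels, so nothing is serialized.
type loopback struct {
	out  chan<- Frame
	in   <-chan Frame
	done chan struct{} // shared by both ends
	once *sync.Once    // guards close(done)
}

// NewLoopbackPair returns two connected ends: one for the client half,
// one for the daemon half.
func NewLoopbackPair(buf int) (Transport, Transport) {
	a2b := make(chan Frame, buf)
	b2a := make(chan Frame, buf)
	done := make(chan struct{})
	once := &sync.Once{}
	return &loopback{out: a2b, in: b2a, done: done, once: once},
		&loopback{out: b2a, in: a2b, done: done, once: once}
}

func (l *loopback) Send(f Frame) error {
	select {
	case <-l.done: // refuse new frames once closed
		return ErrClosed
	default:
	}
	select {
	case l.out <- f:
		return nil
	case <-l.done:
		return ErrClosed
	}
}

func (l *loopback) Recv() (Frame, error) {
	select {
	case f := <-l.in:
		return f, nil
	case <-l.done:
		return Frame{}, ErrClosed
	}
}

func (l *loopback) Close() error {
	l.once.Do(func() { close(l.done) })
	return nil
}

func main() {
	client, daemon := NewLoopbackPair(8)
	if err := client.Send(Frame{Type: "input", Body: []byte("ls\n")}); err != nil {
		panic(err)
	}
	f, err := daemon.Recv()
	fmt.Println(f.Type, err)
	client.Close()
}
```

Since the client and daemon halves only see the Transport interface, swapping this pair for a socket-backed implementation should not require changes on either side.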

Per-client state vs daemon state

```go
// daemon-side, pure process/project state
type Registry struct { projects map[string]*Project }  // key = projectKey
type Project struct {
    Key, Dir, Name string
    Session  *Session
    Pads     *scratchpad.Store
    Trust    *trust.Store
    Persist  *persist.Store
    Launcher *Launcher
    Host     *ToolHost
}

// per-connection, client-owned view state (lives client-side; daemon tracks
// only what it must to size emulators + route subscriptions)
type ClientView struct {
    ID          string
    ProjectKey  string   // which project this client is looking at
    FocusedID   string   // pane (Child) or pad
    ScrollOff   int
    Cols, Rows  uint16
    // palette state is fully client-local
}
```

Project switch = re-point this client's subscription to another Project's Session + send chrome + pane_snapshot. No process teardown.
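The registry lookup and the switch can be sketched as follows. The types are trimmed to the fields the example needs, Open and Switch are hypothetical method names, and the key derivation assumes "sha256(realpath)[:16]" means the first 16 hex characters of the digest, which the plan does not spell out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
	"sync"
)

// projectKey hashes the symlink-resolved absolute path.
// Assumption: the key is the first 16 hex characters of the digest.
func projectKey(dir string) (string, error) {
	abs, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	resolved, err := filepath.EvalSymlinks(abs)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256([]byte(resolved))
	return hex.EncodeToString(sum[:])[:16], nil
}

// Project and ClientView are trimmed to what this sketch needs.
type Project struct{ Key, Dir string }

type ClientView struct{ ID, ProjectKey, FocusedID string }

type Registry struct {
	mu       sync.Mutex
	projects map[string]*Project
}

func NewRegistry() *Registry {
	return &Registry{projects: make(map[string]*Project)}
}

// Open returns the existing Project for dir, or registers a new one.
func (r *Registry) Open(dir string) (*Project, error) {
	key, err := projectKey(dir)
	if err != nil {
		return nil, err
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if p, ok := r.projects[key]; ok {
		return p, nil
	}
	p := &Project{Key: key, Dir: dir}
	r.projects[key] = p
	return p, nil
}

// Switch re-points one client's view. No process is started or stopped.
func (r *Registry) Switch(v *ClientView, key string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.projects[key]; !ok {
		return fmt.Errorf("unknown project %q", key)
	}
	v.ProjectKey = key
	v.FocusedID = "" // focus is per project; chosen again after chrome arrives
	return nil
}

func main() {
	reg := NewRegistry()
	p, err := reg.Open(".")
	if err != nil {
		panic(err)
	}
	view := &ClientView{ID: "client-1"}
	if err := reg.Switch(view, p.Key); err != nil {
		panic(err)
	}
	fmt.Println("viewing project", view.ProjectKey)
}
```

In the real daemon the switch would also move the client's event subscription and trigger the chrome and pane_snapshot messages; the sketch covers only the bookkeeping.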

Wire protocol (control + UI channel)

Bidirectional framed JSON-per-line.

Daemon → client:

  • hello / auth_challenge / auth_ok — handshake.
  • project_list[{key, path, name, last_active, tab_count}] for the palette switcher.
  • chrome — semantic model for the client's current project+view: tab list (runningTopLevels), sidebar tree (sidebarNav), status/owner, toasts, scratchpad list + selected preview. Client draws chrome locally (reuses tabbar.go/sidebar.go).
  • pane_snapshot{paneID, vtBytes} — full repaint on focus/attach/switch via SerializeChild.
  • pane_chunk{paneID, bytes} — live focused-pane PTY output (serialized OnPTYOut).
  • lifecycle{spawned|exited|closed|stateChanged,...} — serialized listener.
  • attention / trust_prompt — human-facing surfaces; render on the client whose view owns the relevant project.

Client → daemon:

  • attach{token, term_size, project_key?} / detach.
  • input{paneID, bytes} (the InjectAsUser path).
  • focus{paneID|pad}, switch_project{key}, open_project{path}.
  • palette_command{...} (spawn/kill/rename/quit-project), trust_response, resize{cols,rows}.

Encoding decision: ship raw focused-pane PTY bytes + periodic SerializeChild snapshots; client runs its own viewportRenderer. No daemon-side pre-render (keeps daemon size-agnostic), no grid diffs in v1. Requires in-order delivery only (TCP gives it). Diffs are a later optimization.

Emulator sizing with per-client views

Each Child emulator has one size. Rules:

  • A pane is sized by the client(s) viewing it. If exactly one client focuses a pane, that client's cols/rows drive ResizeAll for that pane.
  • If two clients focus the same pane, one is the display owner (first to focus, or explicit take-control); the owner's size drives the emulator; the other letterboxes/clips. Surface a toast.
  • Because clients are usually on different projects/panes, contention is rare.
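The sizing rules above could be tracked per pane roughly as follows. This is a sketch under assumptions: paneViewers and its methods are hypothetical names, "first to focus" is modeled as the oldest entry in focus order, and the caller is expected to apply the returned size to the emulator.

```go
package main

import "fmt"

type Size struct{ Cols, Rows uint16 }

// paneViewers records which clients currently focus one pane.
// The first entry in order is the display owner.
type paneViewers struct {
	order []string // client IDs, oldest focus first
	sizes map[string]Size
}

func newPaneViewers() *paneViewers {
	return &paneViewers{sizes: make(map[string]Size)}
}

// effective returns the owner's size, or false when nobody is viewing.
func (p *paneViewers) effective() (Size, bool) {
	if len(p.order) == 0 {
		return Size{}, false
	}
	return p.sizes[p.order[0]], true
}

// Focus registers a viewer (or updates its size after a resize) and
// returns the size the emulator should use.
func (p *paneViewers) Focus(clientID string, s Size) Size {
	if _, ok := p.sizes[clientID]; !ok {
		p.order = append(p.order, clientID)
	}
	p.sizes[clientID] = s
	size, _ := p.effective()
	return size
}

// Blur removes a viewer; the next-oldest viewer becomes the owner.
func (p *paneViewers) Blur(clientID string) (Size, bool) {
	delete(p.sizes, clientID)
	for i, id := range p.order {
		if id == clientID {
			p.order = append(p.order[:i], p.order[i+1:]...)
			break
		}
	}
	return p.effective()
}

// TakeControl moves an existing viewer to the front (explicit take-control).
func (p *paneViewers) TakeControl(clientID string) (Size, bool) {
	for i, id := range p.order {
		if id == clientID {
			copy(p.order[1:i+1], p.order[:i]) // shift earlier viewers back by one
			p.order[0] = clientID
			break
		}
	}
	return p.effective()
}

func main() {
	v := newPaneViewers()
	fmt.Println(v.Focus("laptop", Size{120, 40}))
	fmt.Println(v.Focus("tablet", Size{80, 24})) // laptop still owns the pane
	fmt.Println(v.TakeControl("tablet"))
}
```

A non-owner would compare its own size with the returned one to decide whether to letterbox or clip, and to show the toast.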

Security (human clients, LAN — trusted-network stance)

Harry runs this on a trusted LAN (decision #6). Keep it lightweight but not wide open:

  • localhost-only by default. LAN bind (--listen 0.0.0.0:PORT) is explicit opt-in, never default.
  • A simple pairing/bearer token gates network attach so a stray host on the LAN can't drive-by-attach. Daemon prints the token on --listen; client presents it in attach; store a per-client token after first pairing.
  • Local unix-socket clients keep 0600 perms (sufficient for same-user).
  • Keep the transport pluggable so TLS + cert pinning can be layered in later without reworking the protocol. Not building TLS now.
  • Trust prompts may now be approved from another device — deliberate; route to the client whose view owns the project.
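A lightweight token gate of the kind described could be sketched as below. NewToken and CheckToken are hypothetical helper names; the token length and hex encoding are assumptions, not something the plan fixes.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// NewToken returns 32 random bytes, hex-encoded, for the daemon to print
// when it starts with --listen.
func NewToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// CheckToken compares the presented token with the expected one in
// constant time, so response timing does not reveal matching prefixes.
func CheckToken(want, got string) bool {
	if want == "" {
		return false // never accept when no token is configured
	}
	return subtle.ConstantTimeCompare([]byte(want), []byte(got)) == 1
}

func main() {
	tok, err := NewToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("pairing token:", tok)
	fmt.Println("accepted:", CheckToken(tok, tok))
	fmt.Println("accepted:", CheckToken(tok, "guess"))
}
```

Without TLS the token crosses the LAN in cleartext, which matches the trusted-network stance: it stops accidental attaches, not an attacker who can sniff traffic.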

Daemon lifecycle (auto-start)

  • Well-known local socket $XDG_RUNTIME_DIR/patterm/daemon.sock + pidfile/lockfile (single daemon per user).
  • patterm [dir]: dial the socket; if absent, fork-exec the daemon, wait for readiness, attach. --project/dir selects the initial project for the view.
  • patterm daemon (foreground), patterm daemon stop, patterm daemon ls.
  • Detach = explicit palette command and/or a dedicated host chord; PTYs keep running. Ctrl-D stays as PTY input (shell EOF). Quitting a project / killing the daemon are explicit palette/CLI actions.
  • Idle-shutdown policy: configurable; default keep alive until explicit stop.

Package-by-package changes

  • cmd/patterm (main.go): add daemon subcommand (headless core); default invocation becomes client (auto-start/attach); mcp-stdio dials the shared daemon socket (not per-PID); debug-harness drives a daemon (or loopback).
  • internal/app split:
    • new internal/daemon: headless half — move session.go, child.go, host.go, tree.go, launch.go, classifier, timers, Shutdown, kill-cascade. Add Registry/Project.
    • internal/client: TTY half — uiState, viewport_renderer.go, screen_renderer.go, tabbar.go, sidebar.go, status, palette.go, stdinLoop/processStdin, SIGWINCH/chrome ticker, markdown/marquee/toast. Consumes events + chrome over Transport instead of sess.Subscribe.
  • new internal/transport + internal/protocol: messages, framing, loopback/unix/TCP-TLS impls, auth handshake.
  • internal/mcp: SocketPath per-daemon (not per-PID); ResolveCallerIdentity becomes daemon-wide across projects (token already carries PATTERM_PROJECT_KEY via ChildEnv).
  • internal/pty: set cmd.Dir from SpawnSpec.WorkDir; add process-group handling for reliable tree teardown.
  • internal/vt: unchanged grid source of truth; enforce per-child serialization around emulator access (interface isn't concurrency-safe) since clients + MCP + pump all snapshot.
  • internal/{scratchpad,trust,persist}: per-Project instances in the registry (already keyed by projectKey).
  • internal/preset: project-agnostic; daemon loads once, shares.
  • internal/projectkey: doc update (key is now load-bearing for routing).
  • internal/harness: add daemon/loopback mode; assert child survives client disconnect/reconnect, project-switch preserves each project's tree, two clients on different projects, unauth TCP rejected.

Backpressure

pumpChild's listener calls are synchronous (session.go:149). A slow network client must not block the PTY pump. Introduce a per-client event bus with a bounded buffer, decoupled from pumpChild, that coalesces events or drops to a snapshot under pressure.
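One way to get "drop to a snapshot under pressure" is sketched below. clientBus, Publish, and Next are hypothetical names, and the snapshot callback stands in for SerializeChild. Publish never blocks: when the queue is full, the pane's queued chunks are discarded and the pane is marked as owing a snapshot, which the consumer receives on its next read.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a simplified stand-in for pane_chunk / pane_snapshot messages.
type Event struct {
	Kind   string // "chunk" or "snapshot"
	PaneID string
	Data   []byte
}

// clientBus is a bounded, non-blocking queue for one client connection.
type clientBus struct {
	mu    sync.Mutex
	queue []Event
	limit int
	stale map[string]bool // panes whose chunks were dropped
}

func newClientBus(limit int) *clientBus {
	return &clientBus{limit: limit, stale: make(map[string]bool)}
}

// Publish is called from the PTY pump and never blocks on the network.
func (b *clientBus) Publish(ev Event) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.stale[ev.PaneID] {
		return // a snapshot is already owed and will supersede this chunk
	}
	if len(b.queue) >= b.limit {
		// Under pressure: discard this pane's queued chunks, owe a snapshot.
		kept := b.queue[:0]
		for _, q := range b.queue {
			if q.PaneID != ev.PaneID {
				kept = append(kept, q)
			}
		}
		b.queue = kept
		b.stale[ev.PaneID] = true
		return
	}
	b.queue = append(b.queue, ev)
}

// Next is called by the connection writer. Owed snapshots go out first;
// ok is false when there is nothing to send.
func (b *clientBus) Next(snapshot func(paneID string) []byte) (Event, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for id := range b.stale {
		delete(b.stale, id)
		return Event{Kind: "snapshot", PaneID: id, Data: snapshot(id)}, true
	}
	if len(b.queue) == 0 {
		return Event{}, false
	}
	ev := b.queue[0]
	b.queue = b.queue[1:]
	return ev, true
}

func main() {
	bus := newClientBus(2)
	for i := 0; i < 5; i++ {
		bus.Publish(Event{Kind: "chunk", PaneID: "p1", Data: []byte{byte('a' + i)}})
	}
	snap := func(id string) []byte { return []byte("<full screen of " + id + ">") }
	for ev, ok := bus.Next(snap); ok; ev, ok = bus.Next(snap) {
		fmt.Printf("%s %s %q\n", ev.Kind, ev.PaneID, ev.Data)
	}
}
```

For brevity the sketch takes the snapshot while holding the bus lock and polls instead of waiting. A real version would need to order the snapshot against the pump through the per-child emulator lock, and wake the writer when events arrive.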

Phased roadmap (all phases land on this branch)

  0. Extract headless core behind loopback transport. daemon.Core + client over in-process Transport. Zero behavior change; harness green.
  1. Multi-project registry + per-client view scaffolding. Registry, per-project stores, ClientView, palette "Switch/Open project…", project tier in chrome. Still single local process.
  2. Out-of-process daemon over unix socket. Auto-start/attach; PTYs survive client exit; reconnect + snapshot-on-attach; explicit detach chord (Ctrl-]), with Ctrl-D left as PTY input per decision 7; pidfile/lock.
  3. TCP + auth. localhost TCP, then opt-in LAN bind; pairing/bearer token (TLS and cert pinning deferred per decision 6); remote trust-prompt routing.
  4. Per-client view fully realized + emulator sizing/display-owner. Independent focus/scroll/palette per client; multi-client on same/different projects; resize negotiation + letterbox.
  5. Hardening. systemd/launchd autostart, daemon stop|ls, idle-shutdown, backpressure, security review, CHANGELOG.

Risks / open questions for review

  • Heterogeneous client sizes vs one-PTY-one-size (display-owner + letterbox is the v1 answer — is it sufficient?).
  • Security escalation: a network client spawns processes / runs shell / injects input. Auth/TLS scope adequate?
  • Detach gesture: a dedicated chord (Ctrl-]) with Ctrl-D left as PTY input (decision 7). Acceptable UX?
  • Backpressure design — bounded bus + snapshot-on-pressure correct?
  • MCP identity uniqueness across projects after per-PID socket removal.
  • Is per-client view (decision #3) worth doing from Phase 1, or staged after a shared-focus interim that's faster to ship?
  • Splitting uiState (focus/palette/render caches/trust prompt/dims/outMu) out of the daemon is the largest refactor — sequencing concerns?