diff --git a/CHANGELOG.md b/CHANGELOG.md index e1d556f..f0ecaa2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,40 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## [Unreleased] ### Added +- Per-child idle-state classifier with five states (`idle`, `working`, + `thinking`, `permission`, `error`) and three pluggable strategies: + `output_activity` (claude / opencode defaults), `osc_title_stability` + (codex), and `osc_title_status` (gemini-style status-in-title agents). + Optional `permission_patterns` / `thinking_patterns` / `error_patterns` + regexes promote a base state when matched against the tail of recent + output. State and last-match reason are exposed via MCP on + `ProcessInfo` and `get_process_status` (`idle_state`, `idle_reason`). +- New `idle_detection` block on `preset.Preset` for setting the strategy + threshold, title-to-state map, and promoter regex lists. Bundled + defaults are shipped for the first-party claude / codex / opencode + presets. +- Sidebar now renders a state glyph per process row (`○` idle, `●` + working, `◐` thinking, `?` permission, `✕` error) and, when a process + has a pending or paused timer, appends a nearest-timer indicator + (`⏱ 12s` or `⏸ paused`). +- MCP timer surface expanded to match Solo's tool set: `timer_set`, + `timer_fire_when_idle_any`, `timer_fire_when_idle_all`, `timer_cancel`, + `timer_pause`, `timer_resume`, `timer_list`. Idle-aware timers + registered against already-idle children fire synchronously + (`status: already_satisfied`) for `idle_all`, and report + `already_idle` / `waiting_on` arrays so callers can introspect the + watch set. Timer bodies are delivered to the owner process via the + same orchestrator-injection path as `send_message`. +- Timer tools accept an explicit `owner_process_id` so top-level + (non-agent) callers — including the harness MCP client — can attribute + timers to a specific process. Omitting it treats the caller as the + orchestrator with universal cancel / pause / resume / list privileges. +- libghostty-vt `Title()` accessor on the emulator surface, polled from + the session pump so OSC 0/1/2 title updates feed into the classifier + without a callback round-trip. +- Harness `wait_until_mcp` step type that re-runs an MCP method until an + assertion (Equals / Contains) holds or the timeout elapses. Used by + the new idle / timer scenarios. - User-created top-level command processes now survive a patterm restart. Each spawn (palette form, command preset, or MCP `spawn_process` with `kind=command`) writes a record to @@ -64,6 +98,9 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). after a child program disables mouse tracking. ### Changed +- `timer_wait` is now a thin wrapper over the shared timer manager + (`timer_set` semantics). Existing callers see no behavioural change; + the timer is visible in `timer_list` while it's pending. - CLI flag parsing switched from Go's stdlib `flag` to `spf13/pflag`. `--project` (and the internal `--socket` / `--identity` / `--scenario` / `--patterm-bin` flags) are now the only accepted form @@ -71,6 +108,25 @@ loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). renders the canonical `--flag` form. ### Fixed +- `whoami` and `help("timers")` now advertise the full Solo-parity timer + surface (`timer_set`, `timer_fire_when_idle_any`, + `timer_fire_when_idle_all`, `timer_cancel`, `timer_pause`, + `timer_resume`, `timer_list`) so agents using either tool for + orientation discover them — previously only `timer_wait` was listed. +- Resuming a paused idle-aware timer now re-checks the satisfaction + condition. Previously, if every watched process became idle (or, for + `idle_any`, any non-baseline watcher went idle) while the timer was + paused, the timer stayed pending forever because no further state + transitions were observed. +- Fired and canceled timers are now removed from the timer registry, + so long-running patterm sessions no longer accumulate completed + timer records and message bodies. `timer_list` and the sidebar + indicator already filtered them out; only the in-memory leak is + fixed. +- Per-preset idle-detection config is now installed through `SpawnSpec` + before the child is published to the session, closing a race in + which the classifier goroutine could observe a freshly spawned + process before its preset's classifier strategy was attached. - Opening the command palette while a scratchpad was focused left the palette wedged — typing did nothing and Esc left the palette's top border drawn over the pad until you closed the pad with Ctrl-W and diff --git a/TODO.md b/TODO.md index d6a8774..37cc685 100644 --- a/TODO.md +++ b/TODO.md @@ -1,7 +1,16 @@ - [ ] We should probably rename the Kill terminology to Close instead, across processes and agents. - [ ] Exited shells are still being treated as active processes. They should be removed from the process list when they exit. - [ ] Shells should be renamed to terminals. "New Terminal" etc. - +- [ ] Codex seemed to think that it needed to launch patterm itself to get the mcp working +- [ ] I cant click and drag to select text from codex +- [ ] codex uses perl to interact with the socket rather than calling mcp tools + - when it _did_ open a sub claude it opened it as a separate tab rather than a sub-agent. +- [ ] codex rendering is VERY slow + - maybe we need to use diffing rather than rendering the entire viewport for performance +- We should add a --debug and --profile flag, so we can get detailed performance data and full logs of the agent output to be debugged later on. + - I don't mind what format this is in, ideally easy for LLMs to understand +- [ ] Resuming a long claude session takes a couple of seconds for the entire buffer to load in, it looks like it's scrolling down for a couple seconds. + - In raw alacritty this is instant, so there's some sort of performance issue with patterm's terminal emulation. # On Hold - [ ] There's a unicode being displayed in opencode [ON HOLD] diff --git a/internal/app/app.go b/internal/app/app.go index 8bf515d..e150b10 100644 --- a/internal/app/app.go +++ b/internal/app/app.go @@ -113,6 +113,11 @@ func Run(ctx context.Context, opts Options) error { ctx, cancel := context.WithCancel(ctx) defer cancel() + // Per-session idle-detection classifier. One goroutine ticks every + // 250ms over every live child and updates IdleState. It stops when + // ctx is cancelled. + go sess.runClassifier(ctx) + st := &uiState{ sess: sess, presets: presets, @@ -120,6 +125,7 @@ func Run(ctx context.Context, opts Options) error { pads: pads, chromeWake: make(chan struct{}, 1), trust: trustStore, + timers: host.timers, hostCols: cols, hostRows: rows, stdinTTY: term.IsTerminal(int(os.Stdin.Fd())), @@ -296,6 +302,7 @@ type uiState struct { launcher *Launcher pads *scratchpad.Store trust *trust.Store + timers *timerManager outMu sync.Mutex @@ -610,6 +617,14 @@ func (st *uiState) OnChildSpawned(c *Child) { st.drawStatusLine() } +// OnChildStateChanged repaints the sidebar whenever a child's +// idle-state badge flips. Cheap — the badge is the only chrome that +// reflects state today, and drawSidebar bails when the cached frame +// hasn't changed. +func (st *uiState) OnChildStateChanged(string, IdleState) { + st.drawSidebar() +} + // OnChildExited drops focus and shows the empty state if it was the // focused child. func (st *uiState) OnChildExited(c *Child) { diff --git a/internal/app/child.go b/internal/app/child.go index 993a6fa..17cc8f6 100644 --- a/internal/app/child.go +++ b/internal/app/child.go @@ -123,6 +123,19 @@ type Child struct { portsMu sync.Mutex ports []PortSighting + // Idle-detection state. idleState carries the classifier's current + // opinion (StateIdle / StateWorking / …). lastTitleNS is the wall + // time of the most recent OSC title change — separate from + // lastWriteNS so the osc_title_* strategies can ignore plain output + // churn. idleDetection is the compiled per-preset config, resolved + // once at spawn and immutable thereafter. + idleState atomic.Pointer[IdleState] + idleReason atomic.Pointer[string] + titleMu sync.RWMutex + title string + lastTitleNS atomic.Int64 + idleDetection *resolvedIdleDetection + cleanupMu sync.Mutex cleanupPaths []string restarting atomic.Bool @@ -330,6 +343,75 @@ func (c *Child) IdleMS() int64 { return (time.Now().UnixNano() - last) / int64(time.Millisecond) } +// TitleIdleMS returns how many milliseconds since the OSC window title +// last changed. 0 means "no title set yet". +func (c *Child) TitleIdleMS() int64 { + last := c.lastTitleNS.Load() + if last == 0 { + return 0 + } + return (time.Now().UnixNano() - last) / int64(time.Millisecond) +} + +// Title returns the most recent OSC 0/2 title. +func (c *Child) Title() string { + c.titleMu.RLock() + defer c.titleMu.RUnlock() + return c.title +} + +// recordTitle updates the cached title and bumps lastTitleNS when it +// actually changes. Called from Session.pumpChild after each PTY chunk +// — cheap because most chunks don't carry an OSC sequence. +func (c *Child) recordTitle(newTitle string) { + c.titleMu.Lock() + if c.title == newTitle { + c.titleMu.Unlock() + return + } + c.title = newTitle + c.titleMu.Unlock() + c.lastTitleNS.Store(time.Now().UnixNano()) +} + +// IdleState returns the classifier's current opinion. Empty string +// (StateUnknown) means the classifier hasn't run yet for this child. +func (c *Child) IdleState() IdleState { + p := c.idleState.Load() + if p == nil { + return StateUnknown + } + return *p +} + +// IdleReason returns the human-readable reason the classifier last +// recorded. Empty when no classification has happened yet. +func (c *Child) IdleReason() string { + p := c.idleReason.Load() + if p == nil { + return "" + } + return *p +} + +// setIdleState updates idleState + idleReason. Returns true when the +// state actually changed (so callers can fan out a notification). +func (c *Child) setIdleState(s IdleState, reason string) bool { + prev := c.IdleState() + if prev == s { + return false + } + c.idleState.Store(&s) + c.idleReason.Store(&reason) + return true +} + +// setIdleDetection installs the resolved per-preset idle-detection +// config. Called once at spawn; not safe to swap at runtime. +func (c *Child) setIdleDetection(r *resolvedIdleDetection) { + c.idleDetection = r +} + func (c *Child) recordWrite(chunk []byte) { c.lastWriteNS.Store(time.Now().UnixNano()) c.screenVersion.Add(1) diff --git a/internal/app/classifier.go b/internal/app/classifier.go new file mode 100644 index 0000000..ef8680c --- /dev/null +++ b/internal/app/classifier.go @@ -0,0 +1,96 @@ +package app + +import ( + "context" + "time" +) + +// classifierTickInterval is how often the per-session classifier wakes +// up to re-evaluate every child's state. 250ms is fast enough that +// the sidebar badge looks live, slow enough that the cost is invisible +// even with dozens of children. +const classifierTickInterval = 250 * time.Millisecond + +// classifierTailBytes is the size of the ring-buffer tail the +// classifier scans for promoter regexes. Big enough to catch a multi- +// line "Approve?" prompt, small enough that we don't pay for a full +// 1 MiB regex scan every tick. +const classifierTailBytes = 4096 + +// runClassifier loops over every live child every classifierTickInterval +// and updates IdleState when it changes. It runs until ctx is cancelled +// (the host shutdown path cancels). One goroutine per Session is plenty +// — the work is cheap (atomic loads + ~4 KiB regex scan per child). +func (s *Session) runClassifier(ctx context.Context) { + ticker := time.NewTicker(classifierTickInterval) + defer ticker.Stop() + for { + select { + case <-ctx.Done(): + return + case <-ticker.C: + s.classifyAll() + } + } +} + +func (s *Session) classifyAll() { + for _, c := range s.Children() { + s.classifyOne(c) + } +} + +func (s *Session) classifyOne(c *Child) { + st := c.Status() + exited := st == StatusExited || st == StatusErrored + exitNonZero := false + if exited { + exitNonZero = c.ExitCode() != 0 + } + idleMS := c.IdleMS() + titleIdleMS := c.TitleIdleMS() + title := c.Title() + tail := c.tailBytes(classifierTailBytes) + state, reason := classify(c.idleDetection, exited, exitNonZero, idleMS, titleIdleMS, title, tail) + if c.setIdleState(state, reason) { + s.emitStateChanged(c.ID, state) + } +} + +// tailBytes returns up to n bytes from the end of the ring buffer. +// Safe to call from the classifier goroutine while pumpChild writes +// from another goroutine — both serialise on ringMu. +func (c *Child) tailBytes(n int) []byte { + c.ringMu.Lock() + defer c.ringMu.Unlock() + have := int64(ringCap) + if !c.ringFull { + have = c.ringWrites + } + if have == 0 { + return nil + } + want := int64(n) + if want > have { + want = have + } + out := make([]byte, want) + // The ring layout matches StreamRead: when not full, byte k lives + // at index k; when full, the oldest byte sits at ringPos and the + // newest at (ringPos-1) mod ringCap. + if !c.ringFull { + copy(out, c.ring[c.ringWrites-want:c.ringWrites]) + return out + } + // Tail starts `want` bytes back from the write head. + start := (c.ringPos - int(want) + ringCap) % ringCap + first := ringCap - start + if first > int(want) { + first = int(want) + } + copy(out, c.ring[start:start+first]) + if first < int(want) { + copy(out[first:], c.ring[:int(want)-first]) + } + return out +} diff --git a/internal/app/host.go b/internal/app/host.go index 6c96706..3917431 100644 --- a/internal/app/host.go +++ b/internal/app/host.go @@ -61,12 +61,11 @@ type toolHost struct { prompter trustPrompter scratch scratchpadSink - timersMu sync.Mutex - nextTimer int + timers *timerManager } func newToolHost(sess *Session, pads *scratchpad.Store, launcher *Launcher, presets preset.Set, tr *trust.Store, cols, rows uint16) *toolHost { - return &toolHost{ + h := &toolHost{ sess: sess, pads: pads, launcher: launcher, @@ -76,6 +75,28 @@ func newToolHost(sess *Session, pads *scratchpad.Store, launcher *Launcher, pres defaultRow: rows, startedAt: make(map[string]time.Time), } + h.timers = newTimerManager(sess) + // Plug the timer manager into the session's state-change fan-out so + // idle-aware timers fire when watched children transition into idle. + // Tests can construct a host with a nil session for sizing checks — + // those never run timers, so the subscribe is skipped. + if sess != nil { + sess.Subscribe(timerListenerAdapter{m: h.timers}) + } + return h +} + +// timerListenerAdapter forwards OnChildStateChanged into the timer +// manager and ignores the other ChildEventListener methods. The +// session's listener API is by-interface, so we wrap the manager +// rather than make it implement the full surface. +type timerListenerAdapter struct{ m *timerManager } + +func (a timerListenerAdapter) OnChildSpawned(*Child) {} +func (a timerListenerAdapter) OnChildExited(*Child) {} +func (a timerListenerAdapter) OnPTYOut(string, []byte) {} +func (a timerListenerAdapter) OnChildStateChanged(id string, st IdleState) { + a.m.onChildStateChanged(id, st) } func (h *toolHost) SetSize(cols, rows uint16) { @@ -531,6 +552,7 @@ func (n *chunkNotifier) OnPTYOut(id string, chunk []byte) { default: } } +func (n *chunkNotifier) OnChildStateChanged(string, IdleState) {} func (h *toolHost) GetProcessPorts(callerID, processID string) ([]mcp.PortSighting, error) { c := h.sess.FindChild(processID) @@ -725,27 +747,59 @@ func (h *toolHost) RequestHumanAttention(callerID, processID, reason string) err return nil } +// TimerWait is the legacy fire-and-forget delay timer. It now wraps +// TimerSet with an empty body — defaultFireFn substitutes the +// "[system] Your timer […] has completed." line so behaviour matches +// the original API. New callers should use timer_set with an explicit +// body. func (h *toolHost) TimerWait(callerID string, seconds float64, label string) (string, error) { - caller := h.sess.FindChild(callerID) - if caller == nil { - return "", mcp.Errorf(mcp.ErrorKindNotFound, "caller %q not known to patterm", callerID) + return h.timers.TimerSet(callerID, "", label, seconds) +} + +func (h *toolHost) TimerSet(callerID string, args mcp.TimerSetArgs) (mcp.TimerHandle, error) { + owner := resolveTimerOwner(callerID, args.OwnerProcessID) + id, err := h.timers.TimerSet(owner, args.Body, args.Label, args.Seconds) + if err != nil { + return mcp.TimerHandle{}, err } - h.timersMu.Lock() - h.nextTimer++ - id := fmt.Sprintf("t%d", h.nextTimer) - h.timersMu.Unlock() - if label == "" { - label = id + return mcp.TimerHandle{ID: id}, nil +} + +func (h *toolHost) TimerFireWhenIdleAny(callerID string, args mcp.TimerFireWhenIdleArgs) (mcp.TimerFireWhenIdleResponse, error) { + owner := resolveTimerOwner(callerID, args.OwnerProcessID) + return h.timers.TimerFireWhenIdleAny(owner, args.Body, args.Label, args.Watched, args.MaxWaitSeconds) +} + +func (h *toolHost) TimerFireWhenIdleAll(callerID string, args mcp.TimerFireWhenIdleArgs) (mcp.TimerFireWhenIdleResponse, error) { + owner := resolveTimerOwner(callerID, args.OwnerProcessID) + return h.timers.TimerFireWhenIdleAll(owner, args.Body, args.Label, args.Watched, args.MaxWaitSeconds) +} + +// resolveTimerOwner picks the owner process for a timer. Explicit +// owner_process_id wins; otherwise the caller's own id is used. +// Top-level MCP clients (no callerID) must provide owner_process_id +// explicitly. +func resolveTimerOwner(callerID, explicit string) string { + if explicit != "" { + return explicit } - go func() { - time.Sleep(time.Duration(seconds * float64(time.Second))) - if !caller.IsLive() { - return - } - line := fmt.Sprintf("[system] Your timer [%s] has completed.\r", label) - _ = caller.InjectAsOrchestrator([]byte(line)) - }() - return id, nil + return callerID +} + +func (h *toolHost) TimerCancel(callerID, id string) error { + return h.timers.TimerCancel(callerID, id) +} + +func (h *toolHost) TimerPause(callerID, id string) error { + return h.timers.TimerPause(callerID, id) +} + +func (h *toolHost) TimerResume(callerID, id string) error { + return h.timers.TimerResume(callerID, id) +} + +func (h *toolHost) TimerList(callerID string) ([]mcp.TimerInfo, error) { + return h.timers.TimerList(callerID), nil } // ─────────────────────────────────────────────────────────────────── @@ -816,6 +870,10 @@ func (h *toolHost) processInfoOf(c *Child) mcp.ProcessInfo { t := h.trust.IsTrusted(c.PresetRef) info.Trusted = &t } + if s := c.IdleState(); s != StateUnknown { + info.IdleState = string(s) + info.IdleReason = c.IdleReason() + } return info } @@ -1026,7 +1084,9 @@ func availableToolsForRole(role mcp.CallerRole) []string { "list_processes", "get_process_status", "get_project_status", "get_process_output", "get_process_raw_output", "search_output", "wait_for_pattern", "get_process_ports", - "send_input", "send_message", "request_human_attention", "timer_wait", + "send_input", "send_message", "request_human_attention", + "timer_wait", "timer_set", "timer_fire_when_idle_any", "timer_fire_when_idle_all", + "timer_cancel", "timer_pause", "timer_resume", "timer_list", "scratchpad_list", "scratchpad_read", "scratchpad_write", "scratchpad_append", "whoami", "help", } @@ -1056,8 +1116,8 @@ func helpFor(topic string) mcp.HelpResponse { } case "lifecycle": return mcp.HelpResponse{ - Topic: "lifecycle", - Content: "You own the processes you spawn. When a sub-agent has finished its task (it reports back via send_message, or you've collected what you need from it) call close_process on its process_id to remove the entry and tear down the PTY. Same goes for spawn_process children: command/terminal panes you started are not auto-reclaimed when their work completes. close_process is the normal cleanup path; stop_process(signal) is for sending a signal without removing the entry; start_process re-attaches an exited command preset. Leaving idle sub-agents around wastes vendor tokens and clutters the host — close them as soon as you're done. Sub-agents themselves are reminded (via the [system: …] preface on their first prompt) to clean up anything they created before reporting done.", + Topic: "lifecycle", + Content: "You own the processes you spawn. When a sub-agent has finished its task (it reports back via send_message, or you've collected what you need from it) call close_process on its process_id to remove the entry and tear down the PTY. Same goes for spawn_process children: command/terminal panes you started are not auto-reclaimed when their work completes. close_process is the normal cleanup path; stop_process(signal) is for sending a signal without removing the entry; start_process re-attaches an exited command preset. Leaving idle sub-agents around wastes vendor tokens and clutters the host — close them as soon as you're done. Sub-agents themselves are reminded (via the [system: …] preface on their first prompt) to clean up anything they created before reporting done.", RelatedTools: []string{"close_process", "stop_process", "start_process", "list_processes", "get_process_status"}, } case "inspection": @@ -1086,9 +1146,18 @@ func helpFor(topic string) mcp.HelpResponse { } case "timers": return mcp.HelpResponse{ - Topic: "timers", - Content: "timer_wait returns a timer_id immediately and injects `[system] Your timer [