Compress agent skills

This commit is contained in:
2026-05-11 12:05:04 +01:00
parent e145482cd3
commit 7fed91d98b
20 changed files with 1009 additions and 2350 deletions

View File

@@ -1,539 +1,90 @@
---
name: agent-browser
description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*)
description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task.
allowed-tools: Bash(npx agent-browser:*) Bash(agent-browser:*)
---
# Browser Automation with agent-browser
# Browser Automation With agent-browser
## Core Workflow
Use `agent-browser` for browser tasks. Prefer the direct binary over `npx agent-browser` when available.
Every browser automation follows this pattern:
## Core Loop
1. **Navigate**: `agent-browser open <url>`
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
3. **Interact**: Use refs to click, fill, select
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
1. Navigate: `agent-browser open <url>`.
2. Wait: `agent-browser wait --load networkidle` when page load matters.
3. Snapshot: `agent-browser snapshot -i` to get refs such as `@e1`.
4. Interact with refs: `click`, `fill`, `select`, `check`, `press`, `scroll`.
5. Re-snapshot after navigation, form submission, modal/dropdown changes, or dynamic loading.
6. Verify with `snapshot`, `get`, `screenshot`, or `diff snapshot`.
```bash
agent-browser open https://example.com/form
agent-browser wait --load networkidle
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
agent-browser snapshot -i
```
## Command Chaining
Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
```bash
# Chain open + wait + snapshot in one call
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
# Chain multiple interactions
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3
# Navigate and capture
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```
**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
## Essential Commands
```bash
# Navigation
agent-browser open <url> # Navigate (aliases: goto, navigate)
agent-browser close # Close browser
# Snapshot
agent-browser snapshot -i # Interactive elements with refs (recommended)
agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, cursor:pointer)
agent-browser snapshot -s "#selector" # Scope to CSS selector
# Interaction (use @refs from snapshot)
agent-browser click @e1 # Click element
agent-browser click @e1 --new-tab # Click and open in new tab
agent-browser fill @e2 "text" # Clear and type text
agent-browser type @e2 "text" # Type without clearing
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1 # Check checkbox
agent-browser press Enter # Press key
agent-browser keyboard type "text" # Type at current focus (no selector)
agent-browser keyboard inserttext "text" # Insert without key events
agent-browser scroll down 500 # Scroll page
agent-browser scroll down 500 --selector "div.content" # Scroll within a specific container
# Get information
agent-browser get text @e1 # Get element text
agent-browser get url # Get current URL
agent-browser get title # Get page title
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page" # Wait for URL pattern
agent-browser wait 2000 # Wait milliseconds
# Downloads
agent-browser download @e1 ./file.pdf # Click element to trigger download
agent-browser wait --download ./output.zip # Wait for any download to complete
agent-browser --download-path ./downloads open <url> # Set default download directory
# Capture
agent-browser screenshot # Screenshot to temp dir
agent-browser screenshot --full # Full page screenshot
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
agent-browser pdf output.pdf # Save as PDF
# Diff (compare page states)
agent-browser diff snapshot # Compare current vs last snapshot
agent-browser diff snapshot --baseline before.txt # Compare current vs saved file
agent-browser diff screenshot --baseline before.png # Visual pixel diff
agent-browser diff url <url1> <url2> # Compare two pages
agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait strategy
agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
```
## Common Patterns
### Form Submission
```bash
agent-browser open https://example.com/signup
agent-browser open <url>
agent-browser close
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
```
### Authentication with Auth Vault (Recommended)
```bash
# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY)
# Recommended: pipe password via stdin to avoid shell history exposure
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
# Login using saved profile (LLM never sees password)
agent-browser auth login github
# List/show/delete profiles
agent-browser auth list
agent-browser auth show github
agent-browser auth delete github
```
### Authentication with State Persistence
```bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
### Session Persistence
```bash
# Auto-save/restore cookies and localStorage across browser restarts
agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close # State auto-saved to ~/.agent-browser/sessions/
# Next time, state is auto-loaded
agent-browser --session-name myapp open https://app.example.com/dashboard
# Encrypt state at rest
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
agent-browser --session-name secure open https://app.example.com
# Manage saved states
agent-browser state list
agent-browser state show myapp-default.json
agent-browser state clear myapp
agent-browser state clean --older-than 7
```
### Data Extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5 # Get specific element text
agent-browser get text body > page.txt # Get all page text
# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
### Parallel Sessions
```bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
agent-browser session list
```
### Connect to Existing Chrome
```bash
# Auto-discover running Chrome with remote debugging enabled
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect snapshot
# Or with explicit CDP port
agent-browser --cdp 9222 snapshot
```
### Color Scheme (Dark Mode)
```bash
# Persistent dark mode via flag (applies to all pages and new tabs)
agent-browser --color-scheme dark open https://example.com
# Or via environment variable
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
# Or set during session (persists for subsequent commands)
agent-browser set media dark
```
### Visual Browser (Debugging)
```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1 # Highlight element
agent-browser record start demo.webm # Record session
agent-browser profiler start # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile (path optional)
```
Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode.
### Local Files (PDFs, HTML)
```bash
# Open local files with file:// URLs
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png
```
### iOS Simulator (Mobile Safari)
```bash
# List available iOS simulators
agent-browser device list
# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1 # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up # Mobile-specific gesture
# Take screenshot
agent-browser -p ios screenshot mobile.png
# Close session (shuts down simulator)
agent-browser -p ios close
```
**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)
**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.
## Security
All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output.
### Content Boundaries (Recommended for AI Agents)
Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content:
```bash
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
agent-browser snapshot
# Output:
# --- AGENT_BROWSER_PAGE_CONTENT nonce=<hex> origin=https://example.com ---
# [accessibility tree]
# --- END_AGENT_BROWSER_PAGE_CONTENT nonce=<hex> ---
```
### Domain Allowlist
Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. Include CDN domains your target pages depend on:
```bash
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
agent-browser open https://example.com # OK
agent-browser open https://malicious.com # Blocked
```
### Action Policy
Use a policy file to gate destructive actions:
```bash
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```
Example `policy.json`:
```json
{"default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"]}
```
Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies.
### Output Limits
Prevent context flooding from large pages:
```bash
export AGENT_BROWSER_MAX_OUTPUT=50000
```
## Diffing (Verifying Changes)
Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session.
```bash
# Typical workflow: snapshot -> action -> diff
agent-browser snapshot -i # Take baseline snapshot
agent-browser click @e2 # Perform action
agent-browser diff snapshot # See what changed (auto-compares to last snapshot)
```
For visual regression testing or monitoring:
```bash
# Save a baseline screenshot, then compare later
agent-browser screenshot baseline.png
# ... time passes or changes are made ...
agent-browser diff screenshot --baseline baseline.png
# Compare staging vs production
agent-browser diff url https://staging.example.com https://prod.example.com --screenshot
```
`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage.
## Timeouts and Slow Pages
The default Playwright timeout is 25 seconds for local browsers. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). For slow websites or large pages, use explicit waits instead of relying on the default timeout:
```bash
# Wait for network activity to settle (best for slow pages)
agent-browser wait --load networkidle
# Wait for a specific element to appear
agent-browser wait "#content"
agent-browser wait @e1
# Wait for a specific URL pattern (useful after redirects)
agent-browser wait --url "**/dashboard"
# Wait for a JavaScript condition
agent-browser wait --fn "document.readyState === 'complete'"
# Wait a fixed duration (milliseconds) as a last resort
agent-browser wait 5000
```
When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait <selector>` or `wait @ref`.
## Session Management and Cleanup
When running multiple agents or automations concurrently, always use named sessions to avoid conflicts:
```bash
# Each agent gets its own isolated session
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com
# Check active sessions
agent-browser session list
```
Always close your browser session when done to avoid leaked processes:
```bash
agent-browser close # Close default session
agent-browser --session agent1 close # Close specific session
```
If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
## Ref Lifecycle (Important)
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
```bash
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # MUST re-snapshot
agent-browser click @e1 # Use new refs
```
## Annotated Screenshots (Vision Mode)
Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot.
```bash
agent-browser screenshot --annotate
# Output includes the image path and a legend:
# [1] @e1 button "Submit"
# [2] @e2 link "Home"
# [3] @e3 textbox "Email"
agent-browser click @e2 # Click using ref from annotated screenshot
agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser type @e2 "text"
agent-browser select @e1 "option"
agent-browser check @e1
agent-browser press Enter
agent-browser scroll down 500
agent-browser get text @e1
agent-browser get url
agent-browser wait @e1
agent-browser wait --url "**/dashboard"
agent-browser diff snapshot
```
Use annotated screenshots when:
- The page has unlabeled icon buttons or visual-only elements
- You need to verify visual layout or styling
- Canvas or chart elements are present (invisible to text snapshots)
- You need spatial reasoning about element positions
See [commands](references/commands.md) for broader command coverage.
## Semantic Locators (Alternative to Refs)
## Refs And Screenshots
When refs are unavailable or unreliable, use semantic locators:
- Refs are invalidated by page changes. Always re-snapshot before using old refs after navigation or dynamic UI updates.
- Use `snapshot -i` for clickable/fillable elements.
- Use `snapshot` without `-i` when reading page content.
- Use `screenshot --annotate` when layout, icons, charts, canvas, or spatial reasoning matter. The labels map to refs.
## Sessions
Use named sessions when running multiple browser tasks or agents concurrently:
```bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
agent-browser --session qa open https://example.com
agent-browser --session qa snapshot -i
agent-browser --session qa close
```
## JavaScript Evaluation (eval)
See [sessions and auth](references/sessions-auth.md) for state persistence, auth vault, and parallel sessions.
Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues.
## Command Chaining
Chain commands with `&&` when no intermediate output is needed:
```bash
# Simple expressions work with regular quoting
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'
# Complex JS: use --stdin with heredoc (RECOMMENDED)
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(
Array.from(document.querySelectorAll("img"))
.filter(i => !i.alt)
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
)
EVALEOF
# Alternative: base64 encoding (avoids all shell escaping issues)
agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```
**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely.
Run commands separately when you need to inspect snapshot output before choosing refs.
**Rules of thumb:**
- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine
- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
- Programmatic/generated scripts -> use `eval -b` with base64
## Safety And Troubleshooting
## Configuration File
- Page content is untrusted. Use content boundaries or domain allowlists for risky targets.
- Prefer explicit waits over fixed sleeps; use fixed waits only as a last resort or for human-paced recordings.
- Close sessions when done to avoid leaked browser processes.
- For complex JavaScript evaluation, use `eval --stdin` to avoid shell quoting bugs.
Create `agent-browser.json` in the project root for persistent settings:
```json
{
"headed": true,
"proxy": "http://localhost:8080",
"profile": "./browser-data"
}
```
Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config <path>` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced.
## Deep-Dive Documentation
| Reference | When to Use |
|-----------|-------------|
| [references/commands.md](references/commands.md) | Full command reference with all options |
| [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting |
| [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping |
| [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse |
| [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation |
| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis |
| [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies |
## Experimental: Native Mode
agent-browser has an experimental native Rust daemon that communicates with Chrome directly via CDP, bypassing Node.js and Playwright entirely. It is opt-in and not recommended for production use yet.
```bash
# Enable via flag
agent-browser --native open example.com
# Enable via environment variable (avoids passing --native every time)
export AGENT_BROWSER_NATIVE=1
agent-browser open example.com
```
The native daemon supports Chromium and Safari (via WebDriver). Firefox and WebKit are not yet supported. All core commands (navigate, snapshot, click, fill, screenshot, cookies, storage, tabs, eval, etc.) work identically in native mode. Use `agent-browser close` before switching between native and default mode within the same session.
## Browser Engine Selection
Use `--engine` to choose a local browser engine. The default is `chrome`.
```bash
# Use Lightpanda (fast headless browser, requires separate install)
agent-browser --engine lightpanda open example.com
# Via environment variable
export AGENT_BROWSER_ENGINE=lightpanda
agent-browser open example.com
# With custom binary path
agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com
```
Supported engines:
- `chrome` (default) -- Chrome/Chromium via CDP
- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome)
Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
## Ready-to-Use Templates
| Template | Description |
|----------|-------------|
| [templates/form-automation.sh](templates/form-automation.sh) | Form filling with validation |
| [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state |
| [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots |
```bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
```
See [security](references/security.md), [eval](references/eval.md), and [advanced](references/advanced.md) for niche browser modes and troubleshooting.

View File

@@ -0,0 +1,66 @@
# Advanced Browser Modes
## Headed Debugging
```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1
agent-browser record start demo.webm
agent-browser profiler start
agent-browser profiler stop trace.json
```
Or set `AGENT_BROWSER_HEADED=1`.
## Local Files
```bash
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png
```
## iOS Simulator
Requires macOS, Xcode, Appium, and the xcuitest driver.
```bash
agent-browser device list
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1
agent-browser -p ios swipe up
agent-browser -p ios screenshot mobile.png
agent-browser -p ios close
```
## Color Scheme
```bash
agent-browser --color-scheme dark open https://example.com
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
agent-browser set media dark
```
## Configuration
Project config file: `agent-browser.json`.
```json
{
"headed": true,
"proxy": "http://localhost:8080",
"profile": "./browser-data"
}
```
Priority: user config < project config < environment variables < CLI flags.
## Engine Selection
```bash
agent-browser --engine lightpanda open example.com
export AGENT_BROWSER_ENGINE=lightpanda
```
Supported engines include `chrome` and `lightpanda`. Lightpanda is faster but lacks some Chrome features such as extensions and profiles.

View File

@@ -1,264 +1,72 @@
# Command Reference
# agent-browser Commands
Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
## Navigation
## Navigation And Snapshot
```bash
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
# Supports: https://, http://, file://, about:, data://
# Auto-prepends https:// if no protocol given
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser (aliases: quit, exit)
agent-browser connect 9222 # Connect to browser via CDP port
agent-browser open <url>
agent-browser close
agent-browser snapshot -i
agent-browser snapshot -i -C
agent-browser snapshot -s "#selector"
```
## Snapshot (page analysis)
## Interaction
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
agent-browser click @e1
agent-browser click @e1 --new-tab
agent-browser fill @e2 "text"
agent-browser type @e2 "text"
agent-browser select @e1 "option"
agent-browser check @e1
agent-browser press Enter
agent-browser keyboard type "text"
agent-browser keyboard inserttext "text"
agent-browser scroll down 500
agent-browser scroll down 500 --selector "div.content"
```
## Interactions (use @refs from snapshot)
## Information And Waits
```bash
agent-browser click @e1 # Click
agent-browser click @e1 --new-tab # Click and open in new tab
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key (alias: key)
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown option
agent-browser select @e1 "a" "b" # Select multiple options
agent-browser scroll down 500 # Scroll page (default: down 300px)
agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
agent-browser get text @e1
agent-browser get url
agent-browser get title
agent-browser wait @e1
agent-browser wait --load networkidle
agent-browser wait --url "**/page"
agent-browser wait 2000
```
## Get Information
## Capture And Diff
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
agent-browser screenshot
agent-browser screenshot --full
agent-browser screenshot --annotate
agent-browser pdf output.pdf
agent-browser diff snapshot
agent-browser diff snapshot --baseline before.txt
agent-browser diff screenshot --baseline before.png
agent-browser diff url <url1> <url2> --wait-until networkidle
```
## Check State
## Downloads
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
agent-browser download @e1 ./file.pdf
agent-browser wait --download ./output.zip
agent-browser --download-path ./downloads open <url>
```
## Screenshots and PDF
## Semantic Locators
Use when refs are unavailable or unreliable:
```bash
agent-browser screenshot # Save to temporary directory
agent-browser screenshot path.png # Save to specific path
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
## Video Recording
```bash
agent-browser record start ./demo.webm # Start recording
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new
```
## Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text (or -t)
agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
agent-browser wait --load networkidle # Wait for network idle (or -l)
agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
```
## Mouse Control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
## Semantic Locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find text "Sign In" click --exact # Exact match only
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find alt "Logo" click
agent-browser find title "Close" click
agent-browser find testid "submit-btn" click
agent-browser find first ".item" click
agent-browser find last ".item" click
agent-browser find nth 2 "a" hover
```
## Browser Settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth (alias: auth)
agent-browser set media dark # Emulate color scheme
agent-browser set media light reduced-motion # Light mode + reduced motion
```
## Cookies and Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
```
## Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
## Tabs and Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab by index
agent-browser tab close # Close current tab
agent-browser tab close 2 # Close tab by index
agent-browser window new # New window
```
## Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
## Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
## JavaScript
```bash
agent-browser eval "document.title" # Simple expressions only
agent-browser eval -b "<base64>" # Any JavaScript (base64 encoded)
agent-browser eval --stdin # Read script from stdin
```
Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
```bash
# Base64 encode your script, then:
agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
# Or use stdin with heredoc for multiline scripts:
cat <<'EOF' | agent-browser eval --stdin
const links = document.querySelectorAll('a');
Array.from(links).map(a => a.href);
EOF
```
## State Management
```bash
agent-browser state save auth.json # Save cookies, storage, auth state
agent-browser state load auth.json # Restore saved state
```
## Global Options
```bash
agent-browser --session <name> ... # Isolated browser session
agent-browser --json ... # JSON output for parsing
agent-browser --headed ... # Show browser window (not headless)
agent-browser --full ... # Full page screenshot (-f)
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
agent-browser -p <provider> ... # Cloud browser provider (--provider)
agent-browser --proxy <url> ... # Use proxy server
agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
agent-browser --executable-path <p> # Custom browser executable
agent-browser --extension <path> ... # Load browser extension (repeatable)
agent-browser --ignore-https-errors # Ignore SSL certificate errors
agent-browser --help # Show help (-h)
agent-browser --version # Show version (-V)
agent-browser <command> --help # Show detailed help for a command
```
## Debugging
```bash
agent-browser --headed open example.com # Show browser window
agent-browser --cdp 9222 snapshot # Connect via CDP port
agent-browser connect 9222 # Alternative: connect command
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
agent-browser profiler start # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile
```
## Environment Variables
```bash
AGENT_BROWSER_SESSION="mysession" # Default session name
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
```

View File

@@ -0,0 +1,34 @@
# JavaScript Evaluation
Use `eval` to run JavaScript in the browser context.
Simple expressions can use regular shell quoting:
```bash
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'
```
For complex JavaScript, use `--stdin` to avoid shell quoting corruption:
```bash
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(
Array.from(document.querySelectorAll("img"))
.filter(i => !i.alt)
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
)
EVALEOF
```
Use base64 for generated scripts:
```bash
agent-browser eval -b "$(printf '%s' 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
```
Rules of thumb:
- Single-line with no nested quotes: regular `eval 'expression'`.
- Nested quotes, arrow functions, template literals, or multiline: `eval --stdin`.
- Programmatic scripts: `eval -b`.

View File

@@ -0,0 +1,43 @@
# Security And Boundaries
By default, `agent-browser` imposes no navigation, action, or output restrictions.
## Content Boundaries
Wrap page-sourced output so agents can distinguish untrusted page content:
```bash
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
agent-browser snapshot
```
## Domain Allowlist
Restrict navigation and subresource connections:
```bash
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
agent-browser open https://example.com
```
Include CDN domains the page needs.
## Action Policy
```bash
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```
Example policy:
```json
{"default":"deny","allow":["navigate","snapshot","click","scroll","wait","get"]}
```
Auth vault operations bypass action policy, but domain allowlist still applies.
## Output Limits
```bash
export AGENT_BROWSER_MAX_OUTPUT=50000
```

View File

@@ -0,0 +1,49 @@
# Sessions And Authentication
## Named Sessions
```bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser session list
agent-browser --session site1 close
```
## Auth Vault
Pipe passwords via stdin to avoid shell history exposure:
```bash
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
agent-browser auth login github
agent-browser auth list
agent-browser auth show github
agent-browser auth delete github
```
## State Files
```bash
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
agent-browser state load auth.json
```
## Session Persistence
```bash
agent-browser --session-name myapp open https://app.example.com/login
agent-browser close
agent-browser --session-name myapp open https://app.example.com/dashboard
agent-browser state list
agent-browser state clear myapp
agent-browser state clean --older-than 7
```
Set `AGENT_BROWSER_ENCRYPTION_KEY` to encrypt stored state at rest.

View File

@@ -1,64 +0,0 @@
---
name: code-simplifier
description: Simplifies and refines code for clarity, consistency, and maintainability while preserving all functionality. Works on recently modified code, a set of changes, or the whole codebase.
argument-hint: "[file, directory, code changes, or whole codebase]"
user-invocable: true
---
# Code Simplifier
You are an expert code simplification specialist focused on enhancing code clarity, consistency, and maintainability while preserving exact functionality. Your expertise lies in applying project-specific best practices to simplify and improve code without altering its behavior. You prioritize readable, explicit code over overly compact solutions. You can work at any scope: recently modified code, a set of changes (e.g. a git diff), or the entire codebase.
## Principles
### 1. Preserve Functionality
Never change what the code does — only how it does it. All original features, outputs, and behaviors must remain intact.
### 2. Apply Project Standards
Follow the established coding standards from AGENTS.md or similar conventions files (CLAUDE.md, GEMINI.md), using priority order AGENTS.md -> CLAUDE.md -> GEMINI.md. Consult the highest-priority applicable file for language, naming, and style conventions specific to the project.
### 3. Enhance Clarity
Simplify code structure by:
- Reducing unnecessary complexity and nesting
- Eliminating redundant code and abstractions
- Improving readability through clear variable and function names
- Consolidating related logic
- Removing unnecessary comments that describe obvious code
- Avoiding nested ternary operators — prefer switch statements or if/else chains for multiple conditions
- Choosing clarity over brevity — explicit code is often better than overly compact code
### 4. Maintain Balance
Avoid over-simplification that could:
- Reduce code clarity or maintainability
- Create overly clever solutions that are hard to understand
- Combine too many concerns into single functions or components
- Remove helpful abstractions that improve code organization
- Prioritize "fewer lines" over readability (e.g. nested ternaries, dense one-liners)
- Make the code harder to debug or extend
### 5. Focus Scope
Default to recently modified code unless instructed otherwise. Supported scopes:
- **Code changes**: a git diff, staged changes, or a specific changeset
- **File or directory**: a specific file or directory passed as input
- **Whole codebase**: a full pass over the project when explicitly requested
When given a broader scope, prioritize files with the most complexity or recent churn.
## Process
1. Identify the recently modified code sections
2. Analyze for opportunities to improve elegance and consistency
3. Apply project-specific best practices from AGENTS.md or similar conventions files (CLAUDE.md, GEMINI.md), using priority order AGENTS.md -> CLAUDE.md -> GEMINI.md
4. Ensure all functionality remains unchanged
5. Verify the refined code is simpler and more maintainable
6. Document only significant changes that affect understanding
Operate proactively when invoked after code changes. When given a broader scope, work systematically through the target files. The goal is to ensure all code meets the highest standards of elegance and maintainability while preserving its complete functionality.

View File

@@ -1,220 +1,81 @@
---
name: dogfood
description: Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
allowed-tools: Bash(agent-browser:*), Bash(npx agent-browser:*)
description: Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with reproduction evidence.
allowed-tools: Bash(agent-browser:*) Bash(npx agent-browser:*)
---
# Dogfood
Systematically explore a web application, find issues, and produce a report with full reproduction evidence for every finding.
Systematically explore a web application, find issues, and produce a report with reproduction evidence for each finding.
## Setup
## Defaults
Only the **Target URL** is required. Everything else has sensible defaults -- use them unless the user explicitly provides an override.
Only the target URL is required. Use defaults unless the user overrides them.
| Parameter | Default | Example override |
|-----------|---------|-----------------|
| **Target URL** | _(required)_ | `vercel.com`, `http://localhost:3000` |
| **Session name** | Slugified domain (e.g., `vercel.com` -> `vercel-com`) | `--session my-session` |
| **Output directory** | `./dogfood-output/` | `Output directory: /tmp/qa` |
| **Scope** | Full app | `Focus on the billing page` |
| **Authentication** | None | `Sign in to user@example.com` |
| Parameter | Default |
|-----------|---------|
| Target URL | Required |
| Session name | Slugified domain |
| Output directory | `./dogfood-output/` |
| Scope | Full app |
| Authentication | None |
If the user says something like "dogfood vercel.com", start immediately with defaults. Do not ask clarifying questions unless authentication is mentioned but credentials are missing.
If the user says "dogfood <url>", start immediately. Ask only when authentication is required but credentials are missing.
Always use `agent-browser` directly -- never `npx agent-browser`. The direct binary uses the fast Rust client. `npx` routes through Node.js and is significantly slower.
Use `agent-browser` directly, not `npx agent-browser`, when available.
## Workflow
```
1. Initialize Set up session, output dirs, report file
2. Authenticate Sign in if needed, save state
3. Orient Navigate to starting point, take initial snapshot
4. Explore Systematically visit pages and test features
5. Document Screenshot + record each issue as found
6. Wrap up Update summary counts, close session
```
### 1. Initialize
1. Initialize output dirs and copy `templates/dogfood-report-template.md` to `{OUTPUT_DIR}/report.md`.
2. Open `{TARGET_URL}` in a named `agent-browser` session and wait for `networkidle`.
3. Authenticate if needed. Ask the user for OTP/email codes.
4. Take initial annotated screenshot and `snapshot -i`.
5. Read `references/issue-taxonomy.md` for severity and exploration checklist.
6. Explore top-level navigation, core workflows, forms, empty/error states, boundaries, modals/dropdowns, console errors, and realistic create/edit/delete flows.
7. Document each issue immediately as it is found; never batch findings at the end.
8. Aim for 5-10 well-documented issues. Depth of evidence beats count.
9. Re-read the report, update summary counts, close the browser session, and summarize findings.
```bash
mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos
```
Copy the report template into the output directory and fill in the header fields:
```bash
cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md
```
Start a named session:
```bash
agent-browser --session {SESSION} open {TARGET_URL}
agent-browser --session {SESSION} wait --load networkidle
```
### 2. Authenticate
If the app requires login:
```bash
agent-browser --session {SESSION} snapshot -i
# Identify login form refs, fill credentials
agent-browser --session {SESSION} fill @e1 "{EMAIL}"
agent-browser --session {SESSION} fill @e2 "{PASSWORD}"
agent-browser --session {SESSION} click @e3
agent-browser --session {SESSION} wait --load networkidle
```
For OTP/email codes: ask the user, wait for their response, then enter the code.
After successful login, save state for potential reuse:
```bash
agent-browser --session {SESSION} state save {OUTPUT_DIR}/auth-state.json
```
### 3. Orient
Take an initial annotated screenshot and snapshot to understand the app structure:
```bash
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/initial.png
agent-browser --session {SESSION} snapshot -i
```
Identify the main navigation elements and map out the sections to visit.
## Evidence Rules
### 4. Explore
Read [references/issue-taxonomy.md](references/issue-taxonomy.md) for the full list of what to look for and the exploration checklist.
**Strategy -- work through the app systematically:**
- Start from the main navigation. Visit each top-level section.
- Within each section, test interactive elements: click buttons, fill forms, open dropdowns/modals.
- Check edge cases: empty states, error handling, boundary inputs.
- Try realistic end-to-end workflows (create, edit, delete flows).
- Check the browser console for errors periodically.
**At each page:**
```bash
agent-browser --session {SESSION} snapshot -i
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/{page-name}.png
agent-browser --session {SESSION} errors
agent-browser --session {SESSION} console
```
Use your judgment on how deep to go. Spend more time on core features and less on peripheral pages. If you find a cluster of issues in one area, investigate deeper.
### 5. Document Issues (Repro-First)
Steps 4 and 5 happen together -- explore and document in a single pass. When you find an issue, stop exploring and document it immediately before moving on. Do not explore the whole app first and document later.
Every issue must be reproducible. When you find something wrong, do not just note it -- prove it with evidence. The goal is that someone reading the report can see exactly what happened and replay it.
**Choose the right level of evidence for the issue:**
#### Interactive / behavioral issues (functional, ux, console errors on action)
These require user interaction to reproduce -- use full repro with video and step-by-step screenshots:
1. **Start a repro video** _before_ reproducing:
Interactive/behavioral issues need full repro evidence:
```bash
agent-browser --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.webm
```
2. **Walk through the steps at human pace.** Pause 1-2 seconds between actions so the video is watchable. Take a screenshot at each step:
```bash
agent-browser --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png
sleep 1
# Perform action (click, fill, etc.)
sleep 1
agent-browser --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png
sleep 1
# ...continue until the issue manifests
```
3. **Capture the broken state.** Pause so the viewer can see it, then take an annotated screenshot:
```bash
sleep 2
# perform action at human pace
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png
```
4. **Stop the video:**
```bash
agent-browser --session {SESSION} record stop
```
5. Write numbered repro steps in the report, each referencing its screenshot.
#### Static / visible-on-load issues (typos, placeholder text, clipped text, misalignment, console errors on load)
These are visible without interaction -- a single annotated screenshot is sufficient. No video, no multi-step repro:
Static visible-on-load issues need one annotated screenshot and no video:
```bash
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/issue-{NNN}.png
```
Write a brief description and reference the screenshot in the report. Set **Repro Video** to `N/A`.
For every issue, verify it reproduces before collecting evidence, increment `ISSUE-001`, `ISSUE-002`, and write report steps that map to screenshots.
---
## Guardrails
**For all issues:**
- Test as a user. Do not read the target app's source code.
- Never delete output files mid-session.
- Do not close and restart the session unless necessary; work forward.
- Use `snapshot -i` for clickable/fillable elements and `snapshot` for reading content.
- Check console/errors periodically.
- During repro video, use `type` instead of `fill` and add small pauses so the video is watchable.
- Batch independent browser commands when efficient.
1. **Append to the report immediately.** Do not batch issues for later. Write each one as you find it so nothing is lost if the session is interrupted.
## Resources
2. **Increment the issue counter** (ISSUE-001, ISSUE-002, ...).
### 6. Wrap Up
Aim to find **5-10 well-documented issues**, then wrap up. Depth of evidence matters more than total count -- 5 issues with full repro beats 20 with vague descriptions.
After exploring:
1. Re-read the report and update the summary severity counts so they match the actual issues. Every `### ISSUE-` block must be reflected in the totals.
2. Close the session:
```bash
agent-browser --session {SESSION} close
```
3. Tell the user the report is ready and summarize findings: total issues, breakdown by severity, and the most critical items.
## Guidance
- **Repro is everything.** Every issue needs proof -- but match the evidence to the issue. Interactive bugs need video and step-by-step screenshots. Static bugs (typos, placeholder text, visual glitches visible on load) only need a single annotated screenshot.
- **Verify reproducibility before collecting evidence.** Before recording video or taking screenshots, verify the issue is reproducible with at least one retry. If it can't be reproduced consistently, it's not a valid issue.
- **Don't record video for static issues.** A typo or clipped text doesn't benefit from a video. Save video for issues that involve user interaction, timing, or state changes.
- **For interactive issues, screenshot each step.** Capture the before, the action, and the after -- so someone can see the full sequence.
- **Write repro steps that map to screenshots.** Each numbered step in the report should reference its corresponding screenshot. A reader should be able to follow the steps visually without touching a browser.
- **Use the right snapshot command.**
- `snapshot -i` — for finding clickable/fillable elements (buttons, inputs, links)
- `snapshot` (no flag) — for reading page content (text, headings, data lists)
- **Be thorough but use judgment.** You are not following a test script -- you are exploring like a real user would. If something feels off, investigate.
- **Write findings incrementally.** Append each issue to the report as you discover it. If the session is interrupted, findings are preserved. Never batch all issues for the end.
- **Never delete output files.** Do not `rm` screenshots, videos, or the report mid-session. Do not close the session and restart. Work forward, not backward.
- **Never read the target app's source code.** You are testing as a user, not auditing code. Do not read HTML, JS, or config files of the app under test. All findings must come from what you observe in the browser.
- **Check the console.** Many issues are invisible in the UI but show up as JS errors or failed requests.
- **Test like a user, not a robot.** Try common workflows end-to-end. Click things a real user would click. Enter realistic data.
- **Type like a human.** When filling form fields during video recording, use `type` instead of `fill` -- it types character-by-character. Use `fill` only outside of video recording when speed matters.
- **Pace repro videos for humans.** Add `sleep 1` between actions and `sleep 2` before the final result screenshot. Videos should be watchable at 1x speed -- a human reviewing the report needs to see what happened, not a blur of instant state changes.
- **Be efficient with commands.** Batch multiple `agent-browser` commands in a single shell call when they are independent (e.g., `agent-browser ... screenshot ... && agent-browser ... console`). Use `agent-browser --session {SESSION} scroll down 300` for scrolling -- do not use `key` or `evaluate` to scroll.
## References
| Reference | When to Read |
|-----------|--------------|
| [references/issue-taxonomy.md](references/issue-taxonomy.md) | Start of session -- calibrate what to look for, severity levels, exploration checklist |
## Templates
| Template | Purpose |
|----------|---------|
| [templates/dogfood-report-template.md](templates/dogfood-report-template.md) | Copy into output directory as the report file |
- `references/issue-taxonomy.md`: severity levels and exploration checklist.
- `templates/dogfood-report-template.md`: report template.

View File

@@ -5,67 +5,39 @@ description: Guide for transferring repositories from GitHub to self-hosted Gite
# Gitea Transfer
## When To Use
Use when moving a repository from GitHub to self-hosted Gitea, converting a Gitea mirror into the canonical source, migrating active branches/PRs, or porting GitHub Actions to Gitea Actions.
Use this skill when moving a repository from GitHub to self-hosted Gitea, converting a Gitea mirror into the canonical source repo, migrating active branches and pull requests, porting GitHub Actions to Gitea Actions, or designing runner job images for CI.
## Principles
## Operating Principles
- Treat Gitea as canonical only after verifying the repo is no longer a mirror and the default branch exists there.
- Keep GitHub as a recovery/reference remote unless the user explicitly asks to remove it.
- Prefer `tea` or the Gitea API for Gitea operations; do not use `gh` for Gitea work.
- Never paste, log, commit, or store registration tokens, API tokens, deploy keys, or package credentials.
- Do not merge PRs, force-push, or rewrite shared branches unless the user explicitly approves.
- Verify each cutover step with `git`, `tea`, or API output before moving on.
- Treat Gitea as canonical only after confirming the repo is no longer a mirror and the default branch is pushed.
- Keep the GitHub remote as a recovery/reference remote unless the user explicitly asks to remove it.
- Prefer `tea` or the Gitea API for Gitea repository, pull request, Actions, and runner operations.
- Do not use `gh` for Gitea work.
- Do not paste, log, commit, or store registration tokens, API tokens, deploy keys, or package credentials.
- Avoid Gitea Actions secret names beginning with `GITEA_` or `GITHUB_`; those prefixes are reserved or confusing in Gitea/GitHub-compatible runners.
- Do not modify ignored local config files such as `bunfig.toml` unless the user explicitly requests it.
- Do not merge pull requests unless the user explicitly approves the merge.
- Do not force-push or rewrite shared branches unless the user explicitly approves it.
- Verify every cutover step with API or git status output before moving on.
## Preflight Checks
Confirm local state and remotes:
## Preflight
```bash
git status --short --branch
git remote -v
git branch --show-current
```
Confirm the Gitea CLI is authenticated and targeting the expected instance:
```bash
tea login list
tea repo ls
tea api repos/{owner}/{repo}
```
Confirm the Gitea repository exists and inspect settings:
If Gitea sits behind a reverse proxy, ensure dot directories such as `.gitea` and `.github` are not blocked by generic hidden-file rules.
## Mirror To Source
If the Gitea repo was imported as a mirror, convert it to a normal source repo in the UI or API, then verify `mirror` is `false`:
```bash
tea api repos/{owner}/{repo}
```
If Gitea is behind Nginx or another reverse proxy, verify dot directories are reachable. A common Forge/Nginx rule blocks paths such as `.github`, `.gitea`, `.codex`, and `.well-known`:
```nginx
location ~ /\.(?!well-known).* {
deny all;
}
```
For Gitea, that rule can break UI/API access to workflow files and agent config paths. Update the reverse proxy deliberately rather than working around it in git.
## Convert Mirror To Source
If the Gitea repository was imported as a mirror, convert it to a normal source repository before cutover. Use the Gitea UI or API, then verify:
```bash
tea api repos/{owner}/{repo}
```
Check that `mirror` is `false` and the expected default branch is set.
Enable repository features as needed:
Enable repository features as needed, using fields supported by the installed Gitea version:
```bash
tea api --method PATCH repos/{owner}/{repo} \
@@ -73,505 +45,67 @@ tea api --method PATCH repos/{owner}/{repo} \
--field has_actions=true
```
Use the actual supported fields for the installed Gitea version. Re-read the repository after patching.
## Remote Cutover
Keep GitHub as a named fallback remote and make Gitea `origin`:
Keep GitHub as fallback and make Gitea `origin`:
```bash
git remote rename origin github
git remote add origin ssh://git@git.example.com:30009/{owner}/{repo}.git
git remote -v
```
If `origin` already points elsewhere, adjust non-destructively:
```bash
git remote set-url origin ssh://git@git.example.com:30009/{owner}/{repo}.git
git remote add github git@github.com:{owner-or-org}/{repo}.git
```
Push the default branch first:
```bash
git fetch github --prune
git push origin {default-branch}:{default-branch}
```
Verify Gitea sees the branch:
```bash
tea api repos/{owner}/{repo}/branches/{default-branch}
```
## Branch Migration
If `origin` already exists, adjust non-destructively with `git remote set-url origin ...` and add `github` only if absent.
List active GitHub branches that need to move:
## Branches And PRs
- Push only active work branches, not every stale remote branch by default.
- For each open GitHub PR, recreate enough context in Gitea: title, body, base, head, draft/WIP state, and original GitHub PR link.
- Verify each recreated PR is open, points at the expected branches, and has the expected conflict state.
```bash
git branch -r
```
Push only active work branches, not every stale remote branch by default:
```bash
git push origin github/{branch-name}:refs/heads/{branch-name}
```
For multiple open PR heads, script carefully from reviewed data. Avoid pushing deleted/stale branches blindly.
Verify pushed branches:
```bash
tea api repos/{owner}/{repo}/branches
```
## Pull Request Migration
Gitea does not automatically migrate all GitHub PR review history during a basic repo transfer. Recreate open PRs with enough context to continue work:
- title
- body
- base branch
- head branch
- draft/WIP state in title or body if Gitea draft support is not available
- link back to the original GitHub PR
Use the Gitea API or `tea`:
```bash
tea pr create \
--repo {owner}/{repo} \
--base {base-branch} \
--head {head-branch} \
--title "{title}" \
--description "{body}"
```
After creating each PR, verify it is open, conflict status is correct, and the branch is correct:
```bash
tea pr create --repo {owner}/{repo} --base {base} --head {head} --title "{title}" --description "{body}"
tea pr {index} --repo {owner}/{repo}
```
If comments/reviews matter, add a short migration note to the PR body linking to the original GitHub PR rather than attempting to recreate every review event.
## Actions Porting
## CI Workflow Porting
Gitea Actions workflows live in `.gitea/workflows/`. Port incrementally: tests first, then build/deploy. Keep syntax close to GitHub Actions where Gitea supports it, validate YAML, and use small commits so failures isolate one change.
Gitea Actions workflows live in `.gitea/workflows/`. GitHub workflows can remain temporarily in `.github/workflows/` during transition, but do not assume they run on Gitea.
Common changes:
Port workflows incrementally:
- Replace GitHub-only secrets with neutral Gitea secrets such as `REGISTRY_USERNAME`, `REGISTRY_TOKEN`, `REVIEW_BOT_TOKEN`, and `OPENCODE_GO_TOKEN`.
- Avoid secret names beginning with `GITEA_` or `GITHUB_`.
- Remove GitHub Packages auth unless it is still intentionally used.
- Use service hostnames such as `postgres`, not `127.0.0.1`, for service containers.
- Add explicit service readiness checks.
- Do not rely on `permissions:` for security-sensitive sandboxing; some Gitea versions ignore it.
- Start with tests.
- Then add build/deploy workflows.
- Keep workflow syntax close to GitHub Actions where Gitea supports it.
- Validate YAML before pushing.
- Prefer small commits so each Actions failure identifies one change.
Example validation:
```bash
ruby -e "require 'yaml'; YAML.load_file('.gitea/workflows/tests.yml'); puts 'ok'"
git diff --check
```
Common changes when porting:
- Replace GitHub-only secrets with Gitea Actions secrets.
- Use neutral secret names such as `REGISTRY_USERNAME`, `REGISTRY_TOKEN`, `REVIEW_BOT_TOKEN`, and `OPENCODE_GO_TOKEN`.
- Remove GitHub Packages auth if the private package dependency is removed or mirrored elsewhere.
- Replace GitHub-hosted assumptions with Gitea runner/job container assumptions.
- Ensure `actions/checkout` version works with the installed Gitea Actions runner.
- Use service hostnames for service containers instead of `127.0.0.1`.
- Add explicit service readiness checks for databases.
- Do not rely on `permissions:` for security-sensitive sandboxing in Gitea Actions; some Gitea versions ignore it.
For Postgres services, use the service name as the host:
```yaml
services:
postgres:
image: postgres:16
env:
POSTGRES_DB: app_test
POSTGRES_USER: postgres
POSTGRES_PASSWORD: password
env:
DB_HOST: postgres
```
Add a readiness loop before running tests:
```bash
until PGPASSWORD=password pg_isready -h postgres -U postgres; do
sleep 1
done
```
See [runner images](references/runner-images.md), [OpenCode PR review](references/opencode-review.md), and [deploy porting](references/deploy-porting.md) for deeper CI guidance.
## Package Registry Cutover
Identify GitHub-specific package auth before enabling Gitea CI:
Identify GitHub-specific package auth before enabling Gitea CI: npm/Bun scopes, Composer auth, workflow-written `.npmrc`, private package tarballs in lockfiles, and deploy/package secrets.
- npm/Bun scopes using `npm.pkg.github.com`
- Composer auth for GitHub-hosted private packages
- workflow steps writing `.npmrc` or package auth
- lockfiles referencing private GitHub package tarballs
Options:
- Remove the dependency if it can be replaced locally.
- Mirror the package to Gitea Packages.
- Keep GitHub Packages temporarily and configure Gitea secrets deliberately.
Do not leave workflows requiring secrets that no longer exist. If deploy secrets are absent during cutover, prefer a clear skip for deploy-only workflows so the default branch is not red while secrets are being configured.
## Runner Image Strategy
Prefer job containers over custom runner labels for language/runtime selection.
Use shared CI job images when several repositories need the same runtime stack. The current shared image family is:
- `git.bayliss.cloud/harry/gitea-ci-runner:php8.2`
- `git.bayliss.cloud/harry/gitea-ci-runner:php8.3`
- `git.bayliss.cloud/harry/gitea-ci-runner:php8.4`
- `git.bayliss.cloud/harry/gitea-ci-runner:php8.5`
- `git.bayliss.cloud/harry/gitea-ci-runner:latest`
`latest` points to PHP `8.5`. Repositories should usually hardcode the PHP tag they require instead of using `latest`, so runtime upgrades are explicit per repository.
Good pattern:
```yaml
jobs:
tests:
runs-on: ubuntu-latest
container:
image: git.bayliss.cloud/harry/gitea-ci-runner:php8.2
```
Avoid creating labels such as `stratbase-php82` just to pick an image. Labels select eligible runners; job containers select the runtime image. This keeps the runner reusable across repositories.
Use an instance-level or organization-level runner with generic labels such as:
- `ubuntu-latest`
- `ubuntu-24.04`
- `ubuntu-22.04`
If runner labels change, the runner usually needs re-registration because `.runner` pins registration details. App UI fields called "Labels Configuration" may refer to Docker metadata, not Gitea runner labels. Verify actual labels through the API:
```bash
tea api admin/actions/runners
```
## Building A CI Job Image
Create or update a shared image when workflows need system dependencies that are expensive or brittle to install per run. Prefer a central image repository over per-application image repositories unless the runtime is truly application-specific.
Current shared image repository:
```text
ssh://git@git.bayliss.cloud:30009/harry/gitea-ci-runner.git
https://git.bayliss.cloud/harry/gitea-ci-runner
```
Current shared PHP/Laravel CI image contents:
- base Gitea runner job image
- PHP CLI for `8.2`, `8.3`, `8.4`, and `8.5`, selected with `ARG PHP_VERSION`
- Composer 2
- Bun
- Node.js, `npm`, and `npx`
- Go
- Python 3, `pip`, and `venv`
- `jq`
- database and cache client tools such as `postgresql-client` and `redis-tools`
- common PHP extensions including `rdkafka`, `redis`, `pcov`, `xdebug`, `imagick`, `imap`, `pgsql`, `mysql`, `sqlite3`, `gmp`, and `gd`
Do not add MongoDB to the shared image unless a repository has a real runtime or CI need for it.
Shared Dockerfile pattern:
```dockerfile
FROM docker.gitea.com/runner-images:ubuntu-latest
ARG PHP_VERSION=8.5
ENV DEBIAN_FRONTEND=noninteractive
ENV BUN_INSTALL=/root/.bun
ENV PATH="/root/.bun/bin:${PATH}"
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ca-certificates \
curl \
git \
golang-go \
gnupg \
jq \
lsb-release \
postgresql-client \
python3 \
python3-pip \
python3-venv \
redis-tools \
software-properties-common \
unzip \
&& install -d -m 0755 /etc/apt/keyrings \
&& curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg \
&& echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_22.x nodistro main" > /etc/apt/sources.list.d/nodesource.list \
&& add-apt-repository ppa:ondrej/php -y \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
nodejs \
php${PHP_VERSION} \
php${PHP_VERSION}-bcmath \
php${PHP_VERSION}-curl \
php${PHP_VERSION}-gd \
php${PHP_VERSION}-imagick \
php${PHP_VERSION}-imap \
php${PHP_VERSION}-intl \
php${PHP_VERSION}-mbstring \
php${PHP_VERSION}-mysql \
php${PHP_VERSION}-pcov \
php${PHP_VERSION}-pgsql \
php${PHP_VERSION}-redis \
php${PHP_VERSION}-rdkafka \
php${PHP_VERSION}-sqlite3 \
php${PHP_VERSION}-xdebug \
php${PHP_VERSION}-xml \
php${PHP_VERSION}-zip \
&& update-alternatives --set php /usr/bin/php${PHP_VERSION} \
&& curl -fsSL https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer \
&& curl -fsSL https://bun.sh/install | bash \
&& ln -sf /root/.bun/bin/bun /usr/local/bin/bun \
&& ln -sf /root/.bun/bin/bunx /usr/local/bin/bunx \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN php -v \
&& php -m \
&& composer --version \
&& go version \
&& bun --version \
&& node --version \
&& python3 --version \
&& pg_isready --version
```
Build and push manually only when necessary:
```bash
docker build --build-arg PHP_VERSION=8.2 -t git.bayliss.cloud/harry/gitea-ci-runner:php8.2 -f docker/Dockerfile .
docker push git.bayliss.cloud/harry/gitea-ci-runner:php8.2
```
Prefer the image repository workflow for normal updates. Use a matrix over the supported PHP versions and push `latest` only from PHP `8.5`:
```yaml
env:
REGISTRY: git.bayliss.cloud
IMAGE_NAME: harry/gitea-ci-runner
jobs:
build:
strategy:
matrix:
php-version:
- '8.2'
- '8.3'
- '8.4'
- '8.5'
steps:
- uses: actions/checkout@v4
- name: Check registry credentials
env:
REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
run: |
if [ -z "$REGISTRY_USERNAME" ] || [ -z "$REGISTRY_TOKEN" ]; then
echo "Skipping image publish because REGISTRY_USERNAME or REGISTRY_TOKEN is missing."
echo "SKIP_IMAGE_PUBLISH=true" >> "$GITHUB_ENV"
fi
- name: Login to Gitea registry
if: env.SKIP_IMAGE_PUBLISH != 'true'
env:
REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
run: echo "$REGISTRY_TOKEN" | docker login "$REGISTRY" --username "$REGISTRY_USERNAME" --password-stdin
- name: Build and push version tag
if: env.SKIP_IMAGE_PUBLISH != 'true'
run: |
docker build \
--build-arg PHP_VERSION="${{ matrix.php-version }}" \
--tag "$REGISTRY/$IMAGE_NAME:php${{ matrix.php-version }}" \
--file docker/Dockerfile \
.
docker push "$REGISTRY/$IMAGE_NAME:php${{ matrix.php-version }}"
- name: Push latest tag
if: env.SKIP_IMAGE_PUBLISH != 'true' && matrix.php-version == '8.5'
run: |
docker tag "$REGISTRY/$IMAGE_NAME:php${{ matrix.php-version }}" "$REGISTRY/$IMAGE_NAME:latest"
docker push "$REGISTRY/$IMAGE_NAME:latest"
```
If the runner host exposes the Gitea registry locally, prefer the runner-local endpoint in workflow job containers for faster pulls:
```yaml
container:
image: localhost:30008/harry/gitea-ci-runner:php8.2
```
Do not include `http://` in Docker image references. Configure insecure/local registry trust at the Docker daemon level if needed.
## OpenCode PR Review Workflow
Use this pattern when adding a repo-scoped OpenCode reviewer to Gitea Actions.
Core behavior:
- Trigger only from PR comments containing `/review`.
- Do not auto-review on PR open or synchronize events for the first version.
- Run OpenCode read-only.
- Post or update one aggregate PR comment instead of line comments.
- Use the PR head checkout for repository context, but keep checkout shallow.
- Do not expose Gitea API tokens to OpenCode.
Required secrets:
- `REVIEW_BOT_TOKEN`: Gitea token for reading the repository/PR and writing issue comments.
- `OPENCODE_GO_TOKEN`: OpenCode Go API token.
Recommended token permissions for `REVIEW_BOT_TOKEN`:
- `read:repository`
- `read:issue`
- `write:issue`
Recommended workflow shape:
```yaml
on:
issue_comment:
types:
- created
workflow_dispatch:
jobs:
review:
runs-on: ubuntu-latest
container:
image: git.bayliss.cloud/harry/gitea-ci-runner:php8.2
```
In a preparation step:
- Read `$GITHUB_EVENT_PATH` with `jq`.
- Skip unless the event action is `created`.
- Skip unless the issue is a pull request.
- Skip unless the comment contains `/review`.
- Fetch pull request metadata from `GET /repos/{owner}/{repo}/pulls/{number}`.
- Fetch the diff from `GET /repos/{owner}/{repo}/pulls/{number}.diff`.
- Export `PR_NUMBER`, `REPO`, `BASE_BRANCH`, `HEAD_BRANCH`, and `HEAD_SHA` to `$GITHUB_ENV`.
Checkout the PR head after preparation:
```yaml
- name: Checkout PR head
if: env.SKIP_OPENCODE_REVIEW != 'true'
uses: actions/checkout@v4
with:
ref: ${{ env.HEAD_SHA }}
fetch-depth: 1
persist-credentials: false
```
Avoid `fetch-depth: 0` unless full history is required; full-history checkout can make review runs slow on Gitea runners.
Generate OpenCode auth at runtime from `OPENCODE_GO_TOKEN`:
```bash
mkdir -p "$HOME/.local/share/opencode"
jq -n --arg token "$OPENCODE_GO_TOKEN" '{"opencode-go": {"type": "api", "key": $token}}' > "$HOME/.local/share/opencode/auth.json"
```
Disable mutation tools and unset repository tokens before invoking OpenCode:
```bash
export OPENCODE_CONFIG_CONTENT='{"tools":{"write":false,"edit":false,"bash":false},"permission":{"edit":"deny","bash":"deny"},"share":"disabled","autoupdate":false}'
unset OPENCODE_GO_TOKEN REVIEW_BOT_TOKEN GITEA_TOKEN GITHUB_TOKEN
opencode run \
--model "${REVIEW_MODEL:-opencode-go/glm-5.1}" \
--file /tmp/opencode-pr.diff \
--title "PR #$PR_NUMBER review" \
"Review this pull request diff for bugs, security issues, behavior regressions, and missing tests. Use the checked-out PR head tree for repository context. Focus on actionable findings. If there are no findings, say so clearly. Do not suggest style-only changes. Output Markdown with concise sections." \
> /tmp/opencode-review.md
```
Use a stable marker in the review comment body so later `/review` runs update the same comment:
```text
<!-- opencode-review -->
```
Gitea Actions logs can be awkward to fetch on versions before `1.26`; if `tea actions runs logs` is unavailable or incomplete, inspect run/task state through the Gitea API.
## Deploy Workflow Porting
Deploy workflows often need secrets and remote side effects. Port them after tests pass.
Guard required secrets explicitly:
```bash
missing=()
for required_var in SPARK_EMAIL SPARK_KEY TIPTAP_PRO_KEY DO_SPACES_KEY DO_SPACES_SECRET DO_SPACES_ENDPOINT DO_SPACES_REGION; do
if [ -z "${!required_var}" ]; then
missing+=("$required_var")
fi
done
if [ "${#missing[@]}" -gt 0 ]; then
echo "Skipping deploy because required secrets are missing: ${missing[*]}"
echo "SKIP_DEPLOY=true" >> "$GITHUB_ENV"
fi
```
Gate deploy steps:
```yaml
if: env.SKIP_DEPLOY != 'true'
```
This keeps the default branch green while still making the missing configuration obvious in logs. File follow-up work for the missing secrets.
Do not manually dispatch deploy workflows that upload assets or mutate production unless the user explicitly approves it.
Choose deliberately: remove dependency, mirror package to Gitea Packages, or keep GitHub Packages temporarily with explicit Gitea secrets. Do not leave workflows requiring missing secrets.
## Verification Checklist
Before declaring cutover complete:
- Gitea repo is not a mirror.
- `origin` points to Gitea.
- `github` remote exists for recovery/reference if desired.
- Default branch is pushed to Gitea.
- Active PR branches are pushed to Gitea.
- Open PRs are recreated in Gitea.
- Repository PRs and Actions are enabled.
- Test workflow passes on PR.
- Test workflow passes on default branch after merge.
- Deploy workflow either succeeds or skips cleanly with a clear missing-secrets message.
- Optional `/review` workflow posts or updates the aggregate OpenCode review comment.
- Runner is online with generic labels.
- Job containers pull the expected image.
- Local branch is synchronized with Gitea:
- `origin` points to Gitea; `github` remote exists if retained.
- Default branch and active PR branches are pushed.
- Open PRs are recreated.
- Pull requests and Actions are enabled.
- Test workflow passes on PR and default branch.
- Deploy workflow succeeds or skips cleanly with a clear missing-secrets message.
- Runner is online with generic labels and job containers pull expected images.
- Optional `/review` workflow posts or updates the aggregate review comment.
- Local default branch is synchronized with Gitea.
```bash
git pull --rebase origin {default-branch}
@@ -580,72 +114,6 @@ git rev-list --left-right --count origin/{default-branch}...{default-branch}
git status --short --branch
```
Expected commit comparison after sync:
Expected commit comparison after sync: `0 0`.
```text
0 0
```
## Common Failure Modes
### Dot Paths 403 In Gitea UI
Cause: reverse proxy blocks dot directories.
Fix: adjust the proxy rule for Gitea. Do not rename workflow directories.
### Postgres Connection Refused
Cause: CI app connects to `127.0.0.1` while Postgres runs as a service container.
Fix: use service hostname such as `postgres` and add readiness checks.
### Runner Job Never Starts
Cause: workflow `runs-on` label does not match registered runner labels.
Fix: use generic labels and verify with `tea api admin/actions/runners`.
### Image Pull Fails
Cause: wrong registry hostname, missing Docker daemon trust, or private package auth issue.
Fix: verify image reference by pulling on the runner host, then use the same host/port in the workflow.
### Deploy Fails For Missing Secrets
Cause: Gitea Actions secrets not configured yet.
Fix: add a preflight skip or configure secrets. File follow-up work instead of leaving the default branch red.
### OpenCode Review Has No Context
Cause: the workflow passed only a diff to OpenCode and did not checkout the PR head tree.
Fix: fetch the PR diff through the Gitea API, then shallow-checkout the PR head SHA with `fetch-depth: 1` and `persist-credentials: false`.
### OpenCode Review Checkout Is Slow
Cause: `actions/checkout` fetched full history with `fetch-depth: 0`.
Fix: checkout the PR head SHA with `fetch-depth: 1` unless full git history is required.
### Branch Tracks Old GitHub Remote
Cause: local branch upstream still points to `github/{branch}` after remote cutover.
Fix only if requested or useful:
```bash
git branch --set-upstream-to=origin/{branch} {branch}
```
## Follow-Up Issues
File explicit follow-up issues for work intentionally deferred during cutover, such as:
- configuring deploy secrets in Gitea
- moving remaining Composer VCS repositories from GitHub to Gitea
- removing obsolete GitHub workflows after Gitea Actions are trusted
- deleting or archiving the GitHub repository after a retention period
- rotating any tokens that were exposed during setup
See [failure modes](references/failure-modes.md) when cutover or CI fails.

View File

@@ -0,0 +1,28 @@
# Deploy Workflow Porting
Port deploy workflows after tests pass. Deploys often need secrets and remote side effects.
Guard required secrets explicitly and skip deploy-only steps when secrets are absent so the default branch remains green while configuration is pending.
```bash
missing=()
for required_var in SPARK_EMAIL SPARK_KEY TIPTAP_PRO_KEY DO_SPACES_KEY DO_SPACES_SECRET DO_SPACES_ENDPOINT DO_SPACES_REGION; do
if [ -z "${!required_var}" ]; then
missing+=("$required_var")
fi
done
if [ "${#missing[@]}" -gt 0 ]; then
echo "Skipping deploy because required secrets are missing: ${missing[*]}"
echo "SKIP_DEPLOY=true" >> "$GITHUB_ENV"
fi
```
Gate deploy steps:
```yaml
if: env.SKIP_DEPLOY != 'true'
```
File follow-up work for missing secrets. Do not manually dispatch deploy workflows that upload assets or mutate production unless the user explicitly approves.

View File

@@ -0,0 +1,47 @@
# Common Gitea Transfer Failure Modes
## Dot Paths 403 In Gitea UI
Cause: reverse proxy blocks dot directories.
Fix: adjust the proxy rule for Gitea. Do not rename workflow directories.
## Postgres Connection Refused
Cause: CI app connects to `127.0.0.1` while Postgres runs as a service container.
Fix: use service hostname such as `postgres` and add readiness checks.
## Runner Job Never Starts
Cause: workflow `runs-on` label does not match registered runner labels.
Fix: use generic labels and verify with `tea api admin/actions/runners`.
## Image Pull Fails
Cause: wrong registry hostname, missing Docker daemon trust, or private package auth issue.
Fix: verify image reference by pulling on the runner host, then use the same host/port in the workflow.
## Deploy Fails For Missing Secrets
Cause: Gitea Actions secrets are not configured yet.
Fix: add a preflight skip or configure secrets. File follow-up work instead of leaving the default branch red.
## OpenCode Review Has No Context
Cause: workflow passed only a diff and did not checkout the PR head tree.
Fix: fetch the PR diff through the Gitea API, then shallow-checkout the PR head SHA with `fetch-depth: 1` and `persist-credentials: false`.
## Branch Tracks Old GitHub Remote
Cause: local branch upstream still points to `github/{branch}` after remote cutover.
Fix only if requested or useful:
```bash
git branch --set-upstream-to=origin/{branch} {branch}
```

View File

@@ -0,0 +1,41 @@
# OpenCode PR Review Workflow
Use this pattern when adding a repo-scoped OpenCode reviewer to Gitea Actions.
Core behavior:
- Trigger only from PR comments containing `/review` and optional `workflow_dispatch`.
- Do not auto-review on PR open or synchronize for the first version.
- Run OpenCode read-only.
- Post or update one aggregate PR comment using a stable marker such as `<!-- opencode-review -->`.
- Checkout the PR head tree shallowly for repository context.
- Do not expose Gitea API tokens to OpenCode.
Required secrets:
- `REVIEW_BOT_TOKEN`: Gitea token with `read:repository`, `read:issue`, and `write:issue`.
- `OPENCODE_GO_TOKEN`: OpenCode Go API token.
Preparation step should:
- Read `$GITHUB_EVENT_PATH` with `jq`.
- Skip unless action is `created`, issue is a PR, and comment contains `/review`.
- Fetch PR metadata from `GET /repos/{owner}/{repo}/pulls/{number}`.
- Fetch diff from `GET /repos/{owner}/{repo}/pulls/{number}.diff`.
- Export `PR_NUMBER`, `REPO`, `BASE_BRANCH`, `HEAD_BRANCH`, and `HEAD_SHA`.
Checkout pattern:
```yaml
- uses: actions/checkout@v4
with:
ref: ${{ env.HEAD_SHA }}
fetch-depth: 1
persist-credentials: false
```
Avoid `fetch-depth: 0` unless full history is required.
Before invoking OpenCode, generate auth from `OPENCODE_GO_TOKEN`, disable mutation tools, and unset repository tokens from the environment.
Gitea Actions logs can be awkward before Gitea 1.26; if `tea actions runs logs` is unavailable or incomplete, inspect run/task state through the Gitea API.

View File

@@ -0,0 +1,47 @@
# Gitea Runner Images
Prefer job containers over custom runner labels for language/runtime selection. Labels select eligible runners; containers select runtime images.
Use generic runner labels such as `ubuntu-latest`, `ubuntu-24.04`, and `ubuntu-22.04`. Verify actual runner labels with:
```bash
tea api admin/actions/runners
```
Shared PHP/Laravel image family:
```text
git.bayliss.cloud/harry/gitea-ci-runner:php8.2
git.bayliss.cloud/harry/gitea-ci-runner:php8.3
git.bayliss.cloud/harry/gitea-ci-runner:php8.4
git.bayliss.cloud/harry/gitea-ci-runner:php8.5
git.bayliss.cloud/harry/gitea-ci-runner:latest
```
`latest` points to PHP 8.5. Repositories should usually pin the PHP tag they require.
```yaml
jobs:
tests:
runs-on: ubuntu-latest
container:
image: git.bayliss.cloud/harry/gitea-ci-runner:php8.2
```
Shared image repository:
```text
ssh://git@git.bayliss.cloud:30009/harry/gitea-ci-runner.git
https://git.bayliss.cloud/harry/gitea-ci-runner
```
Current shared image contents include PHP CLI, Composer 2, Bun, Node.js/npm/npx, Go, Python/pip/venv, `jq`, database/cache clients, and common PHP extensions. Do not add MongoDB unless a repository has a real CI/runtime need.
If the runner host exposes the registry locally, job containers may use the runner-local endpoint for faster pulls:
```yaml
container:
image: localhost:30008/harry/gitea-ci-runner:php8.2
```
Do not include `http://` in Docker image references. Configure insecure/local registry trust at the Docker daemon level if needed.

View File

@@ -1,138 +1,62 @@
---
name: obsidian
description: For registering personal notes and task tracking for Popdog.
description: Manage Harry's Obsidian vault notes and Popdog task tracking with the Obsidian CLI. Use when registering tasks, reading or updating Popdog notes, rolling completed work into daily roundup notes, or searching the vault.
---
# Obsidian
I have registered the `obsidian` cli. I'm using it for task tracking for Popdog.
Use the registered `obsidian` CLI for vault operations. The important working area is `Work/Popdog`, with `Work/Popdog/_Todo.md` as the outstanding-work index.
The relevant information you'll find is in Work/Popdog. There's an overall _Todo.md meta file that lists all my outstanding work.
Obsidian CLI requires the desktop app to be running; if it is not running, the first command may launch it. Target this vault explicitly if needed with `vault=<name>` as the first parameter.
Once items are verified complete they should be rolled up into the daily roundup files named YYYY-MM-DD.md.
## Popdog Task Rules
Format tasks as follows:
Use this format in `Work/Popdog/_Todo.md`:
```
```markdown
- [ ] ==**High**== Task Title
- [ ] Sub task/portion of task
- [ ] **Medium** Task Title
- [ ] Sub tas/portion of task
- [ ] Sub task/portion of task
```
Note that only High priority receives highlighting == ==.
Only High priority receives `== ==` highlighting.
## Task Completion
When a task is verified complete:
When tasks are completed, mark them as such by checking the box next to the task.
- Mark it complete in `_Todo.md`.
- Roll it into the daily roundup note named `YYYY-MM-DD.md`.
- Include the PR and repo next to completed work when known.
Also roll up the completed tasks into the daily roundup files.
```markdown
- [x] Task Title (PR# - Catwalk)
- [x] Task Title (PR# - Harvester)
```
We should
## Agent Workflow
1. Read the relevant note before changing it: `obsidian read path="Work/Popdog/_Todo.md"`.
2. Use `obsidian tasks path="Work/Popdog/_Todo.md" verbose` when you need line numbers for task updates.
3. Prefer CLI mutations (`task`, `append`, `prepend`, `create`) over manual file editing when they fit.
4. Use exact `path=` values for Popdog files; use `file=` only when the name is unambiguous.
5. After updates, re-read the touched note to verify the result.
# Obsidian CLI Docs
## Common Commands
Run a command
Run an individual command without opening the TUI:
```bash
obsidian read path="Work/Popdog/_Todo.md"
obsidian tasks path="Work/Popdog/_Todo.md" todo verbose
obsidian task path="Work/Popdog/_Todo.md" line=12 done
obsidian append path="Work/Popdog/2026-05-11.md" content="- [x] Task Title (PR# - Repo)"
obsidian create path="Work/Popdog/2026-05-11.md" content="# 2026-05-11" open
obsidian search query="billing" path="Work/Popdog"
```
## Run the help command
obsidian help
Use the terminal interface
Use the TUI by entering obsidian. Subsequent commands can be entered without obsidian.
## CLI Notes
## Open the TUI, then run help
obsidian
help
The TUI supports autocomplete, command history, and reverse search. Use Ctrl+R to search your command history. See Keyboard shortcuts for all available shortcuts.
- Parameters use `key=value`; quote values with spaces.
- Flags are bare words, for example `open`, `overwrite`, `todo`, `done`, `verbose`.
- Multiline content uses `\n` and tabs use `\t`.
- Add `--copy` to copy command output to the clipboard when useful.
Files and folders
file
Show file info (default: active file).
file=<name> # file name
path=<path> # file path
Example:
path Notes/Recipe.md
name Recipe
extension md
size 1024
created 1700000000000
modified 1700001000000
files
List files in the vault.
folder=<path> # filter by folder
ext=<extension> # filter by extension
total # return file count
folder
Show folder info.
path=<path> # (required) folder path
info=files|folders|size # return specific info only
folders
List folders in the vault.
folder=<path> # filter by parent folder
total # return folder count
open
Open a file.
file=<name> # file name
path=<path> # file path
newtab # open in new tab
create
Create or overwrite a file.
name=<name> # file name
path=<path> # file path
content=<text> # initial content
template=<name> # template to use
overwrite # overwrite if file exists
open # open file after creating
newtab # open in new tab
read
Read file contents (default: active file).
file=<name> # file name
path=<path> # file path
append
Append content to a file (default: active file).
file=<name> # file name
path=<path> # file path
content=<text> # (required) content to append
inline # append without newline
prepend
Prepend content after frontmatter (default: active file).
file=<name> # file name
path=<path> # file path
content=<text> # (required) content to prepend
inline # prepend without newline
move
Move or rename a file (default: active file). This will automatically update internal links if turned on in your vault settings.
file=<name> # file name
path=<path> # file path
to=<path> # (required) destination folder or path
rename
Rename a file (default: active file). The file extension is preserved automatically if omitted from the new name. Use move to rename and move a file at the same time. This will automatically update internal links if turned on in your vault settings.
file=<name> # file name
path=<path> # file path
name=<name> # (required) new file name
delete
Delete a file (default: active file, trash by default).
file=<name> # file name
path=<path> # file path
permanent # skip trash, delete permanently
See [the CLI reference](references/cli-reference.md) for agent-oriented command coverage from the official Obsidian CLI docs.

View File

@@ -0,0 +1,127 @@
# Obsidian CLI Reference
This is a compact agent-facing reference derived from the official Obsidian CLI docs. Use it when `SKILL.md` does not include enough command detail.
## Invocation Rules
- Run one command as `obsidian <command> key=value flag`.
- Use `vault=<name>` or `vault=<id>` as the first parameter to target a specific vault.
- If the current working directory is a vault, that vault is used; otherwise the active Obsidian vault is used.
- Use `path=<vault-relative/path.md>` for exact paths and `file=<name>` for wikilink-style name resolution.
- Use `\n` for multiline content and quote values containing spaces.
- Use `--copy` with any command to copy output to the clipboard.
## Discovery
```bash
obsidian help
obsidian version
obsidian vault
obsidian vaults verbose
obsidian files folder="Work/Popdog"
obsidian folders folder="Work"
obsidian search query="text" path="Work/Popdog"
obsidian search:context query="text" path="Work/Popdog"
```
## Files
```bash
obsidian read path="Work/Popdog/_Todo.md"
obsidian open path="Work/Popdog/_Todo.md" newtab
obsidian create path="Work/Popdog/New Note.md" content="# Title" open
obsidian append path="Work/Popdog/_Todo.md" content="- [ ] Task"
obsidian prepend path="Work/Popdog/_Todo.md" content="Intro"
obsidian move path="Old.md" to="Archive/Old.md"
obsidian rename path="Old.md" name="New"
obsidian delete path="Scratch.md"
```
Use `overwrite` with `create` only when intentionally replacing an existing note.
## Daily Notes
```bash
obsidian daily
obsidian daily:path
obsidian daily:read
obsidian daily:append content="- [x] Finished task (PR# - Repo)"
obsidian daily:prepend content="# Roundup"
```
Use daily commands for the active daily-note configuration. For Popdog roundup files with a known path, exact `path=` file commands are often safer.
## Tasks
```bash
obsidian tasks
obsidian tasks todo
obsidian tasks done
obsidian tasks path="Work/Popdog/_Todo.md" todo verbose
obsidian tasks daily total
obsidian task ref="Work/Popdog/_Todo.md:12" done
obsidian task path="Work/Popdog/_Todo.md" line=12 toggle
obsidian task path="Work/Popdog/_Todo.md" line=12 todo
obsidian task path="Work/Popdog/_Todo.md" line=12 status="-"
```
Use `verbose` when you need file paths and line numbers before updating tasks.
## Metadata And Organization
```bash
obsidian tags counts
obsidian tag name="#project" verbose
obsidian backlinks path="Work/Popdog/_Todo.md" counts
obsidian links path="Work/Popdog/_Todo.md"
obsidian unresolved counts verbose
obsidian outline path="Work/Popdog/_Todo.md" format=md
obsidian properties path="Work/Popdog/_Todo.md" format=yaml
obsidian property:set path="Note.md" name=status value=active type=text
obsidian property:read path="Note.md" name=status
obsidian property:remove path="Note.md" name=status
```
## Templates
```bash
obsidian templates
obsidian template:read name="Daily" resolve title="2026-05-11"
obsidian create path="Work/Popdog/2026-05-11.md" template="Daily" open
```
## History And Recovery
```bash
obsidian diff path="Work/Popdog/_Todo.md"
obsidian diff path="Work/Popdog/_Todo.md" from=2 to=1
obsidian history path="Work/Popdog/_Todo.md"
obsidian history:read path="Work/Popdog/_Todo.md" version=1
obsidian history:restore path="Work/Popdog/_Todo.md" version=1
obsidian sync:history path="Work/Popdog/_Todo.md" total
obsidian sync:read path="Work/Popdog/_Todo.md" version=1
```
Prefer reading or diffing history before restoring anything.
## Developer And UI Commands
```bash
obsidian commands filter="app:"
obsidian command id="command-id"
obsidian devtools
obsidian dev:errors
obsidian dev:console limit=100 level=error
obsidian dev:screenshot path="screenshot.png"
obsidian dev:dom selector=".workspace" text
obsidian eval code="app.vault.getFiles().length"
```
Use developer commands only when working on Obsidian plugins/themes or diagnosing the running app.
## Troubleshooting
- Obsidian CLI requires Obsidian 1.12.7+ and the CLI setting enabled in Settings -> General.
- Obsidian must be running; if not, the first command may launch it.
- On Linux, the CLI binary is normally copied to `~/.local/bin/obsidian`; ensure `~/.local/bin` is on `PATH`.
- If commands fail after registration, restart the terminal and verify `obsidian version`.

View File

@@ -7,369 +7,69 @@ user-invocable: true
# tldraw File
Use this skill when the user asks to create, repair, validate, or generate a tldraw document/file.
Examples:
- "Make me a tldraw diagram"
- "Create a `.tldr` file"
- "tldraw says this file is corrupted"
- "Generate a diagram I can open in tldraw"
Use when creating, repairing, validating, or generating a `.tldr` file.
## Core Rule
Do not hand-wave validity. A `.tldr` file must be validated with tldraw's own schema/importer, not just `JSON.parse`.
A file can be valid JSON and still be corrupted to tldraw.
Do not hand-wave validity. A `.tldr` file must validate through tldraw's own schema/importer, not just `JSON.parse`. Valid JSON can still be corrupted to tldraw.
## Temporary Setup
If the current project does not already depend on `tldraw`, install it in `/tmp/opencode`, not in the user's repo.
If the current project does not already depend on `tldraw`, install it in `/tmp/opencode`, not in the user's repo:
```bash
# Work outside the repo
bun init -y
bun add tldraw
```
Use `/tmp/opencode` as the working directory for validation scripts.
Run validation scripts from the directory where `tldraw` is installed.
## `.tldr` File Shape
## File Shape
A current `.tldr` JSON file has this top-level shape:
A `.tldr` file has top-level `tldrawFileFormatVersion`, `schema`, and `records`. Records must be an array of valid tldraw records. Include at least a document record and a page record, then shapes parented to that page.
```json
{
"tldrawFileFormatVersion": 1,
"schema": {
"schemaVersion": 2,
"sequences": {
"com.tldraw.store": 5,
"com.tldraw.asset": 1,
"com.tldraw.camera": 1,
"com.tldraw.document": 2,
"com.tldraw.instance": 26,
"com.tldraw.instance_page_state": 5,
"com.tldraw.page": 1,
"com.tldraw.instance_presence": 6,
"com.tldraw.pointer": 1,
"com.tldraw.shape": 4,
"com.tldraw.user": 1,
"com.tldraw.asset.image": 6,
"com.tldraw.asset.video": 5,
"com.tldraw.asset.bookmark": 2,
"com.tldraw.shape.arrow": 8,
"com.tldraw.shape.bookmark": 2,
"com.tldraw.shape.draw": 4,
"com.tldraw.shape.embed": 4,
"com.tldraw.shape.frame": 1,
"com.tldraw.shape.geo": 11,
"com.tldraw.shape.group": 0,
"com.tldraw.shape.highlight": 3,
"com.tldraw.shape.image": 5,
"com.tldraw.shape.line": 5,
"com.tldraw.shape.note": 12,
"com.tldraw.shape.text": 4,
"com.tldraw.shape.video": 4,
"com.tldraw.binding.arrow": 1
}
},
"records": []
}
```
Use geo rectangles for most boxes, text shapes for headings, and simple arrows for flow. Avoid note shapes unless you know the current schema.
Records must be an array of valid tldraw records. Each record needs at least:
See [record templates](references/record-templates.md) for base records and shape JSON.
```json
{
"id": "shape:example",
"typeName": "shape"
}
```
## Validation Workflow
Actual records must also satisfy their shape-specific schema.
## Required Base Records
Include at least:
```json
{
"gridSize": 10,
"name": "",
"meta": {},
"id": "document:document",
"typeName": "document"
}
```
```json
{
"id": "page:page",
"name": "Page",
"index": "a1",
"meta": {},
"typeName": "page"
}
```
Every shape should have:
```json
{
"parentId": "page:page",
"index": "a2",
"typeName": "shape"
}
```
## Rich Text
Use tldraw rich text for labels.
Helper shape:
```json
{
"type": "doc",
"content": [
{
"type": "paragraph",
"content": [
{
"type": "text",
"text": "Label"
}
]
}
]
}
```
For compatibility, also include the legacy `text` prop with the same plain text.
## Geo Shape Template
Use geo shapes for most boxes. They are easier than notes and less fragile.
```json
{
"x": 100,
"y": 100,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:box",
"type": "geo",
"props": {
"geo": "rectangle",
"w": 260,
"h": 90,
"color": "blue",
"labelColor": "black",
"fill": "solid",
"dash": "solid",
"size": "m",
"font": "sans",
"align": "middle",
"verticalAlign": "middle",
"growY": 0,
"url": "",
"scale": 1,
"richText": {
"type": "doc",
"content": [
{
"type": "paragraph",
"content": [
{
"type": "text",
"text": "My box"
}
]
}
]
},
"text": "My box"
},
"parentId": "page:page",
"index": "a2",
"typeName": "shape"
}
```
## Text Shape Template
```json
{
"x": 100,
"y": 40,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:title",
"type": "text",
"props": {
"color": "black",
"size": "xl",
"font": "sans",
"textAlign": "start",
"autoSize": true,
"w": 900,
"scale": 1,
"richText": {
"type": "doc",
"content": [
{
"type": "paragraph",
"content": [
{
"type": "text",
"text": "Diagram Title"
}
]
}
]
},
"text": "Diagram Title"
},
"parentId": "page:page",
"index": "a1",
"typeName": "shape"
}
```
## Arrow Shape Template
Arrows require `labelColor`. Missing it can make tldraw report the file as corrupted.
```json
{
"x": 400,
"y": 200,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:arrow",
"type": "arrow",
"props": {
"kind": "arc",
"color": "black",
"labelColor": "black",
"fill": "none",
"dash": "solid",
"size": "m",
"arrowheadStart": "none",
"arrowheadEnd": "arrow",
"bend": 0,
"start": {
"x": 0,
"y": 0
},
"end": {
"x": 120,
"y": 0
},
"labelPosition": 0.5,
"font": "sans",
"scale": 1,
"richText": {
"type": "doc",
"content": [
{
"type": "paragraph",
"content": [
{
"type": "text",
"text": "label"
}
]
}
]
},
"text": "label"
},
"parentId": "page:page",
"index": "a3",
"typeName": "shape"
}
```
## Valid Colors
For `color` and `labelColor`, use only known tldraw colors:
- `black`
- `grey`
- `light-violet`
- `violet`
- `blue`
- `light-blue`
- `yellow`
- `orange`
- `green`
- `light-green`
- `light-red`
- `red`
- `white`
Invalid or missing `labelColor` can corrupt the file.
## Notes
Avoid note shapes unless you know the current schema. They require additional props and are more version-sensitive. Prefer geo rectangles with `fill: solid` or `fill: semi`.
## Validation
First validate JSON syntax:
1. Create or update `.tldr` with `apply_patch`.
2. Validate JSON syntax.
3. Validate records with `createTLStore({ snapshot })`.
4. Validate importer compatibility with `parseTldrawJsonFile`.
5. If tldraw says corrupted, trust tldraw and fix the specific schema path.
```bash
bun -e "const f='/path/to/file.tldr'; const data=await Bun.file(f).text(); JSON.parse(data); console.log('valid json')"
```
Then validate with tldraw's store/schema:
```bash
bun -e "import { createTLStore } from 'tldraw'; const file=JSON.parse(await Bun.file('/path/to/file.tldr').text()); const snapshot={store:Object.fromEntries(file.records.map(r=>[r.id,r])), schema:file.schema}; try { const store=createTLStore({snapshot}); console.log('ok records', store.allRecords().length); } catch(e){ console.error(e); process.exit(1)} process.exit(0)"
```
Finally validate with the same importer path tldraw uses for `.tldr` files:
```bash
bun -e "import { createTLStore } from 'tldraw'; import { parseTldrawJsonFile } from './node_modules/tldraw/dist-esm/lib/utils/tldr/file.mjs'; const json=await Bun.file('/path/to/file.tldr').text(); const schema=createTLStore().schema; const result=parseTldrawJsonFile({json, schema}); console.log(result.ok ? 'parse ok' : result.error); process.exit(result.ok ? 0 : 1)"
```
Run importer validation from the temp directory where tldraw is installed, for example `/tmp/opencode`.
## Common Corruption Causes
- File is valid JSON but does not match tldraw schema.
- Missing `labelColor` on arrow shapes.
- Missing required shape props for note shapes.
- Missing required shape props, especially on note shapes.
- Old schema sequence numbers copied from another tldraw version.
- Invalid color names.
- Shape `parentId` points to a missing page.
- Duplicate record IDs.
- Records object used instead of records array in `.tldr` file format.
- Records object used instead of records array.
- `richText` missing where current schemas expect it.
## Recommended Workflow
1. Create or update `.tldr` with `apply_patch`.
2. Run JSON validation.
3. Run `createTLStore({ snapshot })` validation.
4. Run `parseTldrawJsonFile` validation.
5. If tldraw says corrupted, trust tldraw and inspect the validation error path.
6. Fix the specific schema path, not just the JSON.
See [schema notes](references/schema-notes.md) for current sequence shape and valid colors.
## Practical Diagram Advice
- Use geo rectangles for sections and entities.
- Use text for headings.
- Use simple arrow shapes for flow.
- Keep IDs stable and readable: `shape:admin-review`, `shape:stream-category`.
- Use indexed records like `a1`, `a2`, `a3`.
- Keep diagrams editable rather than pixel-perfect.
- Use stable readable IDs, for example `shape:admin-review`.
- Use indexed records like `a1`, `a2`, `a3`.
- Keep labels in both `richText` and legacy `text` props for compatibility.
- Prefer simple layout and reliable schemas over fancy shapes.

View File

@@ -0,0 +1,159 @@
# tldraw Record Templates
## Required Base Records
```json
{
"gridSize": 10,
"name": "",
"meta": {},
"id": "document:document",
"typeName": "document"
}
```
```json
{
"id": "page:page",
"name": "Page",
"index": "a1",
"meta": {},
"typeName": "page"
}
```
Every shape should include:
```json
{
"parentId": "page:page",
"index": "a2",
"typeName": "shape"
}
```
## Rich Text Helper
```json
{
"type": "doc",
"content": [
{
"type": "paragraph",
"content": [
{
"type": "text",
"text": "Label"
}
]
}
]
}
```
Also include legacy `text` with the same plain text.
## Geo Shape
Use geo shapes for most boxes.
```json
{
"x": 100,
"y": 100,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:box",
"type": "geo",
"props": {
"geo": "rectangle",
"w": 260,
"h": 90,
"color": "blue",
"labelColor": "black",
"fill": "solid",
"dash": "solid",
"size": "m",
"font": "sans",
"align": "middle",
"verticalAlign": "middle",
"growY": 0,
"url": "",
"scale": 1,
"richText": {"type":"doc","content":[{"type":"paragraph","content":[{"type":"text","text":"My box"}]}]},
"text": "My box"
},
"parentId": "page:page",
"index": "a2",
"typeName": "shape"
}
```
## Text Shape
```json
{
"x": 100,
"y": 40,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:title",
"type": "text",
"props": {
"color": "black",
"size": "xl",
"font": "sans",
"textAlign": "start",
"autoSize": true,
"w": 900,
"scale": 1,
"richText": {"type":"doc","content":[{"type":"paragraph","content":[{"type":"text","text":"Diagram Title"}]}]},
"text": "Diagram Title"
},
"parentId": "page:page",
"index": "a1",
"typeName": "shape"
}
```
## Arrow Shape
Arrows require `labelColor`.
```json
{
"x": 400,
"y": 200,
"rotation": 0,
"isLocked": false,
"opacity": 1,
"meta": {},
"id": "shape:arrow",
"type": "arrow",
"props": {
"kind": "arc",
"color": "black",
"labelColor": "black",
"fill": "none",
"dash": "solid",
"size": "m",
"arrowheadStart": "none",
"arrowheadEnd": "arrow",
"bend": 0,
"start": {"x": 0, "y": 0},
"end": {"x": 120, "y": 0},
"labelPosition": 0.5,
"font": "sans",
"scale": 1,
"richText": {"type":"doc","content":[{"type":"paragraph","content":[{"type":"text","text":"label"}]}]},
"text": "label"
},
"parentId": "page:page",
"index": "a3",
"typeName": "shape"
}
```

View File

@@ -0,0 +1,61 @@
# tldraw Schema Notes
Current `.tldr` files use this top-level shape:
```json
{
"tldrawFileFormatVersion": 1,
"schema": {
"schemaVersion": 2,
"sequences": {
"com.tldraw.store": 5,
"com.tldraw.asset": 1,
"com.tldraw.camera": 1,
"com.tldraw.document": 2,
"com.tldraw.instance": 26,
"com.tldraw.instance_page_state": 5,
"com.tldraw.page": 1,
"com.tldraw.instance_presence": 6,
"com.tldraw.pointer": 1,
"com.tldraw.shape": 4,
"com.tldraw.user": 1,
"com.tldraw.asset.image": 6,
"com.tldraw.asset.video": 5,
"com.tldraw.asset.bookmark": 2,
"com.tldraw.shape.arrow": 8,
"com.tldraw.shape.bookmark": 2,
"com.tldraw.shape.draw": 4,
"com.tldraw.shape.embed": 4,
"com.tldraw.shape.frame": 1,
"com.tldraw.shape.geo": 11,
"com.tldraw.shape.group": 0,
"com.tldraw.shape.highlight": 3,
"com.tldraw.shape.image": 5,
"com.tldraw.shape.line": 5,
"com.tldraw.shape.note": 12,
"com.tldraw.shape.text": 4,
"com.tldraw.shape.video": 4,
"com.tldraw.binding.arrow": 1
}
},
"records": []
}
```
Valid `color` and `labelColor` values:
- `black`
- `grey`
- `light-violet`
- `violet`
- `blue`
- `light-blue`
- `yellow`
- `orange`
- `green`
- `light-green`
- `light-red`
- `red`
- `white`
Invalid or missing `labelColor` can corrupt the file.

View File

@@ -6,212 +6,46 @@ compatibility: opencode
# UI Variations
Use this skill when the user wants help exploring UI directions for a new or existing web UI surface and wants real code, not static mockups.
Use when the user wants real coded UI directions for a web UI surface, not static mockups.
This skill is optimized for browser-based apps first. Favor in-app previews that reuse the real design system, routing, and data flow. Fall back to an isolated preview page only when the target repo makes in-situ previewing awkward or risky.
## Defaults
## Default Behavior
- Ask at most one clarification if the target surface is unclear.
- Inspect repo docs, routing, components, styling, and design-system primitives before editing.
- Generate 3-5 materially different variations of only the requested surface.
- Present one active variation at a time with a floating bottom-centered switcher.
- Keep the previewed UI production-ready; do not render rationale or experiment notes inside it.
- Stop at preview by default. If the user picks a winner, integrate it and remove preview-only artifacts.
- Ask at most one short clarification if the target page, component, or feature area is unclear.
- Inspect the repository before editing. Look for design philosophy docs, design-system conventions, router structure, and existing component primitives.
- Only create variations of the UI surface the user asked for. Keep the rest of the page and surrounding experience stable unless the user explicitly asks for broader exploration.
- Generate 3-5 materially different variations.
- Present one variation at a time on a single preview surface.
- Add a floating bottom-centered variant switcher so the user can compare directions quickly.
- Make each variation look production-ready. Do not present wireframes, placeholder framing, or experiment-looking chrome unless the user explicitly asks for that level of fidelity.
- Keep rationale out of the UI itself. Do not render explanatory labels, design notes, or variation descriptions inside the previewed interface.
- If the `agent-browser` skill is available and the app can run locally, use it to visually inspect the rendered variants and refine obvious issues before asking the user to choose.
- Stop at the preview phase by default.
- If the user later picks a winner, fully integrate that choice and remove the preview-only artifacts.
## Preview Mode
## Inputs
1. Discover local UI language using `references/discovery-checklist.md`.
2. Choose the smallest honest preview surface: real feature/page, internal preview route, then isolated sandbox as fallback.
3. Build variations that differ in hierarchy, density, framing, action emphasis, typography, navigation, or control presentation.
4. Reuse the same content/data across variants and preserve surrounding page behavior.
5. Add a bottom-centered switcher with labels like `A`, `B`, `C`, visible active state, pointer-friendly controls, and mobile-safe placement.
6. Keep preview-only code grouped and easy to delete.
7. Run focused lint/typecheck/test/build checks available in the repo.
8. If the app can run and `agent-browser` is available, inspect desktop and mobile rendering and refine obvious visual issues.
9. Report preview route, files changed, variant labels with one-line rationale, and known gaps. Use `templates/preview-summary-template.md` when useful.
The skill can usually infer most of these from the repo. Only ask when a missing answer would block safe progress.
## Promotion Mode
| Input | Default | Ask only if needed |
|-------|---------|--------------------|
| Target UI surface | Infer from user request and repo structure | Yes |
| Number of variations | 4 | No |
| Preview surface | In-situ when practical, isolated when necessary | No |
| Device emphasis | Existing repo defaults | No |
| Promotion behavior | Wait for user selection | No |
When the user selects a winner:
## Workflow
1. Move the winning variation into the real production surface.
2. Rename experiment code so it reads like normal app code.
3. Update imports, tests, stories, snapshots, and callers as needed.
4. Remove preview route, switcher, losing variants, temporary helpers/fixtures, dead styles, and orphaned assets.
5. Verify the final integrated surface, not just the preview.
6. Report winner, integration location, removed artifacts, and checks run.
The work has two modes: preview mode and promotion mode.
### Preview Mode
#### 1. Discover the local UI language
Read the minimum necessary files to understand the app before editing:
- project docs such as `README*`, `docs/**`, `CONTRIBUTING*`, or design notes
- router and layout entry points
- component primitives, shared wrappers, and design tokens
- styling setup such as Tailwind config, CSS variables, theme providers, or UI libraries
- existing patterns for feature previews, internal-only routes, or story/demo pages
Use `references/discovery-checklist.md` as the discovery checklist.
#### 2. Choose the preview surface
Prefer the smallest correct surface:
- Best: preview the ideas inside the real feature/page so spacing, layout, and surrounding context stay honest.
- Acceptable: add a preview route or internal page that still uses the app's real shell.
- Fallback: add an isolated sandbox page if the app architecture makes in-situ previewing expensive.
If you add a dedicated preview route, follow local router conventions and use a clearly temporary path such as `/_ui/...`, `/internal/...`, or the repo's existing preview pattern.
#### 3. Build the variations
Create 3-5 variations that differ in meaningful ways, not just color or copy tweaks.
Only vary the requested surface. Do not add extra sections, side quests, marketing panels, helper callouts, or new feature ideas unless they are already part of the requested UI or the user explicitly asks for broader exploration.
Aim to vary things like:
- information hierarchy
- density and spacing
- framing and container treatment
- interaction emphasis and primary actions
- typographic rhythm
- navigation or control presentation
Keep the comparison fair:
- reuse the same content and data across all variations
- keep functional behavior as constant as possible
- share supporting data fixtures/helpers when that reduces duplication
- do not invent a new design system when the repo already has one
- do not put variation names, rationale text, or design commentary inside the UI itself
- make the UI feel shippable rather than like a design exercise
#### 4. Build the switcher
The preview must include a floating switcher anchored at the bottom center of the screen.
The switcher is the only comparison control that should visibly describe the variants. Keep the previewed UI itself clean and presentation-ready.
Requirements:
- one active variation at a time
- clearly labeled variants such as `A`, `B`, `C`, `D`
- visible active state
- pointer-friendly controls
- mobile-safe placement
- use local app styling rather than foreign widget styling
Nice to have when cheap:
- previous/next controls
- keyboard shortcuts
- query param or route state so the active variation is easy to share or refresh
#### 5. Keep the preview removable
Structure preview code so cleanup is straightforward after the user picks a winner.
Good patterns:
- keep preview-only components grouped in one local folder
- keep variant selection state local to the preview surface
- isolate temporary wrappers and comparison helpers from production code paths
Avoid building a permanent preview framework unless the repo already has one.
#### 6. Verify
Run the lightest relevant checks available in the target repo.
- validate changed files with the repo's normal lint, typecheck, or test commands when practical
- run a focused dev/build check if the app supports it
- if a local app can run and `agent-browser` is available, use it to confirm the preview route renders and the switcher works
- verify desktop and mobile layout if the surface is responsive
#### 7. Refine visually before handoff
If `agent-browser` is available, prefer one refinement pass before asking the user to choose.
- open the preview in the browser and inspect the rendered result rather than trusting code alone
- use visual inspection and available vision capabilities to catch awkward spacing, weak hierarchy, cramped mobile layouts, clipping, contrast issues, or component mismatches
- make targeted refinements so each variation looks polished and production-ready
- keep the refinements scoped to the requested surface rather than adding extra UI the user did not ask for
If `agent-browser` is not available, skip this step and hand off after normal code-level verification.
#### 8. Hand off for review
When the preview is ready, report:
- preview route or entry point
- files changed
- the variant labels and one-line rationale for each
- any known compromises or gaps
Use `templates/preview-summary-template.md` as the output structure when useful.
### Promotion Mode
When the user selects a winning variation, do not leave the repo in a "choose later" state. Finish the job.
#### 1. Promote the winner
- move the winning variation into the real production surface
- preserve the chosen layout, hierarchy, and interaction details
- adapt naming so the final code reads like normal app code, not experiment code
- update imports, tests, stories, snapshots, and callers that depend on the old structure
#### 2. Remove the temporary comparison layer
Delete or collapse preview-only artifacts that are no longer needed:
- preview routes or sandbox pages
- variant switcher UI
- losing variants
- temporary helpers, fixtures, and wrappers that only existed for comparison
- stale imports, dead styles, and orphaned assets
#### 3. Verify the final integrated result
- run the relevant checks again
- if the app can run locally, verify the real surface rather than only the preview route
- confirm no preview-only controls remain visible in the integrated experience
#### 4. Report completion
Summarize:
- which variation won
- where it was integrated
- which temporary artifacts were removed
- which checks were run
Use `references/promotion-checklist.md` as the final cleanup checklist.
Use `references/promotion-checklist.md` for cleanup.
## Guardrails
- Favor the app's real primitives, spacing system, tokens, and layout patterns.
- Make the smallest change set that still yields meaningfully different ideas.
- Limit changes to the user-requested surface unless broader page changes were explicitly requested.
- Do not overwrite unrelated work or refactor broadly just to host the preview.
- Keep each variation honest to the product's existing brand and interaction model unless the user explicitly asks for something more radical.
- Do not embed design explanations, variant descriptions, or self-referential experiment text inside the UI.
- Make the preview look production-ready enough that the chosen variation can be promoted with minimal cleanup.
- If visual browser inspection is available, use it to polish the variants before requesting user feedback.
- Do not stop at mockups once the user has chosen a direction. Integrate the winner and clean up the rest.
- Prefer code that is easy to delete or fold into production cleanly.
## References
| Reference | Purpose |
|-----------|---------|
| `references/discovery-checklist.md` | What to inspect before editing |
| `references/promotion-checklist.md` | What to remove and verify after a winner is chosen |
## Templates
| Template | Purpose |
|----------|---------|
| `templates/preview-summary-template.md` | Concise handoff summary for the preview phase |
- Favor the app's real primitives, tokens, spacing, layout, and interaction patterns.
- Do not broaden scope beyond the requested surface unless explicitly asked.
- Do not create a permanent preview framework unless the repo already has one.
- Do not overwrite unrelated work or refactor broadly to host the preview.
- Make each variation shippable enough that the winner can be promoted with minimal cleanup.

View File

@@ -5,211 +5,86 @@ description: Use when starting feature work that needs isolation from current wo
# Using Git Worktrees
## Overview
Ensure work happens in an isolated workspace when appropriate. Prefer platform-native worktree tools. Fall back to manual `git worktree` only when no native tool is available.
Ensure work happens in an isolated workspace. Prefer your platform's native worktree tools. Fall back to manual git worktrees only when no native tool is available.
Announce at start: "I'm using the using-git-worktrees skill to set up an isolated workspace."
**Core principle:** Detect existing isolation first. Then use native tools. Then fall back to git. Never fight the harness.
## Workflow
**Announce at start:** "I'm using the using-git-worktrees skill to set up an isolated workspace."
### 1. Detect Existing Isolation
## Step 0: Detect Existing Isolation
**Before creating anything, check if you are already in an isolated workspace.**
Before creating anything, check whether you are already in a linked worktree:
```bash
GIT_DIR=$(cd "$(git rev-parse --git-dir)" 2>/dev/null && pwd -P)
GIT_COMMON=$(cd "$(git rev-parse --git-common-dir)" 2>/dev/null && pwd -P)
BRANCH=$(git branch --show-current)
```
**Submodule guard:** `GIT_DIR != GIT_COMMON` is also true inside git submodules. Before concluding "already in a worktree," verify you are not in a submodule:
```bash
# If this returns a path, you're in a submodule, not a worktree — treat as normal repo
git rev-parse --show-superproject-working-tree 2>/dev/null
```
**If `GIT_DIR != GIT_COMMON` (and not a submodule):** You are already in a linked worktree. Skip to Step 3 (Project Setup). Do NOT create another worktree.
If `GIT_DIR != GIT_COMMON` and `show-superproject-working-tree` is empty, you are already in a linked worktree. Do not create another one. Report the path and branch, then continue to project setup.
Report with branch state:
- On a branch: "Already in isolated workspace at `<path>` on branch `<name>`."
- Detached HEAD: "Already in isolated workspace at `<path>` (detached HEAD, externally managed). Branch creation needed at finish time."
If `GIT_DIR == GIT_COMMON` or this is a submodule, treat it as a normal repo checkout.
**If `GIT_DIR == GIT_COMMON` (or in a submodule):** You are in a normal repo checkout.
### 2. Get Consent
Has the user already indicated their worktree preference in your instructions? If not, ask for consent before creating a worktree:
If the user has not already declared a worktree preference, ask:
> "Would you like me to set up an isolated worktree? It protects your current branch from changes."
> Would you like me to set up an isolated worktree? It protects your current branch from changes.
Honor any existing declared preference without asking. If the user declines consent, work in place and skip to Step 3.
If the user declines, work in place and continue to project setup.
## Step 1: Create Isolated Workspace
### 3. Create Isolated Workspace
**You have two mechanisms. Try them in this order.**
Use mechanisms in this order:
### 1a. Native Worktree Tools (preferred)
1. Native worktree tool if the platform provides one, such as `EnterWorktree`, `WorktreeCreate`, `/worktree`, or a `--worktree` flag.
2. Manual `git worktree add` only if no native tool exists.
The user has asked for an isolated workspace (Step 0 consent). Do you already have a way to create a worktree? It might be a tool with a name like `EnterWorktree`, `WorktreeCreate`, a `/worktree` command, or a `--worktree` flag. If you do, use it and skip to Step 3.
Manual directory priority:
Native tools handle directory placement, branch creation, and cleanup automatically. Using `git worktree add` when you have a native tool creates phantom state your harness can't see or manage.
1. Explicit user/instruction preference.
2. Existing project-local `.worktrees/`.
3. Existing project-local `worktrees/`.
4. Default to `.worktrees/` at the project root.
Only proceed to Step 1b if you have no native worktree tool available.
### 1b. Git Worktree Fallback
**Only use this if Step 1a does not apply** — you have no native worktree tool available. Create a worktree manually using git.
#### Directory Selection
Follow this priority order. Explicit user preference always beats observed filesystem state.
1. **Check your instructions for a declared worktree directory preference.** If the user has already specified one, use it without asking.
2. **Check for an existing project-local worktree directory:**
```bash
ls -d .worktrees 2>/dev/null # Preferred (hidden)
ls -d worktrees 2>/dev/null # Alternative
```
If found, use it. If both exist, `.worktrees` wins.
3. **Check for an existing global directory:**
```bash
project=$(basename "$(git rev-parse --show-toplevel)")
ls -d ~/.config/superpowers/worktrees/$project 2>/dev/null
```
If found, use it (backward compatibility with legacy global path).
4. **If there is no other guidance available**, default to `.worktrees/` at the project root.
#### Safety Verification (project-local directories only)
**MUST verify directory is ignored before creating worktree:**
Before creating a project-local worktree, verify the chosen directory is ignored:
```bash
git check-ignore -q .worktrees 2>/dev/null || git check-ignore -q worktrees 2>/dev/null
```
**If NOT ignored:** Add to .gitignore, commit the change, then proceed.
**Why critical:** Prevents accidentally committing worktree contents to repository.
Global directories (`~/.config/superpowers/worktrees/`) need no verification.
#### Create the Worktree
If not ignored, add it to `.gitignore`, commit that change, then proceed.
```bash
project=$(basename "$(git rev-parse --show-toplevel)")
# Determine path based on chosen location
# For project-local: path="$LOCATION/$BRANCH_NAME"
# For global: path="~/.config/superpowers/worktrees/$project/$BRANCH_NAME"
git worktree add "$path" -b "$BRANCH_NAME"
cd "$path"
```
**Sandbox fallback:** If `git worktree add` fails with a permission error (sandbox denial), tell the user the sandbox blocked worktree creation and you're working in the current directory instead. Then run setup and baseline tests in place.
If `git worktree add` is blocked by sandbox permissions, tell the user and work in the current directory instead.
## Step 3: Project Setup
### 4. Project Setup
Auto-detect and run appropriate setup:
Auto-detect setup where appropriate:
```bash
# Node.js
if [ -f package.json ]; then npm install; fi
# Rust
if [ -f Cargo.toml ]; then cargo build; fi
# Python
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f pyproject.toml ]; then poetry install; fi
# Go
if [ -f go.mod ]; then go mod download; fi
```
## Step 4: Verify Clean Baseline
### 5. Baseline Verification
Run tests to ensure workspace starts clean:
Run project-appropriate tests before implementation when practical: `npm test`, `cargo test`, `pytest`, `go test ./...`, or equivalent.
```bash
# Use project-appropriate command
npm test / cargo test / pytest / go test ./...
```
If tests fail, report failures and ask whether to proceed or investigate. If tests pass, report readiness.
**If tests fail:** Report failures, ask whether to proceed or investigate.
## Never
**If tests pass:** Report ready.
### Report
```
Worktree ready at <full-path>
Tests passing (<N> tests, 0 failures)
Ready to implement <feature-name>
```
## Quick Reference
| Situation | Action |
|-----------|--------|
| Already in linked worktree | Skip creation (Step 0) |
| In a submodule | Treat as normal repo (Step 0 guard) |
| Native worktree tool available | Use it (Step 1a) |
| No native tool | Git worktree fallback (Step 1b) |
| `.worktrees/` exists | Use it (verify ignored) |
| `worktrees/` exists | Use it (verify ignored) |
| Both exist | Use `.worktrees/` |
| Neither exists | Check instruction file, then default `.worktrees/` |
| Global path exists | Use it (backward compat) |
| Directory not ignored | Add to .gitignore + commit |
| Permission error on create | Sandbox fallback, work in place |
| Tests fail during baseline | Report failures + ask |
| No package.json/Cargo.toml | Skip dependency install |
## Common Mistakes
### Fighting the harness
- **Problem:** Using `git worktree add` when the platform already provides isolation
- **Fix:** Step 0 detects existing isolation. Step 1a defers to native tools.
### Skipping detection
- **Problem:** Creating a nested worktree inside an existing one
- **Fix:** Always run Step 0 before creating anything
### Skipping ignore verification
- **Problem:** Worktree contents get tracked, pollute git status
- **Fix:** Always use `git check-ignore` before creating project-local worktree
### Assuming directory location
- **Problem:** Creates inconsistency, violates project conventions
- **Fix:** Follow priority: existing > global legacy > instruction file > default
### Proceeding with failing tests
- **Problem:** Can't distinguish new bugs from pre-existing issues
- **Fix:** Report failures, get explicit permission to proceed
## Red Flags
**Never:**
- Create a worktree when Step 0 detects existing isolation
- Use `git worktree add` when you have a native worktree tool (e.g., `EnterWorktree`). This is the #1 mistake — if you have it, use it.
- Skip Step 1a by jumping straight to Step 1b's git commands
- Create worktree without verifying it's ignored (project-local)
- Skip baseline test verification
- Proceed with failing tests without asking
**Always:**
- Run Step 0 detection first
- Prefer native tools over git fallback
- Follow directory priority: existing > global legacy > instruction file > default
- Verify directory is ignored for project-local
- Auto-detect and run project setup
- Verify clean test baseline
- Create a worktree when already in a linked worktree.
- Use `git worktree add` when a native worktree tool exists.
- Skip submodule detection.
- Create a project-local worktree without ignore verification.
- Proceed from failing baseline tests without telling the user.