Compress agent skills
This commit is contained in:
66
agent-browser/references/advanced.md
Normal file
66
agent-browser/references/advanced.md
Normal file
@@ -0,0 +1,66 @@
|
||||
# Advanced Browser Modes
|
||||
|
||||
## Headed Debugging
|
||||
|
||||
```bash
|
||||
agent-browser --headed open https://example.com
|
||||
agent-browser highlight @e1
|
||||
agent-browser record start demo.webm
|
||||
agent-browser profiler start
|
||||
agent-browser profiler stop trace.json
|
||||
```
|
||||
|
||||
Or set `AGENT_BROWSER_HEADED=1`.
|
||||
|
||||
## Local Files
|
||||
|
||||
```bash
|
||||
agent-browser --allow-file-access open file:///path/to/document.pdf
|
||||
agent-browser --allow-file-access open file:///path/to/page.html
|
||||
agent-browser screenshot output.png
|
||||
```
|
||||
|
||||
## iOS Simulator
|
||||
|
||||
Requires macOS, Xcode, Appium, and the xcuitest driver.
|
||||
|
||||
```bash
|
||||
agent-browser device list
|
||||
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
|
||||
agent-browser -p ios snapshot -i
|
||||
agent-browser -p ios tap @e1
|
||||
agent-browser -p ios swipe up
|
||||
agent-browser -p ios screenshot mobile.png
|
||||
agent-browser -p ios close
|
||||
```
|
||||
|
||||
## Color Scheme
|
||||
|
||||
```bash
|
||||
agent-browser --color-scheme dark open https://example.com
|
||||
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
|
||||
agent-browser set media dark
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
Project config file: `agent-browser.json`.
|
||||
|
||||
```json
|
||||
{
|
||||
"headed": true,
|
||||
"proxy": "http://localhost:8080",
|
||||
"profile": "./browser-data"
|
||||
}
|
||||
```
|
||||
|
||||
Priority: user config < project config < environment variables < CLI flags.
|
||||
|
||||
## Engine Selection
|
||||
|
||||
```bash
|
||||
agent-browser --engine lightpanda open example.com
|
||||
export AGENT_BROWSER_ENGINE=lightpanda
|
||||
```
|
||||
|
||||
Supported engines include `chrome` and `lightpanda`. Lightpanda is faster but lacks some Chrome features such as extensions and profiles.
|
||||
@@ -1,264 +1,72 @@
|
||||
# Command Reference
|
||||
# agent-browser Commands
|
||||
|
||||
Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
|
||||
|
||||
## Navigation
|
||||
## Navigation And Snapshot
|
||||
|
||||
```bash
|
||||
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
|
||||
# Supports: https://, http://, file://, about:, data://
|
||||
# Auto-prepends https:// if no protocol given
|
||||
agent-browser back # Go back
|
||||
agent-browser forward # Go forward
|
||||
agent-browser reload # Reload page
|
||||
agent-browser close # Close browser (aliases: quit, exit)
|
||||
agent-browser connect 9222 # Connect to browser via CDP port
|
||||
agent-browser open <url>
|
||||
agent-browser close
|
||||
agent-browser snapshot -i
|
||||
agent-browser snapshot -i -C
|
||||
agent-browser snapshot -s "#selector"
|
||||
```
|
||||
|
||||
## Snapshot (page analysis)
|
||||
## Interaction
|
||||
|
||||
```bash
|
||||
agent-browser snapshot # Full accessibility tree
|
||||
agent-browser snapshot -i # Interactive elements only (recommended)
|
||||
agent-browser snapshot -c # Compact output
|
||||
agent-browser snapshot -d 3 # Limit depth to 3
|
||||
agent-browser snapshot -s "#main" # Scope to CSS selector
|
||||
agent-browser click @e1
|
||||
agent-browser click @e1 --new-tab
|
||||
agent-browser fill @e2 "text"
|
||||
agent-browser type @e2 "text"
|
||||
agent-browser select @e1 "option"
|
||||
agent-browser check @e1
|
||||
agent-browser press Enter
|
||||
agent-browser keyboard type "text"
|
||||
agent-browser keyboard inserttext "text"
|
||||
agent-browser scroll down 500
|
||||
agent-browser scroll down 500 --selector "div.content"
|
||||
```
|
||||
|
||||
## Interactions (use @refs from snapshot)
|
||||
## Information And Waits
|
||||
|
||||
```bash
|
||||
agent-browser click @e1 # Click
|
||||
agent-browser click @e1 --new-tab # Click and open in new tab
|
||||
agent-browser dblclick @e1 # Double-click
|
||||
agent-browser focus @e1 # Focus element
|
||||
agent-browser fill @e2 "text" # Clear and type
|
||||
agent-browser type @e2 "text" # Type without clearing
|
||||
agent-browser press Enter # Press key (alias: key)
|
||||
agent-browser press Control+a # Key combination
|
||||
agent-browser keydown Shift # Hold key down
|
||||
agent-browser keyup Shift # Release key
|
||||
agent-browser hover @e1 # Hover
|
||||
agent-browser check @e1 # Check checkbox
|
||||
agent-browser uncheck @e1 # Uncheck checkbox
|
||||
agent-browser select @e1 "value" # Select dropdown option
|
||||
agent-browser select @e1 "a" "b" # Select multiple options
|
||||
agent-browser scroll down 500 # Scroll page (default: down 300px)
|
||||
agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
|
||||
agent-browser drag @e1 @e2 # Drag and drop
|
||||
agent-browser upload @e1 file.pdf # Upload files
|
||||
agent-browser get text @e1
|
||||
agent-browser get url
|
||||
agent-browser get title
|
||||
agent-browser wait @e1
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser wait --url "**/page"
|
||||
agent-browser wait 2000
|
||||
```
|
||||
|
||||
## Get Information
|
||||
## Capture And Diff
|
||||
|
||||
```bash
|
||||
agent-browser get text @e1 # Get element text
|
||||
agent-browser get html @e1 # Get innerHTML
|
||||
agent-browser get value @e1 # Get input value
|
||||
agent-browser get attr @e1 href # Get attribute
|
||||
agent-browser get title # Get page title
|
||||
agent-browser get url # Get current URL
|
||||
agent-browser get count ".item" # Count matching elements
|
||||
agent-browser get box @e1 # Get bounding box
|
||||
agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
|
||||
agent-browser screenshot
|
||||
agent-browser screenshot --full
|
||||
agent-browser screenshot --annotate
|
||||
agent-browser pdf output.pdf
|
||||
agent-browser diff snapshot
|
||||
agent-browser diff snapshot --baseline before.txt
|
||||
agent-browser diff screenshot --baseline before.png
|
||||
agent-browser diff url <url1> <url2> --wait-until networkidle
|
||||
```
|
||||
|
||||
## Check State
|
||||
## Downloads
|
||||
|
||||
```bash
|
||||
agent-browser is visible @e1 # Check if visible
|
||||
agent-browser is enabled @e1 # Check if enabled
|
||||
agent-browser is checked @e1 # Check if checked
|
||||
agent-browser download @e1 ./file.pdf
|
||||
agent-browser wait --download ./output.zip
|
||||
agent-browser --download-path ./downloads open <url>
|
||||
```
|
||||
|
||||
## Screenshots and PDF
|
||||
## Semantic Locators
|
||||
|
||||
Use when refs are unavailable or unreliable:
|
||||
|
||||
```bash
|
||||
agent-browser screenshot # Save to temporary directory
|
||||
agent-browser screenshot path.png # Save to specific path
|
||||
agent-browser screenshot --full # Full page
|
||||
agent-browser pdf output.pdf # Save as PDF
|
||||
```
|
||||
|
||||
## Video Recording
|
||||
|
||||
```bash
|
||||
agent-browser record start ./demo.webm # Start recording
|
||||
agent-browser click @e1 # Perform actions
|
||||
agent-browser record stop # Stop and save video
|
||||
agent-browser record restart ./take2.webm # Stop current + start new
|
||||
```
|
||||
|
||||
## Wait
|
||||
|
||||
```bash
|
||||
agent-browser wait @e1 # Wait for element
|
||||
agent-browser wait 2000 # Wait milliseconds
|
||||
agent-browser wait --text "Success" # Wait for text (or -t)
|
||||
agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
|
||||
agent-browser wait --load networkidle # Wait for network idle (or -l)
|
||||
agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
|
||||
```
|
||||
|
||||
## Mouse Control
|
||||
|
||||
```bash
|
||||
agent-browser mouse move 100 200 # Move mouse
|
||||
agent-browser mouse down left # Press button
|
||||
agent-browser mouse up left # Release button
|
||||
agent-browser mouse wheel 100 # Scroll wheel
|
||||
```
|
||||
|
||||
## Semantic Locators (alternative to refs)
|
||||
|
||||
```bash
|
||||
agent-browser find role button click --name "Submit"
|
||||
agent-browser find text "Sign In" click
|
||||
agent-browser find text "Sign In" click --exact # Exact match only
|
||||
agent-browser find label "Email" fill "user@test.com"
|
||||
agent-browser find role button click --name "Submit"
|
||||
agent-browser find placeholder "Search" type "query"
|
||||
agent-browser find alt "Logo" click
|
||||
agent-browser find title "Close" click
|
||||
agent-browser find testid "submit-btn" click
|
||||
agent-browser find first ".item" click
|
||||
agent-browser find last ".item" click
|
||||
agent-browser find nth 2 "a" hover
|
||||
```
|
||||
|
||||
## Browser Settings
|
||||
|
||||
```bash
|
||||
agent-browser set viewport 1920 1080 # Set viewport size
|
||||
agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
|
||||
agent-browser set device "iPhone 14" # Emulate device
|
||||
agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
|
||||
agent-browser set offline on # Toggle offline mode
|
||||
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
|
||||
agent-browser set credentials user pass # HTTP basic auth (alias: auth)
|
||||
agent-browser set media dark # Emulate color scheme
|
||||
agent-browser set media light reduced-motion # Light mode + reduced motion
|
||||
```
|
||||
|
||||
## Cookies and Storage
|
||||
|
||||
```bash
|
||||
agent-browser cookies # Get all cookies
|
||||
agent-browser cookies set name value # Set cookie
|
||||
agent-browser cookies clear # Clear cookies
|
||||
agent-browser storage local # Get all localStorage
|
||||
agent-browser storage local key # Get specific key
|
||||
agent-browser storage local set k v # Set value
|
||||
agent-browser storage local clear # Clear all
|
||||
```
|
||||
|
||||
## Network
|
||||
|
||||
```bash
|
||||
agent-browser network route <url> # Intercept requests
|
||||
agent-browser network route <url> --abort # Block requests
|
||||
agent-browser network route <url> --body '{}' # Mock response
|
||||
agent-browser network unroute [url] # Remove routes
|
||||
agent-browser network requests # View tracked requests
|
||||
agent-browser network requests --filter api # Filter requests
|
||||
```
|
||||
|
||||
## Tabs and Windows
|
||||
|
||||
```bash
|
||||
agent-browser tab # List tabs
|
||||
agent-browser tab new [url] # New tab
|
||||
agent-browser tab 2 # Switch to tab by index
|
||||
agent-browser tab close # Close current tab
|
||||
agent-browser tab close 2 # Close tab by index
|
||||
agent-browser window new # New window
|
||||
```
|
||||
|
||||
## Frames
|
||||
|
||||
```bash
|
||||
agent-browser frame "#iframe" # Switch to iframe
|
||||
agent-browser frame main # Back to main frame
|
||||
```
|
||||
|
||||
## Dialogs
|
||||
|
||||
```bash
|
||||
agent-browser dialog accept [text] # Accept dialog
|
||||
agent-browser dialog dismiss # Dismiss dialog
|
||||
```
|
||||
|
||||
## JavaScript
|
||||
|
||||
```bash
|
||||
agent-browser eval "document.title" # Simple expressions only
|
||||
agent-browser eval -b "<base64>" # Any JavaScript (base64 encoded)
|
||||
agent-browser eval --stdin # Read script from stdin
|
||||
```
|
||||
|
||||
Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
|
||||
|
||||
```bash
|
||||
# Base64 encode your script, then:
|
||||
agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
|
||||
|
||||
# Or use stdin with heredoc for multiline scripts:
|
||||
cat <<'EOF' | agent-browser eval --stdin
|
||||
const links = document.querySelectorAll('a');
|
||||
Array.from(links).map(a => a.href);
|
||||
EOF
|
||||
```
|
||||
|
||||
## State Management
|
||||
|
||||
```bash
|
||||
agent-browser state save auth.json # Save cookies, storage, auth state
|
||||
agent-browser state load auth.json # Restore saved state
|
||||
```
|
||||
|
||||
## Global Options
|
||||
|
||||
```bash
|
||||
agent-browser --session <name> ... # Isolated browser session
|
||||
agent-browser --json ... # JSON output for parsing
|
||||
agent-browser --headed ... # Show browser window (not headless)
|
||||
agent-browser --full ... # Full page screenshot (-f)
|
||||
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
|
||||
agent-browser -p <provider> ... # Cloud browser provider (--provider)
|
||||
agent-browser --proxy <url> ... # Use proxy server
|
||||
agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
|
||||
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
|
||||
agent-browser --executable-path <p> # Custom browser executable
|
||||
agent-browser --extension <path> ... # Load browser extension (repeatable)
|
||||
agent-browser --ignore-https-errors # Ignore SSL certificate errors
|
||||
agent-browser --help # Show help (-h)
|
||||
agent-browser --version # Show version (-V)
|
||||
agent-browser <command> --help # Show detailed help for a command
|
||||
```
|
||||
|
||||
## Debugging
|
||||
|
||||
```bash
|
||||
agent-browser --headed open example.com # Show browser window
|
||||
agent-browser --cdp 9222 snapshot # Connect via CDP port
|
||||
agent-browser connect 9222 # Alternative: connect command
|
||||
agent-browser console # View console messages
|
||||
agent-browser console --clear # Clear console
|
||||
agent-browser errors # View page errors
|
||||
agent-browser errors --clear # Clear errors
|
||||
agent-browser highlight @e1 # Highlight element
|
||||
agent-browser trace start # Start recording trace
|
||||
agent-browser trace stop trace.zip # Stop and save trace
|
||||
agent-browser profiler start # Start Chrome DevTools profiling
|
||||
agent-browser profiler stop trace.json # Stop and save profile
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
```bash
|
||||
AGENT_BROWSER_SESSION="mysession" # Default session name
|
||||
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
|
||||
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
|
||||
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
|
||||
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
|
||||
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
|
||||
```
|
||||
|
||||
34
agent-browser/references/eval.md
Normal file
34
agent-browser/references/eval.md
Normal file
@@ -0,0 +1,34 @@
|
||||
# JavaScript Evaluation
|
||||
|
||||
Use `eval` to run JavaScript in the browser context.
|
||||
|
||||
Simple expressions can use regular shell quoting:
|
||||
|
||||
```bash
|
||||
agent-browser eval 'document.title'
|
||||
agent-browser eval 'document.querySelectorAll("img").length'
|
||||
```
|
||||
|
||||
For complex JavaScript, use `--stdin` to avoid shell quoting corruption:
|
||||
|
||||
```bash
|
||||
agent-browser eval --stdin <<'EVALEOF'
|
||||
JSON.stringify(
|
||||
Array.from(document.querySelectorAll("img"))
|
||||
.filter(i => !i.alt)
|
||||
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
|
||||
)
|
||||
EVALEOF
|
||||
```
|
||||
|
||||
Use base64 for generated scripts:
|
||||
|
||||
```bash
|
||||
agent-browser eval -b "$(printf '%s' 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
|
||||
```
|
||||
|
||||
Rules of thumb:
|
||||
|
||||
- Single-line with no nested quotes: regular `eval 'expression'`.
|
||||
- Nested quotes, arrow functions, template literals, or multiline: `eval --stdin`.
|
||||
- Programmatic scripts: `eval -b`.
|
||||
43
agent-browser/references/security.md
Normal file
43
agent-browser/references/security.md
Normal file
@@ -0,0 +1,43 @@
|
||||
# Security And Boundaries
|
||||
|
||||
By default, `agent-browser` imposes no navigation, action, or output restrictions.
|
||||
|
||||
## Content Boundaries
|
||||
|
||||
Wrap page-sourced output so agents can distinguish untrusted page content:
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
|
||||
agent-browser snapshot
|
||||
```
|
||||
|
||||
## Domain Allowlist
|
||||
|
||||
Restrict navigation and subresource connections:
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
|
||||
agent-browser open https://example.com
|
||||
```
|
||||
|
||||
Include CDN domains the page needs.
|
||||
|
||||
## Action Policy
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_ACTION_POLICY=./policy.json
|
||||
```
|
||||
|
||||
Example policy:
|
||||
|
||||
```json
|
||||
{"default":"deny","allow":["navigate","snapshot","click","scroll","wait","get"]}
|
||||
```
|
||||
|
||||
Auth vault operations bypass action policy, but domain allowlist still applies.
|
||||
|
||||
## Output Limits
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_MAX_OUTPUT=50000
|
||||
```
|
||||
49
agent-browser/references/sessions-auth.md
Normal file
49
agent-browser/references/sessions-auth.md
Normal file
@@ -0,0 +1,49 @@
|
||||
# Sessions And Authentication
|
||||
|
||||
## Named Sessions
|
||||
|
||||
```bash
|
||||
agent-browser --session site1 open https://site-a.com
|
||||
agent-browser --session site2 open https://site-b.com
|
||||
agent-browser --session site1 snapshot -i
|
||||
agent-browser session list
|
||||
agent-browser --session site1 close
|
||||
```
|
||||
|
||||
## Auth Vault
|
||||
|
||||
Pipe passwords via stdin to avoid shell history exposure:
|
||||
|
||||
```bash
|
||||
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
|
||||
agent-browser auth login github
|
||||
agent-browser auth list
|
||||
agent-browser auth show github
|
||||
agent-browser auth delete github
|
||||
```
|
||||
|
||||
## State Files
|
||||
|
||||
```bash
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "$USERNAME"
|
||||
agent-browser fill @e2 "$PASSWORD"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --url "**/dashboard"
|
||||
agent-browser state save auth.json
|
||||
agent-browser state load auth.json
|
||||
```
|
||||
|
||||
## Session Persistence
|
||||
|
||||
```bash
|
||||
agent-browser --session-name myapp open https://app.example.com/login
|
||||
agent-browser close
|
||||
agent-browser --session-name myapp open https://app.example.com/dashboard
|
||||
agent-browser state list
|
||||
agent-browser state clear myapp
|
||||
agent-browser state clean --older-than 7
|
||||
```
|
||||
|
||||
Set `AGENT_BROWSER_ENCRYPTION_KEY` to encrypt stored state at rest.
|
||||
Reference in New Issue
Block a user