feat: sync agent-browser skill with upstream vercel-labs/agent-browser

Update SKILL.md to match the latest upstream skill from vercel-labs/agent-browser, adding substantial new capabilities: - Authentication (auth vault, profiles, session persistence, state files) - Command chaining, annotated screenshots, diffing - Security features (content boundaries, domain allowlist, action policy) - iOS Simulator support, Lightpanda engine, downloads, clipboard - JS eval improvements (--stdin, -b for shell safety) - Timeout guidance, config files, session cleanup Add 7 reference docs (commands, authentication, snapshot-refs, session-management, video-recording, profiling, proxy-support) and 3 ready-to-use shell templates. Kept our YAML frontmatter, setup check section, and Playwright MCP comparison table which are unique to our plugin context.
2026-03-14 20:08:27 -07:00
parent 7c04c3158f
commit 24860ec3f1
11 changed files with 2260 additions and 240 deletions
--- a/plugins/compound-engineering/skills/agent-browser/references/authentication.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/authentication.md
@@ -0,0 +1,303 @@
+# Authentication Patterns
+
+Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
+
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Import Auth from Your Browser](#import-auth-from-your-browser)
+- [Persistent Profiles](#persistent-profiles)
+- [Session Persistence](#session-persistence)
+- [Basic Login Flow](#basic-login-flow)
+- [Saving Authentication State](#saving-authentication-state)
+- [Restoring Authentication](#restoring-authentication)
+- [OAuth / SSO Flows](#oauth--sso-flows)
+- [Two-Factor Authentication](#two-factor-authentication)
+- [HTTP Basic Auth](#http-basic-auth)
+- [Cookie-Based Auth](#cookie-based-auth)
+- [Token Refresh Handling](#token-refresh-handling)
+- [Security Best Practices](#security-best-practices)
+
+## Import Auth from Your Browser
+
+The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into.
+
+**Step 1: Start Chrome with remote debugging**
+
+```bash
+# macOS
+"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
+
+# Linux
+google-chrome --remote-debugging-port=9222
+
+# Windows
+"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
+```
+
+Log in to your target site(s) in this Chrome window as you normally would.
+
+> **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done.
+
+**Step 2: Grab the auth state**
+
+```bash
+# Auto-discover the running Chrome and save its cookies + localStorage
+agent-browser --auto-connect state save ./my-auth.json
+```
+
+**Step 3: Reuse in automation**
+
+```bash
+# Load auth at launch
+agent-browser --state ./my-auth.json open https://app.example.com/dashboard
+
+# Or load into an existing session
+agent-browser state load ./my-auth.json
+agent-browser open https://app.example.com/dashboard
+```
+
+This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies.
+
+> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices).
+
+**Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts:
+
+```bash
+agent-browser --session-name myapp state load ./my-auth.json
+# From now on, state is auto-saved/restored for "myapp"
+```
+
+## Persistent Profiles
+
+Use `--profile` to point agent-browser at a Chrome user data directory. This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load:
+
+```bash
+# First run: login once
+agent-browser --profile ~/.myapp-profile open https://app.example.com/login
+# ... complete login flow ...
+
+# All subsequent runs: already authenticated
+agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard
+```
+
+Use different paths for different projects or test users:
+
+```bash
+agent-browser --profile ~/.profiles/admin open https://app.example.com
+agent-browser --profile ~/.profiles/viewer open https://app.example.com
+```
+
+Or set via environment variable:
+
+```bash
+export AGENT_BROWSER_PROFILE=~/.myapp-profile
+agent-browser open https://app.example.com/dashboard
+```
+
+## Session Persistence
+
+Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files:
+
+```bash
+# Auto-saves state on close, auto-restores on next launch
+agent-browser --session-name twitter open https://twitter.com
+# ... login flow ...
+agent-browser close  # state saved to ~/.agent-browser/sessions/
+
+# Next time: state is automatically restored
+agent-browser --session-name twitter open https://twitter.com
+```
+
+Encrypt state at rest:
+
+```bash
+export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
+agent-browser --session-name secure open https://app.example.com
+```
+
+## Basic Login Flow
+
+```bash
+# Navigate to login page
+agent-browser open https://app.example.com/login
+agent-browser wait --load networkidle
+
+# Get form elements
+agent-browser snapshot -i
+# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In"
+
+# Fill credentials
+agent-browser fill @e1 "user@example.com"
+agent-browser fill @e2 "password123"
+
+# Submit
+agent-browser click @e3
+agent-browser wait --load networkidle
+
+# Verify login succeeded
+agent-browser get url  # Should be dashboard, not login
+```
+
+## Saving Authentication State
+
+After logging in, save state for reuse:
+
+```bash
+# Login first (see above)
+agent-browser open https://app.example.com/login
+agent-browser snapshot -i
+agent-browser fill @e1 "user@example.com"
+agent-browser fill @e2 "password123"
+agent-browser click @e3
+agent-browser wait --url "**/dashboard"
+
+# Save authenticated state
+agent-browser state save ./auth-state.json
+```
+
+## Restoring Authentication
+
+Skip login by loading saved state:
+
+```bash
+# Load saved auth state
+agent-browser state load ./auth-state.json
+
+# Navigate directly to protected page
+agent-browser open https://app.example.com/dashboard
+
+# Verify authenticated
+agent-browser snapshot -i
+```
+
+## OAuth / SSO Flows
+
+For OAuth redirects:
+
+```bash
+# Start OAuth flow
+agent-browser open https://app.example.com/auth/google
+
+# Handle redirects automatically
+agent-browser wait --url "**/accounts.google.com**"
+agent-browser snapshot -i
+
+# Fill Google credentials
+agent-browser fill @e1 "user@gmail.com"
+agent-browser click @e2  # Next button
+agent-browser wait 2000
+agent-browser snapshot -i
+agent-browser fill @e3 "password"
+agent-browser click @e4  # Sign in
+
+# Wait for redirect back
+agent-browser wait --url "**/app.example.com**"
+agent-browser state save ./oauth-state.json
+```
+
+## Two-Factor Authentication
+
+Handle 2FA with manual intervention:
+
+```bash
+# Login with credentials
+agent-browser open https://app.example.com/login --headed  # Show browser
+agent-browser snapshot -i
+agent-browser fill @e1 "user@example.com"
+agent-browser fill @e2 "password123"
+agent-browser click @e3
+
+# Wait for user to complete 2FA manually
+echo "Complete 2FA in the browser window..."
+agent-browser wait --url "**/dashboard" --timeout 120000
+
+# Save state after 2FA
+agent-browser state save ./2fa-state.json
+```
+
+## HTTP Basic Auth
+
+For sites using HTTP Basic Authentication:
+
+```bash
+# Set credentials before navigation
+agent-browser set credentials username password
+
+# Navigate to protected resource
+agent-browser open https://protected.example.com/api
+```
+
+## Cookie-Based Auth
+
+Manually set authentication cookies:
+
+```bash
+# Set auth cookie
+agent-browser cookies set session_token "abc123xyz"
+
+# Navigate to protected page
+agent-browser open https://app.example.com/dashboard
+```
+
+## Token Refresh Handling
+
+For sessions with expiring tokens:
+
+```bash
+#!/bin/bash
+# Wrapper that handles token refresh
+
+STATE_FILE="./auth-state.json"
+
+# Try loading existing state
+if [[ -f "$STATE_FILE" ]]; then
+    agent-browser state load "$STATE_FILE"
+    agent-browser open https://app.example.com/dashboard
+
+    # Check if session is still valid
+    URL=$(agent-browser get url)
+    if [[ "$URL" == *"/login"* ]]; then
+        echo "Session expired, re-authenticating..."
+        # Perform fresh login
+        agent-browser snapshot -i
+        agent-browser fill @e1 "$USERNAME"
+        agent-browser fill @e2 "$PASSWORD"
+        agent-browser click @e3
+        agent-browser wait --url "**/dashboard"
+        agent-browser state save "$STATE_FILE"
+    fi
+else
+    # First-time login
+    agent-browser open https://app.example.com/login
+    # ... login flow ...
+fi
+```
+
+## Security Best Practices
+
+1. **Never commit state files** - They contain session tokens
+   ```bash
+   echo "*.auth-state.json" >> .gitignore
+   ```
+
+2. **Use environment variables for credentials**
+   ```bash
+   agent-browser fill @e1 "$APP_USERNAME"
+   agent-browser fill @e2 "$APP_PASSWORD"
+   ```
+
+3. **Clean up after automation**
+   ```bash
+   agent-browser cookies clear
+   rm -f ./auth-state.json
+   ```
+
+4. **Use short-lived sessions for CI/CD**
+   ```bash
+   # Don't persist state in CI
+   agent-browser open https://app.example.com/login
+   # ... login and perform actions ...
+   agent-browser close  # Session ends, nothing persisted
+   ```
--- a/plugins/compound-engineering/skills/agent-browser/references/commands.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/commands.md
@@ -0,0 +1,266 @@
+# Command Reference
+
+Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
+
+## Navigation
+
+```bash
+agent-browser open <url>      # Navigate to URL (aliases: goto, navigate)
+                              # Supports: https://, http://, file://, about:, data://
+                              # Auto-prepends https:// if no protocol given
+agent-browser back            # Go back
+agent-browser forward         # Go forward
+agent-browser reload          # Reload page
+agent-browser close           # Close browser (aliases: quit, exit)
+agent-browser connect 9222    # Connect to browser via CDP port
+```
+
+## Snapshot (page analysis)
+
+```bash
+agent-browser snapshot            # Full accessibility tree
+agent-browser snapshot -i         # Interactive elements only (recommended)
+agent-browser snapshot -c         # Compact output
+agent-browser snapshot -d 3       # Limit depth to 3
+agent-browser snapshot -s "#main" # Scope to CSS selector
+```
+
+## Interactions (use @refs from snapshot)
+
+```bash
+agent-browser click @e1           # Click
+agent-browser click @e1 --new-tab # Click and open in new tab
+agent-browser dblclick @e1        # Double-click
+agent-browser focus @e1           # Focus element
+agent-browser fill @e2 "text"     # Clear and type
+agent-browser type @e2 "text"     # Type without clearing
+agent-browser press Enter         # Press key (alias: key)
+agent-browser press Control+a     # Key combination
+agent-browser keydown Shift       # Hold key down
+agent-browser keyup Shift         # Release key
+agent-browser hover @e1           # Hover
+agent-browser check @e1           # Check checkbox
+agent-browser uncheck @e1         # Uncheck checkbox
+agent-browser select @e1 "value"  # Select dropdown option
+agent-browser select @e1 "a" "b"  # Select multiple options
+agent-browser scroll down 500     # Scroll page (default: down 300px)
+agent-browser scrollintoview @e1  # Scroll element into view (alias: scrollinto)
+agent-browser drag @e1 @e2        # Drag and drop
+agent-browser upload @e1 file.pdf # Upload files
+```
+
+## Get Information
+
+```bash
+agent-browser get text @e1        # Get element text
+agent-browser get html @e1        # Get innerHTML
+agent-browser get value @e1       # Get input value
+agent-browser get attr @e1 href   # Get attribute
+agent-browser get title           # Get page title
+agent-browser get url             # Get current URL
+agent-browser get cdp-url         # Get CDP WebSocket URL
+agent-browser get count ".item"   # Count matching elements
+agent-browser get box @e1         # Get bounding box
+agent-browser get styles @e1      # Get computed styles (font, color, bg, etc.)
+```
+
+## Check State
+
+```bash
+agent-browser is visible @e1      # Check if visible
+agent-browser is enabled @e1      # Check if enabled
+agent-browser is checked @e1      # Check if checked
+```
+
+## Screenshots and PDF
+
+```bash
+agent-browser screenshot          # Save to temporary directory
+agent-browser screenshot path.png # Save to specific path
+agent-browser screenshot --full   # Full page
+agent-browser pdf output.pdf      # Save as PDF
+```
+
+## Video Recording
+
+```bash
+agent-browser record start ./demo.webm    # Start recording
+agent-browser click @e1                   # Perform actions
+agent-browser record stop                 # Stop and save video
+agent-browser record restart ./take2.webm # Stop current + start new
+```
+
+## Wait
+
+```bash
+agent-browser wait @e1                     # Wait for element
+agent-browser wait 2000                    # Wait milliseconds
+agent-browser wait --text "Success"        # Wait for text (or -t)
+agent-browser wait --url "**/dashboard"    # Wait for URL pattern (or -u)
+agent-browser wait --load networkidle      # Wait for network idle (or -l)
+agent-browser wait --fn "window.ready"     # Wait for JS condition (or -f)
+```
+
+## Mouse Control
+
+```bash
+agent-browser mouse move 100 200      # Move mouse
+agent-browser mouse down left         # Press button
+agent-browser mouse up left           # Release button
+agent-browser mouse wheel 100         # Scroll wheel
+```
+
+## Semantic Locators (alternative to refs)
+
+```bash
+agent-browser find role button click --name "Submit"
+agent-browser find text "Sign In" click
+agent-browser find text "Sign In" click --exact      # Exact match only
+agent-browser find label "Email" fill "user@test.com"
+agent-browser find placeholder "Search" type "query"
+agent-browser find alt "Logo" click
+agent-browser find title "Close" click
+agent-browser find testid "submit-btn" click
+agent-browser find first ".item" click
+agent-browser find last ".item" click
+agent-browser find nth 2 "a" hover
+```
+
+## Browser Settings
+
+```bash
+agent-browser set viewport 1920 1080          # Set viewport size
+agent-browser set viewport 1920 1080 2        # 2x retina (same CSS size, higher res screenshots)
+agent-browser set device "iPhone 14"          # Emulate device
+agent-browser set geo 37.7749 -122.4194       # Set geolocation (alias: geolocation)
+agent-browser set offline on                  # Toggle offline mode
+agent-browser set headers '{"X-Key":"v"}'     # Extra HTTP headers
+agent-browser set credentials user pass       # HTTP basic auth (alias: auth)
+agent-browser set media dark                  # Emulate color scheme
+agent-browser set media light reduced-motion  # Light mode + reduced motion
+```
+
+## Cookies and Storage
+
+```bash
+agent-browser cookies                     # Get all cookies
+agent-browser cookies set name value      # Set cookie
+agent-browser cookies clear               # Clear cookies
+agent-browser storage local               # Get all localStorage
+agent-browser storage local key           # Get specific key
+agent-browser storage local set k v       # Set value
+agent-browser storage local clear         # Clear all
+```
+
+## Network
+
+```bash
+agent-browser network route <url>              # Intercept requests
+agent-browser network route <url> --abort      # Block requests
+agent-browser network route <url> --body '{}'  # Mock response
+agent-browser network unroute [url]            # Remove routes
+agent-browser network requests                 # View tracked requests
+agent-browser network requests --filter api    # Filter requests
+```
+
+## Tabs and Windows
+
+```bash
+agent-browser tab                 # List tabs
+agent-browser tab new [url]       # New tab
+agent-browser tab 2               # Switch to tab by index
+agent-browser tab close           # Close current tab
+agent-browser tab close 2         # Close tab by index
+agent-browser window new          # New window
+```
+
+## Frames
+
+```bash
+agent-browser frame "#iframe"     # Switch to iframe
+agent-browser frame main          # Back to main frame
+```
+
+## Dialogs
+
+```bash
+agent-browser dialog accept [text]  # Accept dialog
+agent-browser dialog dismiss        # Dismiss dialog
+```
+
+## JavaScript
+
+```bash
+agent-browser eval "document.title"          # Simple expressions only
+agent-browser eval -b "<base64>"             # Any JavaScript (base64 encoded)
+agent-browser eval --stdin                   # Read script from stdin
+```
+
+Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
+
+```bash
+# Base64 encode your script, then:
+agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
+
+# Or use stdin with heredoc for multiline scripts:
+cat <<'EOF' | agent-browser eval --stdin
+const links = document.querySelectorAll('a');
+Array.from(links).map(a => a.href);
+EOF
+```
+
+## State Management
+
+```bash
+agent-browser state save auth.json    # Save cookies, storage, auth state
+agent-browser state load auth.json    # Restore saved state
+```
+
+## Global Options
+
+```bash
+agent-browser --session <name> ...    # Isolated browser session
+agent-browser --json ...              # JSON output for parsing
+agent-browser --headed ...            # Show browser window (not headless)
+agent-browser --full ...              # Full page screenshot (-f)
+agent-browser --cdp <port> ...        # Connect via Chrome DevTools Protocol
+agent-browser -p <provider> ...       # Cloud browser provider (--provider)
+agent-browser --proxy <url> ...       # Use proxy server
+agent-browser --proxy-bypass <hosts>  # Hosts to bypass proxy
+agent-browser --headers <json> ...    # HTTP headers scoped to URL's origin
+agent-browser --executable-path <p>   # Custom browser executable
+agent-browser --extension <path> ...  # Load browser extension (repeatable)
+agent-browser --ignore-https-errors   # Ignore SSL certificate errors
+agent-browser --help                  # Show help (-h)
+agent-browser --version               # Show version (-V)
+agent-browser <command> --help        # Show detailed help for a command
+```
+
+## Debugging
+
+```bash
+agent-browser --headed open example.com   # Show browser window
+agent-browser --cdp 9222 snapshot         # Connect via CDP port
+agent-browser connect 9222                # Alternative: connect command
+agent-browser console                     # View console messages
+agent-browser console --clear             # Clear console
+agent-browser errors                      # View page errors
+agent-browser errors --clear              # Clear errors
+agent-browser highlight @e1               # Highlight element
+agent-browser inspect                     # Open Chrome DevTools for this session
+agent-browser trace start                 # Start recording trace
+agent-browser trace stop trace.zip        # Stop and save trace
+agent-browser profiler start              # Start Chrome DevTools profiling
+agent-browser profiler stop trace.json    # Stop and save profile
+```
+
+## Environment Variables
+
+```bash
+AGENT_BROWSER_SESSION="mysession"            # Default session name
+AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
+AGENT_BROWSER_EXTENSIONS="/ext1,/ext2"       # Comma-separated extension paths
+AGENT_BROWSER_PROVIDER="browserbase"         # Cloud browser provider
+AGENT_BROWSER_STREAM_PORT="9223"             # WebSocket streaming port
+AGENT_BROWSER_HOME="/path/to/agent-browser"  # Custom install location
+```
--- a/plugins/compound-engineering/skills/agent-browser/references/profiling.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/profiling.md
@@ -0,0 +1,120 @@
+# Profiling
+
+Capture Chrome DevTools performance profiles during browser automation for performance analysis.
+
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Profiling](#basic-profiling)
+- [Profiler Commands](#profiler-commands)
+- [Categories](#categories)
+- [Use Cases](#use-cases)
+- [Output Format](#output-format)
+- [Viewing Profiles](#viewing-profiles)
+- [Limitations](#limitations)
+
+## Basic Profiling
+
+```bash
+# Start profiling
+agent-browser profiler start
+
+# Perform actions
+agent-browser navigate https://example.com
+agent-browser click "#button"
+agent-browser wait 1000
+
+# Stop and save
+agent-browser profiler stop ./trace.json
+```
+
+## Profiler Commands
+
+```bash
+# Start profiling with default categories
+agent-browser profiler start
+
+# Start with custom trace categories
+agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing"
+
+# Stop profiling and save to file
+agent-browser profiler stop ./trace.json
+```
+
+## Categories
+
+The `--categories` flag accepts a comma-separated list of Chrome trace categories. Default categories include:
+
+- `devtools.timeline` -- standard DevTools performance traces
+- `v8.execute` -- time spent running JavaScript
+- `blink` -- renderer events
+- `blink.user_timing` -- `performance.mark()` / `performance.measure()` calls
+- `latencyInfo` -- input-to-latency tracking
+- `renderer.scheduler` -- task scheduling and execution
+- `toplevel` -- broad-spectrum basic events
+
+Several `disabled-by-default-*` categories are also included for detailed timeline, call stack, and V8 CPU profiling data.
+
+## Use Cases
+
+### Diagnosing Slow Page Loads
+
+```bash
+agent-browser profiler start
+agent-browser navigate https://app.example.com
+agent-browser wait --load networkidle
+agent-browser profiler stop ./page-load-profile.json
+```
+
+### Profiling User Interactions
+
+```bash
+agent-browser navigate https://app.example.com
+agent-browser profiler start
+agent-browser click "#submit"
+agent-browser wait 2000
+agent-browser profiler stop ./interaction-profile.json
+```
+
+### CI Performance Regression Checks
+
+```bash
+#!/bin/bash
+agent-browser profiler start
+agent-browser navigate https://app.example.com
+agent-browser wait --load networkidle
+agent-browser profiler stop "./profiles/build-${BUILD_ID}.json"
+```
+
+## Output Format
+
+The output is a JSON file in Chrome Trace Event format:
+
+```json
+{
+  "traceEvents": [
+    { "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100 },
+    ...
+  ],
+  "metadata": {
+    "clock-domain": "LINUX_CLOCK_MONOTONIC"
+  }
+}
+```
+
+The `metadata.clock-domain` field is set based on the host platform (Linux or macOS). On Windows it is omitted.
+
+## Viewing Profiles
+
+Load the output JSON file in any of these tools:
+
+- **Chrome DevTools**: Performance panel > Load profile (Ctrl+Shift+I > Performance)
+- **Perfetto UI**: https://ui.perfetto.dev/ -- drag and drop the JSON file
+- **Trace Viewer**: `chrome://tracing` in any Chromium browser
+
+## Limitations
+
+- Only works with Chromium-based browsers (Chrome, Edge). Not supported on Firefox or WebKit.
+- Trace data accumulates in memory while profiling is active (capped at 5 million events). Stop profiling promptly after the area of interest.
+- Data collection on stop has a 30-second timeout. If the browser is unresponsive, the stop command may fail.
--- a/plugins/compound-engineering/skills/agent-browser/references/proxy-support.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/proxy-support.md
@@ -0,0 +1,194 @@
+# Proxy Support
+
+Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments.
+
+**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Proxy Configuration](#basic-proxy-configuration)
+- [Authenticated Proxy](#authenticated-proxy)
+- [SOCKS Proxy](#socks-proxy)
+- [Proxy Bypass](#proxy-bypass)
+- [Common Use Cases](#common-use-cases)
+- [Verifying Proxy Connection](#verifying-proxy-connection)
+- [Troubleshooting](#troubleshooting)
+- [Best Practices](#best-practices)
+
+## Basic Proxy Configuration
+
+Use the `--proxy` flag or set proxy via environment variable:
+
+```bash
+# Via CLI flag
+agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
+
+# Via environment variable
+export HTTP_PROXY="http://proxy.example.com:8080"
+agent-browser open https://example.com
+
+# HTTPS proxy
+export HTTPS_PROXY="https://proxy.example.com:8080"
+agent-browser open https://example.com
+
+# Both
+export HTTP_PROXY="http://proxy.example.com:8080"
+export HTTPS_PROXY="http://proxy.example.com:8080"
+agent-browser open https://example.com
+```
+
+## Authenticated Proxy
+
+For proxies requiring authentication:
+
+```bash
+# Include credentials in URL
+export HTTP_PROXY="http://username:password@proxy.example.com:8080"
+agent-browser open https://example.com
+```
+
+## SOCKS Proxy
+
+```bash
+# SOCKS5 proxy
+export ALL_PROXY="socks5://proxy.example.com:1080"
+agent-browser open https://example.com
+
+# SOCKS5 with auth
+export ALL_PROXY="socks5://user:pass@proxy.example.com:1080"
+agent-browser open https://example.com
+```
+
+## Proxy Bypass
+
+Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`:
+
+```bash
+# Via CLI flag
+agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com
+
+# Via environment variable
+export NO_PROXY="localhost,127.0.0.1,.internal.company.com"
+agent-browser open https://internal.company.com  # Direct connection
+agent-browser open https://external.com          # Via proxy
+```
+
+## Common Use Cases
+
+### Geo-Location Testing
+
+```bash
+#!/bin/bash
+# Test site from different regions using geo-located proxies
+
+PROXIES=(
+    "http://us-proxy.example.com:8080"
+    "http://eu-proxy.example.com:8080"
+    "http://asia-proxy.example.com:8080"
+)
+
+for proxy in "${PROXIES[@]}"; do
+    export HTTP_PROXY="$proxy"
+    export HTTPS_PROXY="$proxy"
+
+    region=$(echo "$proxy" | grep -oP '^\w+-\w+')
+    echo "Testing from: $region"
+
+    agent-browser --session "$region" open https://example.com
+    agent-browser --session "$region" screenshot "./screenshots/$region.png"
+    agent-browser --session "$region" close
+done
+```
+
+### Rotating Proxies for Scraping
+
+```bash
+#!/bin/bash
+# Rotate through proxy list to avoid rate limiting
+
+PROXY_LIST=(
+    "http://proxy1.example.com:8080"
+    "http://proxy2.example.com:8080"
+    "http://proxy3.example.com:8080"
+)
+
+URLS=(
+    "https://site.com/page1"
+    "https://site.com/page2"
+    "https://site.com/page3"
+)
+
+for i in "${!URLS[@]}"; do
+    proxy_index=$((i % ${#PROXY_LIST[@]}))
+    export HTTP_PROXY="${PROXY_LIST[$proxy_index]}"
+    export HTTPS_PROXY="${PROXY_LIST[$proxy_index]}"
+
+    agent-browser open "${URLS[$i]}"
+    agent-browser get text body > "output-$i.txt"
+    agent-browser close
+
+    sleep 1  # Polite delay
+done
+```
+
+### Corporate Network Access
+
+```bash
+#!/bin/bash
+# Access internal sites via corporate proxy
+
+export HTTP_PROXY="http://corpproxy.company.com:8080"
+export HTTPS_PROXY="http://corpproxy.company.com:8080"
+export NO_PROXY="localhost,127.0.0.1,.company.com"
+
+# External sites go through proxy
+agent-browser open https://external-vendor.com
+
+# Internal sites bypass proxy
+agent-browser open https://intranet.company.com
+```
+
+## Verifying Proxy Connection
+
+```bash
+# Check your apparent IP
+agent-browser open https://httpbin.org/ip
+agent-browser get text body
+# Should show proxy's IP, not your real IP
+```
+
+## Troubleshooting
+
+### Proxy Connection Failed
+
+```bash
+# Test proxy connectivity first
+curl -x http://proxy.example.com:8080 https://httpbin.org/ip
+
+# Check if proxy requires auth
+export HTTP_PROXY="http://user:pass@proxy.example.com:8080"
+```
+
+### SSL/TLS Errors Through Proxy
+
+Some proxies perform SSL inspection. If you encounter certificate errors:
+
+```bash
+# For testing only - not recommended for production
+agent-browser open https://example.com --ignore-https-errors
+```
+
+### Slow Performance
+
+```bash
+# Use proxy only when necessary
+export NO_PROXY="*.cdn.com,*.static.com"  # Direct CDN access
+```
+
+## Best Practices
+
+1. **Use environment variables** - Don't hardcode proxy credentials
+2. **Set NO_PROXY appropriately** - Avoid routing local traffic through proxy
+3. **Test proxy before automation** - Verify connectivity with simple requests
+4. **Handle proxy failures gracefully** - Implement retry logic for unstable proxies
+5. **Rotate proxies for large scraping jobs** - Distribute load and avoid bans
--- a/plugins/compound-engineering/skills/agent-browser/references/session-management.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/session-management.md
@@ -0,0 +1,193 @@
+# Session Management
+
+Multiple isolated browser sessions with state persistence and concurrent browsing.
+
+**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Named Sessions](#named-sessions)
+- [Session Isolation Properties](#session-isolation-properties)
+- [Session State Persistence](#session-state-persistence)
+- [Common Patterns](#common-patterns)
+- [Default Session](#default-session)
+- [Session Cleanup](#session-cleanup)
+- [Best Practices](#best-practices)
+
+## Named Sessions
+
+Use `--session` flag to isolate browser contexts:
+
+```bash
+# Session 1: Authentication flow
+agent-browser --session auth open https://app.example.com/login
+
+# Session 2: Public browsing (separate cookies, storage)
+agent-browser --session public open https://example.com
+
+# Commands are isolated by session
+agent-browser --session auth fill @e1 "user@example.com"
+agent-browser --session public get text body
+```
+
+## Session Isolation Properties
+
+Each session has independent:
+- Cookies
+- LocalStorage / SessionStorage
+- IndexedDB
+- Cache
+- Browsing history
+- Open tabs
+
+## Session State Persistence
+
+### Save Session State
+
+```bash
+# Save cookies, storage, and auth state
+agent-browser state save /path/to/auth-state.json
+```
+
+### Load Session State
+
+```bash
+# Restore saved state
+agent-browser state load /path/to/auth-state.json
+
+# Continue with authenticated session
+agent-browser open https://app.example.com/dashboard
+```
+
+### State File Contents
+
+```json
+{
+  "cookies": [...],
+  "localStorage": {...},
+  "sessionStorage": {...},
+  "origins": [...]
+}
+```
+
+## Common Patterns
+
+### Authenticated Session Reuse
+
+```bash
+#!/bin/bash
+# Save login state once, reuse many times
+
+STATE_FILE="/tmp/auth-state.json"
+
+# Check if we have saved state
+if [[ -f "$STATE_FILE" ]]; then
+    agent-browser state load "$STATE_FILE"
+    agent-browser open https://app.example.com/dashboard
+else
+    # Perform login
+    agent-browser open https://app.example.com/login
+    agent-browser snapshot -i
+    agent-browser fill @e1 "$USERNAME"
+    agent-browser fill @e2 "$PASSWORD"
+    agent-browser click @e3
+    agent-browser wait --load networkidle
+
+    # Save for future use
+    agent-browser state save "$STATE_FILE"
+fi
+```
+
+### Concurrent Scraping
+
+```bash
+#!/bin/bash
+# Scrape multiple sites concurrently
+
+# Start all sessions
+agent-browser --session site1 open https://site1.com &
+agent-browser --session site2 open https://site2.com &
+agent-browser --session site3 open https://site3.com &
+wait
+
+# Extract from each
+agent-browser --session site1 get text body > site1.txt
+agent-browser --session site2 get text body > site2.txt
+agent-browser --session site3 get text body > site3.txt
+
+# Cleanup
+agent-browser --session site1 close
+agent-browser --session site2 close
+agent-browser --session site3 close
+```
+
+### A/B Testing Sessions
+
+```bash
+# Test different user experiences
+agent-browser --session variant-a open "https://app.com?variant=a"
+agent-browser --session variant-b open "https://app.com?variant=b"
+
+# Compare
+agent-browser --session variant-a screenshot /tmp/variant-a.png
+agent-browser --session variant-b screenshot /tmp/variant-b.png
+```
+
+## Default Session
+
+When `--session` is omitted, commands use the default session:
+
+```bash
+# These use the same default session
+agent-browser open https://example.com
+agent-browser snapshot -i
+agent-browser close  # Closes default session
+```
+
+## Session Cleanup
+
+```bash
+# Close specific session
+agent-browser --session auth close
+
+# List active sessions
+agent-browser session list
+```
+
+## Best Practices
+
+### 1. Name Sessions Semantically
+
+```bash
+# GOOD: Clear purpose
+agent-browser --session github-auth open https://github.com
+agent-browser --session docs-scrape open https://docs.example.com
+
+# AVOID: Generic names
+agent-browser --session s1 open https://github.com
+```
+
+### 2. Always Clean Up
+
+```bash
+# Close sessions when done
+agent-browser --session auth close
+agent-browser --session scrape close
+```
+
+### 3. Handle State Files Securely
+
+```bash
+# Don't commit state files (contain auth tokens!)
+echo "*.auth-state.json" >> .gitignore
+
+# Delete after use
+rm /tmp/auth-state.json
+```
+
+### 4. Timeout Long Sessions
+
+```bash
+# Set timeout for automated scripts
+timeout 60 agent-browser --session long-task get text body
+```
--- a/plugins/compound-engineering/skills/agent-browser/references/snapshot-refs.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/snapshot-refs.md
@@ -0,0 +1,194 @@
+# Snapshot and Refs
+
+Compact element references that reduce context usage dramatically for AI agents.
+
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [How Refs Work](#how-refs-work)
+- [Snapshot Command](#the-snapshot-command)
+- [Using Refs](#using-refs)
+- [Ref Lifecycle](#ref-lifecycle)
+- [Best Practices](#best-practices)
+- [Ref Notation Details](#ref-notation-details)
+- [Troubleshooting](#troubleshooting)
+
+## How Refs Work
+
+Traditional approach:
+```
+Full DOM/HTML -> AI parses -> CSS selector -> Action (~3000-5000 tokens)
+```
+
+agent-browser approach:
+```
+Compact snapshot -> @refs assigned -> Direct interaction (~200-400 tokens)
+```
+
+## The Snapshot Command
+
+```bash
+# Basic snapshot (shows page structure)
+agent-browser snapshot
+
+# Interactive snapshot (-i flag) - RECOMMENDED
+agent-browser snapshot -i
+```
+
+### Snapshot Output Format
+
+```
+Page: Example Site - Home
+URL: https://example.com
+
+@e1 [header]
+  @e2 [nav]
+    @e3 [a] "Home"
+    @e4 [a] "Products"
+    @e5 [a] "About"
+  @e6 [button] "Sign In"
+
+@e7 [main]
+  @e8 [h1] "Welcome"
+  @e9 [form]
+    @e10 [input type="email"] placeholder="Email"
+    @e11 [input type="password"] placeholder="Password"
+    @e12 [button type="submit"] "Log In"
+
+@e13 [footer]
+  @e14 [a] "Privacy Policy"
+```
+
+## Using Refs
+
+Once you have refs, interact directly:
+
+```bash
+# Click the "Sign In" button
+agent-browser click @e6
+
+# Fill email input
+agent-browser fill @e10 "user@example.com"
+
+# Fill password
+agent-browser fill @e11 "password123"
+
+# Submit the form
+agent-browser click @e12
+```
+
+## Ref Lifecycle
+
+**IMPORTANT**: Refs are invalidated when the page changes!
+
+```bash
+# Get initial snapshot
+agent-browser snapshot -i
+# @e1 [button] "Next"
+
+# Click triggers page change
+agent-browser click @e1
+
+# MUST re-snapshot to get new refs!
+agent-browser snapshot -i
+# @e1 [h1] "Page 2"  <- Different element now!
+```
+
+## Best Practices
+
+### 1. Always Snapshot Before Interacting
+
+```bash
+# CORRECT
+agent-browser open https://example.com
+agent-browser snapshot -i          # Get refs first
+agent-browser click @e1            # Use ref
+
+# WRONG
+agent-browser open https://example.com
+agent-browser click @e1            # Ref doesn't exist yet!
+```
+
+### 2. Re-Snapshot After Navigation
+
+```bash
+agent-browser click @e5            # Navigates to new page
+agent-browser snapshot -i          # Get new refs
+agent-browser click @e1            # Use new refs
+```
+
+### 3. Re-Snapshot After Dynamic Changes
+
+```bash
+agent-browser click @e1            # Opens dropdown
+agent-browser snapshot -i          # See dropdown items
+agent-browser click @e7            # Select item
+```
+
+### 4. Snapshot Specific Regions
+
+For complex pages, snapshot specific areas:
+
+```bash
+# Snapshot just the form
+agent-browser snapshot @e9
+```
+
+## Ref Notation Details
+
+```
+@e1 [tag type="value"] "text content" placeholder="hint"
+|    |   |             |               |
+|    |   |             |               +- Additional attributes
+|    |   |             +- Visible text
+|    |   +- Key attributes shown
+|    +- HTML tag name
+- Unique ref ID
+```
+
+### Common Patterns
+
+```
+@e1 [button] "Submit"                    # Button with text
+@e2 [input type="email"]                 # Email input
+@e3 [input type="password"]              # Password input
+@e4 [a href="/page"] "Link Text"         # Anchor link
+@e5 [select]                             # Dropdown
+@e6 [textarea] placeholder="Message"     # Text area
+@e7 [div class="modal"]                  # Container (when relevant)
+@e8 [img alt="Logo"]                     # Image
+@e9 [checkbox] checked                   # Checked checkbox
+@e10 [radio] selected                    # Selected radio
+```
+
+## Troubleshooting
+
+### "Ref not found" Error
+
+```bash
+# Ref may have changed - re-snapshot
+agent-browser snapshot -i
+```
+
+### Element Not Visible in Snapshot
+
+```bash
+# Scroll down to reveal element
+agent-browser scroll down 1000
+agent-browser snapshot -i
+
+# Or wait for dynamic content
+agent-browser wait 1000
+agent-browser snapshot -i
+```
+
+### Too Many Elements
+
+```bash
+# Snapshot specific container
+agent-browser snapshot @e5
+
+# Or use get text for content-only extraction
+agent-browser get text @e5
+```
--- a/plugins/compound-engineering/skills/agent-browser/references/video-recording.md
+++ b/plugins/compound-engineering/skills/agent-browser/references/video-recording.md
@@ -0,0 +1,173 @@
+# Video Recording
+
+Capture browser automation as video for debugging, documentation, or verification.
+
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Recording](#basic-recording)
+- [Recording Commands](#recording-commands)
+- [Use Cases](#use-cases)
+- [Best Practices](#best-practices)
+- [Output Format](#output-format)
+- [Limitations](#limitations)
+
+## Basic Recording
+
+```bash
+# Start recording
+agent-browser record start ./demo.webm
+
+# Perform actions
+agent-browser open https://example.com
+agent-browser snapshot -i
+agent-browser click @e1
+agent-browser fill @e2 "test input"
+
+# Stop and save
+agent-browser record stop
+```
+
+## Recording Commands
+
+```bash
+# Start recording to file
+agent-browser record start ./output.webm
+
+# Stop current recording
+agent-browser record stop
+
+# Restart with new file (stops current + starts new)
+agent-browser record restart ./take2.webm
+```
+
+## Use Cases
+
+### Debugging Failed Automation
+
+```bash
+#!/bin/bash
+# Record automation for debugging
+
+agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
+
+# Run your automation
+agent-browser open https://app.example.com
+agent-browser snapshot -i
+agent-browser click @e1 || {
+    echo "Click failed - check recording"
+    agent-browser record stop
+    exit 1
+}
+
+agent-browser record stop
+```
+
+### Documentation Generation
+
+```bash
+#!/bin/bash
+# Record workflow for documentation
+
+agent-browser record start ./docs/how-to-login.webm
+
+agent-browser open https://app.example.com/login
+agent-browser wait 1000  # Pause for visibility
+
+agent-browser snapshot -i
+agent-browser fill @e1 "demo@example.com"
+agent-browser wait 500
+
+agent-browser fill @e2 "password"
+agent-browser wait 500
+
+agent-browser click @e3
+agent-browser wait --load networkidle
+agent-browser wait 1000  # Show result
+
+agent-browser record stop
+```
+
+### CI/CD Test Evidence
+
+```bash
+#!/bin/bash
+# Record E2E test runs for CI artifacts
+
+TEST_NAME="${1:-e2e-test}"
+RECORDING_DIR="./test-recordings"
+mkdir -p "$RECORDING_DIR"
+
+agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
+
+# Run test
+if run_e2e_test; then
+    echo "Test passed"
+else
+    echo "Test failed - recording saved"
+fi
+
+agent-browser record stop
+```
+
+## Best Practices
+
+### 1. Add Pauses for Clarity
+
+```bash
+# Slow down for human viewing
+agent-browser click @e1
+agent-browser wait 500  # Let viewer see result
+```
+
+### 2. Use Descriptive Filenames
+
+```bash
+# Include context in filename
+agent-browser record start ./recordings/login-flow-2024-01-15.webm
+agent-browser record start ./recordings/checkout-test-run-42.webm
+```
+
+### 3. Handle Recording in Error Cases
+
+```bash
+#!/bin/bash
+set -e
+
+cleanup() {
+    agent-browser record stop 2>/dev/null || true
+    agent-browser close 2>/dev/null || true
+}
+trap cleanup EXIT
+
+agent-browser record start ./automation.webm
+# ... automation steps ...
+```
+
+### 4. Combine with Screenshots
+
+```bash
+# Record video AND capture key frames
+agent-browser record start ./flow.webm
+
+agent-browser open https://example.com
+agent-browser screenshot ./screenshots/step1-homepage.png
+
+agent-browser click @e1
+agent-browser screenshot ./screenshots/step2-after-click.png
+
+agent-browser record stop
+```
+
+## Output Format
+
+- Default format: WebM (VP8/VP9 codec)
+- Compatible with all modern browsers and video players
+- Compressed but high quality
+
+## Limitations
+
+- Recording adds slight overhead to automation
+- Large recordings can consume significant disk space
+- Some headless environments may have codec limitations