feat: sync agent-browser skill with upstream vercel-labs/agent-browser
Update SKILL.md to match the latest upstream skill from vercel-labs/agent-browser, adding substantial new capabilities: - Authentication (auth vault, profiles, session persistence, state files) - Command chaining, annotated screenshots, diffing - Security features (content boundaries, domain allowlist, action policy) - iOS Simulator support, Lightpanda engine, downloads, clipboard - JS eval improvements (--stdin, -b for shell safety) - Timeout guidance, config files, session cleanup Add 7 reference docs (commands, authentication, snapshot-refs, session-management, video-recording, profiling, proxy-support) and 3 ready-to-use shell templates. Kept our YAML frontmatter, setup check section, and Playwright MCP comparison table which are unique to our plugin context.
This commit is contained in:
@@ -0,0 +1,303 @@
|
||||
# Authentication Patterns
|
||||
|
||||
Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
|
||||
|
||||
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [Import Auth from Your Browser](#import-auth-from-your-browser)
|
||||
- [Persistent Profiles](#persistent-profiles)
|
||||
- [Session Persistence](#session-persistence)
|
||||
- [Basic Login Flow](#basic-login-flow)
|
||||
- [Saving Authentication State](#saving-authentication-state)
|
||||
- [Restoring Authentication](#restoring-authentication)
|
||||
- [OAuth / SSO Flows](#oauth--sso-flows)
|
||||
- [Two-Factor Authentication](#two-factor-authentication)
|
||||
- [HTTP Basic Auth](#http-basic-auth)
|
||||
- [Cookie-Based Auth](#cookie-based-auth)
|
||||
- [Token Refresh Handling](#token-refresh-handling)
|
||||
- [Security Best Practices](#security-best-practices)
|
||||
|
||||
## Import Auth from Your Browser
|
||||
|
||||
The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into.
|
||||
|
||||
**Step 1: Start Chrome with remote debugging**
|
||||
|
||||
```bash
|
||||
# macOS
|
||||
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
|
||||
|
||||
# Linux
|
||||
google-chrome --remote-debugging-port=9222
|
||||
|
||||
# Windows
|
||||
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
|
||||
```
|
||||
|
||||
Log in to your target site(s) in this Chrome window as you normally would.
|
||||
|
||||
> **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done.
|
||||
|
||||
**Step 2: Grab the auth state**
|
||||
|
||||
```bash
|
||||
# Auto-discover the running Chrome and save its cookies + localStorage
|
||||
agent-browser --auto-connect state save ./my-auth.json
|
||||
```
|
||||
|
||||
**Step 3: Reuse in automation**
|
||||
|
||||
```bash
|
||||
# Load auth at launch
|
||||
agent-browser --state ./my-auth.json open https://app.example.com/dashboard
|
||||
|
||||
# Or load into an existing session
|
||||
agent-browser state load ./my-auth.json
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies.
|
||||
|
||||
> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices).
|
||||
|
||||
**Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts:
|
||||
|
||||
```bash
|
||||
agent-browser --session-name myapp state load ./my-auth.json
|
||||
# From now on, state is auto-saved/restored for "myapp"
|
||||
```
|
||||
|
||||
## Persistent Profiles
|
||||
|
||||
Use `--profile` to point agent-browser at a Chrome user data directory. This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load:
|
||||
|
||||
```bash
|
||||
# First run: login once
|
||||
agent-browser --profile ~/.myapp-profile open https://app.example.com/login
|
||||
# ... complete login flow ...
|
||||
|
||||
# All subsequent runs: already authenticated
|
||||
agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
Use different paths for different projects or test users:
|
||||
|
||||
```bash
|
||||
agent-browser --profile ~/.profiles/admin open https://app.example.com
|
||||
agent-browser --profile ~/.profiles/viewer open https://app.example.com
|
||||
```
|
||||
|
||||
Or set via environment variable:
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_PROFILE=~/.myapp-profile
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
## Session Persistence
|
||||
|
||||
Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files:
|
||||
|
||||
```bash
|
||||
# Auto-saves state on close, auto-restores on next launch
|
||||
agent-browser --session-name twitter open https://twitter.com
|
||||
# ... login flow ...
|
||||
agent-browser close # state saved to ~/.agent-browser/sessions/
|
||||
|
||||
# Next time: state is automatically restored
|
||||
agent-browser --session-name twitter open https://twitter.com
|
||||
```
|
||||
|
||||
Encrypt state at rest:
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
|
||||
agent-browser --session-name secure open https://app.example.com
|
||||
```
|
||||
|
||||
## Basic Login Flow
|
||||
|
||||
```bash
|
||||
# Navigate to login page
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser wait --load networkidle
|
||||
|
||||
# Get form elements
|
||||
agent-browser snapshot -i
|
||||
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In"
|
||||
|
||||
# Fill credentials
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
|
||||
# Submit
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
|
||||
# Verify login succeeded
|
||||
agent-browser get url # Should be dashboard, not login
|
||||
```
|
||||
|
||||
## Saving Authentication State
|
||||
|
||||
After logging in, save state for reuse:
|
||||
|
||||
```bash
|
||||
# Login first (see above)
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --url "**/dashboard"
|
||||
|
||||
# Save authenticated state
|
||||
agent-browser state save ./auth-state.json
|
||||
```
|
||||
|
||||
## Restoring Authentication
|
||||
|
||||
Skip login by loading saved state:
|
||||
|
||||
```bash
|
||||
# Load saved auth state
|
||||
agent-browser state load ./auth-state.json
|
||||
|
||||
# Navigate directly to protected page
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
|
||||
# Verify authenticated
|
||||
agent-browser snapshot -i
|
||||
```
|
||||
|
||||
## OAuth / SSO Flows
|
||||
|
||||
For OAuth redirects:
|
||||
|
||||
```bash
|
||||
# Start OAuth flow
|
||||
agent-browser open https://app.example.com/auth/google
|
||||
|
||||
# Handle redirects automatically
|
||||
agent-browser wait --url "**/accounts.google.com**"
|
||||
agent-browser snapshot -i
|
||||
|
||||
# Fill Google credentials
|
||||
agent-browser fill @e1 "user@gmail.com"
|
||||
agent-browser click @e2 # Next button
|
||||
agent-browser wait 2000
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e3 "password"
|
||||
agent-browser click @e4 # Sign in
|
||||
|
||||
# Wait for redirect back
|
||||
agent-browser wait --url "**/app.example.com**"
|
||||
agent-browser state save ./oauth-state.json
|
||||
```
|
||||
|
||||
## Two-Factor Authentication
|
||||
|
||||
Handle 2FA with manual intervention:
|
||||
|
||||
```bash
|
||||
# Login with credentials
|
||||
agent-browser open https://app.example.com/login --headed # Show browser
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
agent-browser click @e3
|
||||
|
||||
# Wait for user to complete 2FA manually
|
||||
echo "Complete 2FA in the browser window..."
|
||||
agent-browser wait --url "**/dashboard" --timeout 120000
|
||||
|
||||
# Save state after 2FA
|
||||
agent-browser state save ./2fa-state.json
|
||||
```
|
||||
|
||||
## HTTP Basic Auth
|
||||
|
||||
For sites using HTTP Basic Authentication:
|
||||
|
||||
```bash
|
||||
# Set credentials before navigation
|
||||
agent-browser set credentials username password
|
||||
|
||||
# Navigate to protected resource
|
||||
agent-browser open https://protected.example.com/api
|
||||
```
|
||||
|
||||
## Cookie-Based Auth
|
||||
|
||||
Manually set authentication cookies:
|
||||
|
||||
```bash
|
||||
# Set auth cookie
|
||||
agent-browser cookies set session_token "abc123xyz"
|
||||
|
||||
# Navigate to protected page
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
## Token Refresh Handling
|
||||
|
||||
For sessions with expiring tokens:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Wrapper that handles token refresh
|
||||
|
||||
STATE_FILE="./auth-state.json"
|
||||
|
||||
# Try loading existing state
|
||||
if [[ -f "$STATE_FILE" ]]; then
|
||||
agent-browser state load "$STATE_FILE"
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
|
||||
# Check if session is still valid
|
||||
URL=$(agent-browser get url)
|
||||
if [[ "$URL" == *"/login"* ]]; then
|
||||
echo "Session expired, re-authenticating..."
|
||||
# Perform fresh login
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "$USERNAME"
|
||||
agent-browser fill @e2 "$PASSWORD"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --url "**/dashboard"
|
||||
agent-browser state save "$STATE_FILE"
|
||||
fi
|
||||
else
|
||||
# First-time login
|
||||
agent-browser open https://app.example.com/login
|
||||
# ... login flow ...
|
||||
fi
|
||||
```
|
||||
|
||||
## Security Best Practices
|
||||
|
||||
1. **Never commit state files** - They contain session tokens
|
||||
```bash
|
||||
echo "*.auth-state.json" >> .gitignore
|
||||
```
|
||||
|
||||
2. **Use environment variables for credentials**
|
||||
```bash
|
||||
agent-browser fill @e1 "$APP_USERNAME"
|
||||
agent-browser fill @e2 "$APP_PASSWORD"
|
||||
```
|
||||
|
||||
3. **Clean up after automation**
|
||||
```bash
|
||||
agent-browser cookies clear
|
||||
rm -f ./auth-state.json
|
||||
```
|
||||
|
||||
4. **Use short-lived sessions for CI/CD**
|
||||
```bash
|
||||
# Don't persist state in CI
|
||||
agent-browser open https://app.example.com/login
|
||||
# ... login and perform actions ...
|
||||
agent-browser close # Session ends, nothing persisted
|
||||
```
|
||||
@@ -0,0 +1,266 @@
|
||||
# Command Reference
|
||||
|
||||
Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
|
||||
|
||||
## Navigation
|
||||
|
||||
```bash
|
||||
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
|
||||
# Supports: https://, http://, file://, about:, data://
|
||||
# Auto-prepends https:// if no protocol given
|
||||
agent-browser back # Go back
|
||||
agent-browser forward # Go forward
|
||||
agent-browser reload # Reload page
|
||||
agent-browser close # Close browser (aliases: quit, exit)
|
||||
agent-browser connect 9222 # Connect to browser via CDP port
|
||||
```
|
||||
|
||||
## Snapshot (page analysis)
|
||||
|
||||
```bash
|
||||
agent-browser snapshot # Full accessibility tree
|
||||
agent-browser snapshot -i # Interactive elements only (recommended)
|
||||
agent-browser snapshot -c # Compact output
|
||||
agent-browser snapshot -d 3 # Limit depth to 3
|
||||
agent-browser snapshot -s "#main" # Scope to CSS selector
|
||||
```
|
||||
|
||||
## Interactions (use @refs from snapshot)
|
||||
|
||||
```bash
|
||||
agent-browser click @e1 # Click
|
||||
agent-browser click @e1 --new-tab # Click and open in new tab
|
||||
agent-browser dblclick @e1 # Double-click
|
||||
agent-browser focus @e1 # Focus element
|
||||
agent-browser fill @e2 "text" # Clear and type
|
||||
agent-browser type @e2 "text" # Type without clearing
|
||||
agent-browser press Enter # Press key (alias: key)
|
||||
agent-browser press Control+a # Key combination
|
||||
agent-browser keydown Shift # Hold key down
|
||||
agent-browser keyup Shift # Release key
|
||||
agent-browser hover @e1 # Hover
|
||||
agent-browser check @e1 # Check checkbox
|
||||
agent-browser uncheck @e1 # Uncheck checkbox
|
||||
agent-browser select @e1 "value" # Select dropdown option
|
||||
agent-browser select @e1 "a" "b" # Select multiple options
|
||||
agent-browser scroll down 500 # Scroll page (default: down 300px)
|
||||
agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
|
||||
agent-browser drag @e1 @e2 # Drag and drop
|
||||
agent-browser upload @e1 file.pdf # Upload files
|
||||
```
|
||||
|
||||
## Get Information
|
||||
|
||||
```bash
|
||||
agent-browser get text @e1 # Get element text
|
||||
agent-browser get html @e1 # Get innerHTML
|
||||
agent-browser get value @e1 # Get input value
|
||||
agent-browser get attr @e1 href # Get attribute
|
||||
agent-browser get title # Get page title
|
||||
agent-browser get url # Get current URL
|
||||
agent-browser get cdp-url # Get CDP WebSocket URL
|
||||
agent-browser get count ".item" # Count matching elements
|
||||
agent-browser get box @e1 # Get bounding box
|
||||
agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
|
||||
```
|
||||
|
||||
## Check State
|
||||
|
||||
```bash
|
||||
agent-browser is visible @e1 # Check if visible
|
||||
agent-browser is enabled @e1 # Check if enabled
|
||||
agent-browser is checked @e1 # Check if checked
|
||||
```
|
||||
|
||||
## Screenshots and PDF
|
||||
|
||||
```bash
|
||||
agent-browser screenshot # Save to temporary directory
|
||||
agent-browser screenshot path.png # Save to specific path
|
||||
agent-browser screenshot --full # Full page
|
||||
agent-browser pdf output.pdf # Save as PDF
|
||||
```
|
||||
|
||||
## Video Recording
|
||||
|
||||
```bash
|
||||
agent-browser record start ./demo.webm # Start recording
|
||||
agent-browser click @e1 # Perform actions
|
||||
agent-browser record stop # Stop and save video
|
||||
agent-browser record restart ./take2.webm # Stop current + start new
|
||||
```
|
||||
|
||||
## Wait
|
||||
|
||||
```bash
|
||||
agent-browser wait @e1 # Wait for element
|
||||
agent-browser wait 2000 # Wait milliseconds
|
||||
agent-browser wait --text "Success" # Wait for text (or -t)
|
||||
agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
|
||||
agent-browser wait --load networkidle # Wait for network idle (or -l)
|
||||
agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
|
||||
```
|
||||
|
||||
## Mouse Control
|
||||
|
||||
```bash
|
||||
agent-browser mouse move 100 200 # Move mouse
|
||||
agent-browser mouse down left # Press button
|
||||
agent-browser mouse up left # Release button
|
||||
agent-browser mouse wheel 100 # Scroll wheel
|
||||
```
|
||||
|
||||
## Semantic Locators (alternative to refs)
|
||||
|
||||
```bash
|
||||
agent-browser find role button click --name "Submit"
|
||||
agent-browser find text "Sign In" click
|
||||
agent-browser find text "Sign In" click --exact # Exact match only
|
||||
agent-browser find label "Email" fill "user@test.com"
|
||||
agent-browser find placeholder "Search" type "query"
|
||||
agent-browser find alt "Logo" click
|
||||
agent-browser find title "Close" click
|
||||
agent-browser find testid "submit-btn" click
|
||||
agent-browser find first ".item" click
|
||||
agent-browser find last ".item" click
|
||||
agent-browser find nth 2 "a" hover
|
||||
```
|
||||
|
||||
## Browser Settings
|
||||
|
||||
```bash
|
||||
agent-browser set viewport 1920 1080 # Set viewport size
|
||||
agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
|
||||
agent-browser set device "iPhone 14" # Emulate device
|
||||
agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
|
||||
agent-browser set offline on # Toggle offline mode
|
||||
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
|
||||
agent-browser set credentials user pass # HTTP basic auth (alias: auth)
|
||||
agent-browser set media dark # Emulate color scheme
|
||||
agent-browser set media light reduced-motion # Light mode + reduced motion
|
||||
```
|
||||
|
||||
## Cookies and Storage
|
||||
|
||||
```bash
|
||||
agent-browser cookies # Get all cookies
|
||||
agent-browser cookies set name value # Set cookie
|
||||
agent-browser cookies clear # Clear cookies
|
||||
agent-browser storage local # Get all localStorage
|
||||
agent-browser storage local key # Get specific key
|
||||
agent-browser storage local set k v # Set value
|
||||
agent-browser storage local clear # Clear all
|
||||
```
|
||||
|
||||
## Network
|
||||
|
||||
```bash
|
||||
agent-browser network route <url> # Intercept requests
|
||||
agent-browser network route <url> --abort # Block requests
|
||||
agent-browser network route <url> --body '{}' # Mock response
|
||||
agent-browser network unroute [url] # Remove routes
|
||||
agent-browser network requests # View tracked requests
|
||||
agent-browser network requests --filter api # Filter requests
|
||||
```
|
||||
|
||||
## Tabs and Windows
|
||||
|
||||
```bash
|
||||
agent-browser tab # List tabs
|
||||
agent-browser tab new [url] # New tab
|
||||
agent-browser tab 2 # Switch to tab by index
|
||||
agent-browser tab close # Close current tab
|
||||
agent-browser tab close 2 # Close tab by index
|
||||
agent-browser window new # New window
|
||||
```
|
||||
|
||||
## Frames
|
||||
|
||||
```bash
|
||||
agent-browser frame "#iframe" # Switch to iframe
|
||||
agent-browser frame main # Back to main frame
|
||||
```
|
||||
|
||||
## Dialogs
|
||||
|
||||
```bash
|
||||
agent-browser dialog accept [text] # Accept dialog
|
||||
agent-browser dialog dismiss # Dismiss dialog
|
||||
```
|
||||
|
||||
## JavaScript
|
||||
|
||||
```bash
|
||||
agent-browser eval "document.title" # Simple expressions only
|
||||
agent-browser eval -b "<base64>" # Any JavaScript (base64 encoded)
|
||||
agent-browser eval --stdin # Read script from stdin
|
||||
```
|
||||
|
||||
Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
|
||||
|
||||
```bash
|
||||
# Base64 encode your script, then:
|
||||
agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
|
||||
|
||||
# Or use stdin with heredoc for multiline scripts:
|
||||
cat <<'EOF' | agent-browser eval --stdin
|
||||
const links = document.querySelectorAll('a');
|
||||
Array.from(links).map(a => a.href);
|
||||
EOF
|
||||
```
|
||||
|
||||
## State Management
|
||||
|
||||
```bash
|
||||
agent-browser state save auth.json # Save cookies, storage, auth state
|
||||
agent-browser state load auth.json # Restore saved state
|
||||
```
|
||||
|
||||
## Global Options
|
||||
|
||||
```bash
|
||||
agent-browser --session <name> ... # Isolated browser session
|
||||
agent-browser --json ... # JSON output for parsing
|
||||
agent-browser --headed ... # Show browser window (not headless)
|
||||
agent-browser --full ... # Full page screenshot (-f)
|
||||
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
|
||||
agent-browser -p <provider> ... # Cloud browser provider (--provider)
|
||||
agent-browser --proxy <url> ... # Use proxy server
|
||||
agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
|
||||
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
|
||||
agent-browser --executable-path <p> # Custom browser executable
|
||||
agent-browser --extension <path> ... # Load browser extension (repeatable)
|
||||
agent-browser --ignore-https-errors # Ignore SSL certificate errors
|
||||
agent-browser --help # Show help (-h)
|
||||
agent-browser --version # Show version (-V)
|
||||
agent-browser <command> --help # Show detailed help for a command
|
||||
```
|
||||
|
||||
## Debugging
|
||||
|
||||
```bash
|
||||
agent-browser --headed open example.com # Show browser window
|
||||
agent-browser --cdp 9222 snapshot # Connect via CDP port
|
||||
agent-browser connect 9222 # Alternative: connect command
|
||||
agent-browser console # View console messages
|
||||
agent-browser console --clear # Clear console
|
||||
agent-browser errors # View page errors
|
||||
agent-browser errors --clear # Clear errors
|
||||
agent-browser highlight @e1 # Highlight element
|
||||
agent-browser inspect # Open Chrome DevTools for this session
|
||||
agent-browser trace start # Start recording trace
|
||||
agent-browser trace stop trace.zip # Stop and save trace
|
||||
agent-browser profiler start # Start Chrome DevTools profiling
|
||||
agent-browser profiler stop trace.json # Stop and save profile
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
```bash
|
||||
AGENT_BROWSER_SESSION="mysession" # Default session name
|
||||
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
|
||||
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
|
||||
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
|
||||
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
|
||||
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
|
||||
```
|
||||
@@ -0,0 +1,120 @@
|
||||
# Profiling
|
||||
|
||||
Capture Chrome DevTools performance profiles during browser automation for performance analysis.
|
||||
|
||||
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [Basic Profiling](#basic-profiling)
|
||||
- [Profiler Commands](#profiler-commands)
|
||||
- [Categories](#categories)
|
||||
- [Use Cases](#use-cases)
|
||||
- [Output Format](#output-format)
|
||||
- [Viewing Profiles](#viewing-profiles)
|
||||
- [Limitations](#limitations)
|
||||
|
||||
## Basic Profiling
|
||||
|
||||
```bash
|
||||
# Start profiling
|
||||
agent-browser profiler start
|
||||
|
||||
# Perform actions
|
||||
agent-browser navigate https://example.com
|
||||
agent-browser click "#button"
|
||||
agent-browser wait 1000
|
||||
|
||||
# Stop and save
|
||||
agent-browser profiler stop ./trace.json
|
||||
```
|
||||
|
||||
## Profiler Commands
|
||||
|
||||
```bash
|
||||
# Start profiling with default categories
|
||||
agent-browser profiler start
|
||||
|
||||
# Start with custom trace categories
|
||||
agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing"
|
||||
|
||||
# Stop profiling and save to file
|
||||
agent-browser profiler stop ./trace.json
|
||||
```
|
||||
|
||||
## Categories
|
||||
|
||||
The `--categories` flag accepts a comma-separated list of Chrome trace categories. Default categories include:
|
||||
|
||||
- `devtools.timeline` -- standard DevTools performance traces
|
||||
- `v8.execute` -- time spent running JavaScript
|
||||
- `blink` -- renderer events
|
||||
- `blink.user_timing` -- `performance.mark()` / `performance.measure()` calls
|
||||
- `latencyInfo` -- input-to-latency tracking
|
||||
- `renderer.scheduler` -- task scheduling and execution
|
||||
- `toplevel` -- broad-spectrum basic events
|
||||
|
||||
Several `disabled-by-default-*` categories are also included for detailed timeline, call stack, and V8 CPU profiling data.
|
||||
|
||||
## Use Cases
|
||||
|
||||
### Diagnosing Slow Page Loads
|
||||
|
||||
```bash
|
||||
agent-browser profiler start
|
||||
agent-browser navigate https://app.example.com
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser profiler stop ./page-load-profile.json
|
||||
```
|
||||
|
||||
### Profiling User Interactions
|
||||
|
||||
```bash
|
||||
agent-browser navigate https://app.example.com
|
||||
agent-browser profiler start
|
||||
agent-browser click "#submit"
|
||||
agent-browser wait 2000
|
||||
agent-browser profiler stop ./interaction-profile.json
|
||||
```
|
||||
|
||||
### CI Performance Regression Checks
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
agent-browser profiler start
|
||||
agent-browser navigate https://app.example.com
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser profiler stop "./profiles/build-${BUILD_ID}.json"
|
||||
```
|
||||
|
||||
## Output Format
|
||||
|
||||
The output is a JSON file in Chrome Trace Event format:
|
||||
|
||||
```json
|
||||
{
|
||||
"traceEvents": [
|
||||
{ "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100 },
|
||||
...
|
||||
],
|
||||
"metadata": {
|
||||
"clock-domain": "LINUX_CLOCK_MONOTONIC"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `metadata.clock-domain` field is set based on the host platform (Linux or macOS). On Windows it is omitted.
|
||||
|
||||
## Viewing Profiles
|
||||
|
||||
Load the output JSON file in any of these tools:
|
||||
|
||||
- **Chrome DevTools**: Performance panel > Load profile (Ctrl+Shift+I > Performance)
|
||||
- **Perfetto UI**: https://ui.perfetto.dev/ -- drag and drop the JSON file
|
||||
- **Trace Viewer**: `chrome://tracing` in any Chromium browser
|
||||
|
||||
## Limitations
|
||||
|
||||
- Only works with Chromium-based browsers (Chrome, Edge). Not supported on Firefox or WebKit.
|
||||
- Trace data accumulates in memory while profiling is active (capped at 5 million events). Stop profiling promptly after the area of interest.
|
||||
- Data collection on stop has a 30-second timeout. If the browser is unresponsive, the stop command may fail.
|
||||
@@ -0,0 +1,194 @@
|
||||
# Proxy Support
|
||||
|
||||
Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments.
|
||||
|
||||
**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [Basic Proxy Configuration](#basic-proxy-configuration)
|
||||
- [Authenticated Proxy](#authenticated-proxy)
|
||||
- [SOCKS Proxy](#socks-proxy)
|
||||
- [Proxy Bypass](#proxy-bypass)
|
||||
- [Common Use Cases](#common-use-cases)
|
||||
- [Verifying Proxy Connection](#verifying-proxy-connection)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
- [Best Practices](#best-practices)
|
||||
|
||||
## Basic Proxy Configuration
|
||||
|
||||
Use the `--proxy` flag or set proxy via environment variable:
|
||||
|
||||
```bash
|
||||
# Via CLI flag
|
||||
agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
|
||||
|
||||
# Via environment variable
|
||||
export HTTP_PROXY="http://proxy.example.com:8080"
|
||||
agent-browser open https://example.com
|
||||
|
||||
# HTTPS proxy
|
||||
export HTTPS_PROXY="https://proxy.example.com:8080"
|
||||
agent-browser open https://example.com
|
||||
|
||||
# Both
|
||||
export HTTP_PROXY="http://proxy.example.com:8080"
|
||||
export HTTPS_PROXY="http://proxy.example.com:8080"
|
||||
agent-browser open https://example.com
|
||||
```
|
||||
|
||||
## Authenticated Proxy
|
||||
|
||||
For proxies requiring authentication:
|
||||
|
||||
```bash
|
||||
# Include credentials in URL
|
||||
export HTTP_PROXY="http://username:password@proxy.example.com:8080"
|
||||
agent-browser open https://example.com
|
||||
```
|
||||
|
||||
## SOCKS Proxy
|
||||
|
||||
```bash
|
||||
# SOCKS5 proxy
|
||||
export ALL_PROXY="socks5://proxy.example.com:1080"
|
||||
agent-browser open https://example.com
|
||||
|
||||
# SOCKS5 with auth
|
||||
export ALL_PROXY="socks5://user:pass@proxy.example.com:1080"
|
||||
agent-browser open https://example.com
|
||||
```
|
||||
|
||||
## Proxy Bypass
|
||||
|
||||
Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`:
|
||||
|
||||
```bash
|
||||
# Via CLI flag
|
||||
agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com
|
||||
|
||||
# Via environment variable
|
||||
export NO_PROXY="localhost,127.0.0.1,.internal.company.com"
|
||||
agent-browser open https://internal.company.com # Direct connection
|
||||
agent-browser open https://external.com # Via proxy
|
||||
```
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
### Geo-Location Testing
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Test site from different regions using geo-located proxies
|
||||
|
||||
PROXIES=(
|
||||
"http://us-proxy.example.com:8080"
|
||||
"http://eu-proxy.example.com:8080"
|
||||
"http://asia-proxy.example.com:8080"
|
||||
)
|
||||
|
||||
for proxy in "${PROXIES[@]}"; do
|
||||
export HTTP_PROXY="$proxy"
|
||||
export HTTPS_PROXY="$proxy"
|
||||
|
||||
region=$(echo "$proxy" | grep -oP '^\w+-\w+')
|
||||
echo "Testing from: $region"
|
||||
|
||||
agent-browser --session "$region" open https://example.com
|
||||
agent-browser --session "$region" screenshot "./screenshots/$region.png"
|
||||
agent-browser --session "$region" close
|
||||
done
|
||||
```
|
||||
|
||||
### Rotating Proxies for Scraping
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Rotate through proxy list to avoid rate limiting
|
||||
|
||||
PROXY_LIST=(
|
||||
"http://proxy1.example.com:8080"
|
||||
"http://proxy2.example.com:8080"
|
||||
"http://proxy3.example.com:8080"
|
||||
)
|
||||
|
||||
URLS=(
|
||||
"https://site.com/page1"
|
||||
"https://site.com/page2"
|
||||
"https://site.com/page3"
|
||||
)
|
||||
|
||||
for i in "${!URLS[@]}"; do
|
||||
proxy_index=$((i % ${#PROXY_LIST[@]}))
|
||||
export HTTP_PROXY="${PROXY_LIST[$proxy_index]}"
|
||||
export HTTPS_PROXY="${PROXY_LIST[$proxy_index]}"
|
||||
|
||||
agent-browser open "${URLS[$i]}"
|
||||
agent-browser get text body > "output-$i.txt"
|
||||
agent-browser close
|
||||
|
||||
sleep 1 # Polite delay
|
||||
done
|
||||
```
|
||||
|
||||
### Corporate Network Access
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Access internal sites via corporate proxy
|
||||
|
||||
export HTTP_PROXY="http://corpproxy.company.com:8080"
|
||||
export HTTPS_PROXY="http://corpproxy.company.com:8080"
|
||||
export NO_PROXY="localhost,127.0.0.1,.company.com"
|
||||
|
||||
# External sites go through proxy
|
||||
agent-browser open https://external-vendor.com
|
||||
|
||||
# Internal sites bypass proxy
|
||||
agent-browser open https://intranet.company.com
|
||||
```
|
||||
|
||||
## Verifying Proxy Connection
|
||||
|
||||
```bash
|
||||
# Check your apparent IP
|
||||
agent-browser open https://httpbin.org/ip
|
||||
agent-browser get text body
|
||||
# Should show proxy's IP, not your real IP
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Proxy Connection Failed
|
||||
|
||||
```bash
|
||||
# Test proxy connectivity first
|
||||
curl -x http://proxy.example.com:8080 https://httpbin.org/ip
|
||||
|
||||
# Check if proxy requires auth
|
||||
export HTTP_PROXY="http://user:pass@proxy.example.com:8080"
|
||||
```
|
||||
|
||||
### SSL/TLS Errors Through Proxy
|
||||
|
||||
Some proxies perform SSL inspection. If you encounter certificate errors:
|
||||
|
||||
```bash
|
||||
# For testing only - not recommended for production
|
||||
agent-browser open https://example.com --ignore-https-errors
|
||||
```
|
||||
|
||||
### Slow Performance
|
||||
|
||||
```bash
|
||||
# Use proxy only when necessary
|
||||
export NO_PROXY="*.cdn.com,*.static.com" # Direct CDN access
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Use environment variables** - Don't hardcode proxy credentials
|
||||
2. **Set NO_PROXY appropriately** - Avoid routing local traffic through proxy
|
||||
3. **Test proxy before automation** - Verify connectivity with simple requests
|
||||
4. **Handle proxy failures gracefully** - Implement retry logic for unstable proxies
|
||||
5. **Rotate proxies for large scraping jobs** - Distribute load and avoid bans
|
||||
@@ -0,0 +1,193 @@
|
||||
# Session Management
|
||||
|
||||
Multiple isolated browser sessions with state persistence and concurrent browsing.
|
||||
|
||||
**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [Named Sessions](#named-sessions)
|
||||
- [Session Isolation Properties](#session-isolation-properties)
|
||||
- [Session State Persistence](#session-state-persistence)
|
||||
- [Common Patterns](#common-patterns)
|
||||
- [Default Session](#default-session)
|
||||
- [Session Cleanup](#session-cleanup)
|
||||
- [Best Practices](#best-practices)
|
||||
|
||||
## Named Sessions
|
||||
|
||||
Use `--session` flag to isolate browser contexts:
|
||||
|
||||
```bash
|
||||
# Session 1: Authentication flow
|
||||
agent-browser --session auth open https://app.example.com/login
|
||||
|
||||
# Session 2: Public browsing (separate cookies, storage)
|
||||
agent-browser --session public open https://example.com
|
||||
|
||||
# Commands are isolated by session
|
||||
agent-browser --session auth fill @e1 "user@example.com"
|
||||
agent-browser --session public get text body
|
||||
```
|
||||
|
||||
## Session Isolation Properties
|
||||
|
||||
Each session has independent:
|
||||
- Cookies
|
||||
- LocalStorage / SessionStorage
|
||||
- IndexedDB
|
||||
- Cache
|
||||
- Browsing history
|
||||
- Open tabs
|
||||
|
||||
## Session State Persistence
|
||||
|
||||
### Save Session State
|
||||
|
||||
```bash
|
||||
# Save cookies, storage, and auth state
|
||||
agent-browser state save /path/to/auth-state.json
|
||||
```
|
||||
|
||||
### Load Session State
|
||||
|
||||
```bash
|
||||
# Restore saved state
|
||||
agent-browser state load /path/to/auth-state.json
|
||||
|
||||
# Continue with authenticated session
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
### State File Contents
|
||||
|
||||
```json
|
||||
{
|
||||
"cookies": [...],
|
||||
"localStorage": {...},
|
||||
"sessionStorage": {...},
|
||||
"origins": [...]
|
||||
}
|
||||
```
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Authenticated Session Reuse
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Save login state once, reuse many times
|
||||
|
||||
STATE_FILE="/tmp/auth-state.json"
|
||||
|
||||
# Check if we have saved state
|
||||
if [[ -f "$STATE_FILE" ]]; then
|
||||
agent-browser state load "$STATE_FILE"
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
else
|
||||
# Perform login
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "$USERNAME"
|
||||
agent-browser fill @e2 "$PASSWORD"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
|
||||
# Save for future use
|
||||
agent-browser state save "$STATE_FILE"
|
||||
fi
|
||||
```
|
||||
|
||||
### Concurrent Scraping
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Scrape multiple sites concurrently
|
||||
|
||||
# Start all sessions
|
||||
agent-browser --session site1 open https://site1.com &
|
||||
agent-browser --session site2 open https://site2.com &
|
||||
agent-browser --session site3 open https://site3.com &
|
||||
wait
|
||||
|
||||
# Extract from each
|
||||
agent-browser --session site1 get text body > site1.txt
|
||||
agent-browser --session site2 get text body > site2.txt
|
||||
agent-browser --session site3 get text body > site3.txt
|
||||
|
||||
# Cleanup
|
||||
agent-browser --session site1 close
|
||||
agent-browser --session site2 close
|
||||
agent-browser --session site3 close
|
||||
```
|
||||
|
||||
### A/B Testing Sessions
|
||||
|
||||
```bash
|
||||
# Test different user experiences
|
||||
agent-browser --session variant-a open "https://app.com?variant=a"
|
||||
agent-browser --session variant-b open "https://app.com?variant=b"
|
||||
|
||||
# Compare
|
||||
agent-browser --session variant-a screenshot /tmp/variant-a.png
|
||||
agent-browser --session variant-b screenshot /tmp/variant-b.png
|
||||
```
|
||||
|
||||
## Default Session
|
||||
|
||||
When `--session` is omitted, commands use the default session:
|
||||
|
||||
```bash
|
||||
# These use the same default session
|
||||
agent-browser open https://example.com
|
||||
agent-browser snapshot -i
|
||||
agent-browser close # Closes default session
|
||||
```
|
||||
|
||||
## Session Cleanup
|
||||
|
||||
```bash
|
||||
# Close specific session
|
||||
agent-browser --session auth close
|
||||
|
||||
# List active sessions
|
||||
agent-browser session list
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 1. Name Sessions Semantically
|
||||
|
||||
```bash
|
||||
# GOOD: Clear purpose
|
||||
agent-browser --session github-auth open https://github.com
|
||||
agent-browser --session docs-scrape open https://docs.example.com
|
||||
|
||||
# AVOID: Generic names
|
||||
agent-browser --session s1 open https://github.com
|
||||
```
|
||||
|
||||
### 2. Always Clean Up
|
||||
|
||||
```bash
|
||||
# Close sessions when done
|
||||
agent-browser --session auth close
|
||||
agent-browser --session scrape close
|
||||
```
|
||||
|
||||
### 3. Handle State Files Securely
|
||||
|
||||
```bash
|
||||
# Don't commit state files (contain auth tokens!)
|
||||
echo "*.auth-state.json" >> .gitignore
|
||||
|
||||
# Delete after use
|
||||
rm /tmp/auth-state.json
|
||||
```
|
||||
|
||||
### 4. Timeout Long Sessions
|
||||
|
||||
```bash
|
||||
# Set timeout for automated scripts
|
||||
timeout 60 agent-browser --session long-task get text body
|
||||
```
|
||||
@@ -0,0 +1,194 @@
|
||||
# Snapshot and Refs
|
||||
|
||||
Compact element references that reduce context usage dramatically for AI agents.
|
||||
|
||||
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [How Refs Work](#how-refs-work)
|
||||
- [Snapshot Command](#the-snapshot-command)
|
||||
- [Using Refs](#using-refs)
|
||||
- [Ref Lifecycle](#ref-lifecycle)
|
||||
- [Best Practices](#best-practices)
|
||||
- [Ref Notation Details](#ref-notation-details)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
|
||||
## How Refs Work
|
||||
|
||||
Traditional approach:
|
||||
```
|
||||
Full DOM/HTML -> AI parses -> CSS selector -> Action (~3000-5000 tokens)
|
||||
```
|
||||
|
||||
agent-browser approach:
|
||||
```
|
||||
Compact snapshot -> @refs assigned -> Direct interaction (~200-400 tokens)
|
||||
```
|
||||
|
||||
## The Snapshot Command
|
||||
|
||||
```bash
|
||||
# Basic snapshot (shows page structure)
|
||||
agent-browser snapshot
|
||||
|
||||
# Interactive snapshot (-i flag) - RECOMMENDED
|
||||
agent-browser snapshot -i
|
||||
```
|
||||
|
||||
### Snapshot Output Format
|
||||
|
||||
```
|
||||
Page: Example Site - Home
|
||||
URL: https://example.com
|
||||
|
||||
@e1 [header]
|
||||
@e2 [nav]
|
||||
@e3 [a] "Home"
|
||||
@e4 [a] "Products"
|
||||
@e5 [a] "About"
|
||||
@e6 [button] "Sign In"
|
||||
|
||||
@e7 [main]
|
||||
@e8 [h1] "Welcome"
|
||||
@e9 [form]
|
||||
@e10 [input type="email"] placeholder="Email"
|
||||
@e11 [input type="password"] placeholder="Password"
|
||||
@e12 [button type="submit"] "Log In"
|
||||
|
||||
@e13 [footer]
|
||||
@e14 [a] "Privacy Policy"
|
||||
```
|
||||
|
||||
## Using Refs
|
||||
|
||||
Once you have refs, interact directly:
|
||||
|
||||
```bash
|
||||
# Click the "Sign In" button
|
||||
agent-browser click @e6
|
||||
|
||||
# Fill email input
|
||||
agent-browser fill @e10 "user@example.com"
|
||||
|
||||
# Fill password
|
||||
agent-browser fill @e11 "password123"
|
||||
|
||||
# Submit the form
|
||||
agent-browser click @e12
|
||||
```
|
||||
|
||||
## Ref Lifecycle
|
||||
|
||||
**IMPORTANT**: Refs are invalidated when the page changes!
|
||||
|
||||
```bash
|
||||
# Get initial snapshot
|
||||
agent-browser snapshot -i
|
||||
# @e1 [button] "Next"
|
||||
|
||||
# Click triggers page change
|
||||
agent-browser click @e1
|
||||
|
||||
# MUST re-snapshot to get new refs!
|
||||
agent-browser snapshot -i
|
||||
# @e1 [h1] "Page 2" <- Different element now!
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 1. Always Snapshot Before Interacting
|
||||
|
||||
```bash
|
||||
# CORRECT
|
||||
agent-browser open https://example.com
|
||||
agent-browser snapshot -i # Get refs first
|
||||
agent-browser click @e1 # Use ref
|
||||
|
||||
# WRONG
|
||||
agent-browser open https://example.com
|
||||
agent-browser click @e1 # Ref doesn't exist yet!
|
||||
```
|
||||
|
||||
### 2. Re-Snapshot After Navigation
|
||||
|
||||
```bash
|
||||
agent-browser click @e5 # Navigates to new page
|
||||
agent-browser snapshot -i # Get new refs
|
||||
agent-browser click @e1 # Use new refs
|
||||
```
|
||||
|
||||
### 3. Re-Snapshot After Dynamic Changes
|
||||
|
||||
```bash
|
||||
agent-browser click @e1 # Opens dropdown
|
||||
agent-browser snapshot -i # See dropdown items
|
||||
agent-browser click @e7 # Select item
|
||||
```
|
||||
|
||||
### 4. Snapshot Specific Regions
|
||||
|
||||
For complex pages, snapshot specific areas:
|
||||
|
||||
```bash
|
||||
# Snapshot just the form
|
||||
agent-browser snapshot @e9
|
||||
```
|
||||
|
||||
## Ref Notation Details
|
||||
|
||||
```
|
||||
@e1 [tag type="value"] "text content" placeholder="hint"
|
||||
| | | | |
|
||||
| | | | +- Additional attributes
|
||||
| | | +- Visible text
|
||||
| | +- Key attributes shown
|
||||
| +- HTML tag name
|
||||
+- Unique ref ID
|
||||
```
|
||||
|
||||
### Common Patterns
|
||||
|
||||
```
|
||||
@e1 [button] "Submit" # Button with text
|
||||
@e2 [input type="email"] # Email input
|
||||
@e3 [input type="password"] # Password input
|
||||
@e4 [a href="/page"] "Link Text" # Anchor link
|
||||
@e5 [select] # Dropdown
|
||||
@e6 [textarea] placeholder="Message" # Text area
|
||||
@e7 [div class="modal"] # Container (when relevant)
|
||||
@e8 [img alt="Logo"] # Image
|
||||
@e9 [checkbox] checked # Checked checkbox
|
||||
@e10 [radio] selected # Selected radio
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Ref not found" Error
|
||||
|
||||
```bash
|
||||
# Ref may have changed - re-snapshot
|
||||
agent-browser snapshot -i
|
||||
```
|
||||
|
||||
### Element Not Visible in Snapshot
|
||||
|
||||
```bash
|
||||
# Scroll down to reveal element
|
||||
agent-browser scroll down 1000
|
||||
agent-browser snapshot -i
|
||||
|
||||
# Or wait for dynamic content
|
||||
agent-browser wait 1000
|
||||
agent-browser snapshot -i
|
||||
```
|
||||
|
||||
### Too Many Elements
|
||||
|
||||
```bash
|
||||
# Snapshot specific container
|
||||
agent-browser snapshot @e5
|
||||
|
||||
# Or use get text for content-only extraction
|
||||
agent-browser get text @e5
|
||||
```
|
||||
@@ -0,0 +1,173 @@
|
||||
# Video Recording
|
||||
|
||||
Capture browser automation as video for debugging, documentation, or verification.
|
||||
|
||||
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
|
||||
|
||||
## Contents
|
||||
|
||||
- [Basic Recording](#basic-recording)
|
||||
- [Recording Commands](#recording-commands)
|
||||
- [Use Cases](#use-cases)
|
||||
- [Best Practices](#best-practices)
|
||||
- [Output Format](#output-format)
|
||||
- [Limitations](#limitations)
|
||||
|
||||
## Basic Recording
|
||||
|
||||
```bash
|
||||
# Start recording
|
||||
agent-browser record start ./demo.webm
|
||||
|
||||
# Perform actions
|
||||
agent-browser open https://example.com
|
||||
agent-browser snapshot -i
|
||||
agent-browser click @e1
|
||||
agent-browser fill @e2 "test input"
|
||||
|
||||
# Stop and save
|
||||
agent-browser record stop
|
||||
```
|
||||
|
||||
## Recording Commands
|
||||
|
||||
```bash
|
||||
# Start recording to file
|
||||
agent-browser record start ./output.webm
|
||||
|
||||
# Stop current recording
|
||||
agent-browser record stop
|
||||
|
||||
# Restart with new file (stops current + starts new)
|
||||
agent-browser record restart ./take2.webm
|
||||
```
|
||||
|
||||
## Use Cases
|
||||
|
||||
### Debugging Failed Automation
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Record automation for debugging
|
||||
|
||||
agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
|
||||
|
||||
# Run your automation
|
||||
agent-browser open https://app.example.com
|
||||
agent-browser snapshot -i
|
||||
agent-browser click @e1 || {
|
||||
echo "Click failed - check recording"
|
||||
agent-browser record stop
|
||||
exit 1
|
||||
}
|
||||
|
||||
agent-browser record stop
|
||||
```
|
||||
|
||||
### Documentation Generation
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Record workflow for documentation
|
||||
|
||||
agent-browser record start ./docs/how-to-login.webm
|
||||
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser wait 1000 # Pause for visibility
|
||||
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "demo@example.com"
|
||||
agent-browser wait 500
|
||||
|
||||
agent-browser fill @e2 "password"
|
||||
agent-browser wait 500
|
||||
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser wait 1000 # Show result
|
||||
|
||||
agent-browser record stop
|
||||
```
|
||||
|
||||
### CI/CD Test Evidence
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Record E2E test runs for CI artifacts
|
||||
|
||||
TEST_NAME="${1:-e2e-test}"
|
||||
RECORDING_DIR="./test-recordings"
|
||||
mkdir -p "$RECORDING_DIR"
|
||||
|
||||
agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
|
||||
|
||||
# Run test
|
||||
if run_e2e_test; then
|
||||
echo "Test passed"
|
||||
else
|
||||
echo "Test failed - recording saved"
|
||||
fi
|
||||
|
||||
agent-browser record stop
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 1. Add Pauses for Clarity
|
||||
|
||||
```bash
|
||||
# Slow down for human viewing
|
||||
agent-browser click @e1
|
||||
agent-browser wait 500 # Let viewer see result
|
||||
```
|
||||
|
||||
### 2. Use Descriptive Filenames
|
||||
|
||||
```bash
|
||||
# Include context in filename
|
||||
agent-browser record start ./recordings/login-flow-2024-01-15.webm
|
||||
agent-browser record start ./recordings/checkout-test-run-42.webm
|
||||
```
|
||||
|
||||
### 3. Handle Recording in Error Cases
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
set -e
|
||||
|
||||
cleanup() {
|
||||
agent-browser record stop 2>/dev/null || true
|
||||
agent-browser close 2>/dev/null || true
|
||||
}
|
||||
trap cleanup EXIT
|
||||
|
||||
agent-browser record start ./automation.webm
|
||||
# ... automation steps ...
|
||||
```
|
||||
|
||||
### 4. Combine with Screenshots
|
||||
|
||||
```bash
|
||||
# Record video AND capture key frames
|
||||
agent-browser record start ./flow.webm
|
||||
|
||||
agent-browser open https://example.com
|
||||
agent-browser screenshot ./screenshots/step1-homepage.png
|
||||
|
||||
agent-browser click @e1
|
||||
agent-browser screenshot ./screenshots/step2-after-click.png
|
||||
|
||||
agent-browser record stop
|
||||
```
|
||||
|
||||
## Output Format
|
||||
|
||||
- Default format: WebM (VP8/VP9 codec)
|
||||
- Compatible with all modern browsers and video players
|
||||
- Compressed but high quality
|
||||
|
||||
## Limitations
|
||||
|
||||
- Recording adds slight overhead to automation
|
||||
- Large recordings can consume significant disk space
|
||||
- Some headless environments may have codec limitations
|
||||
Reference in New Issue
Block a user