feat: sync agent-browser skill with upstream vercel-labs/agent-browser

Update SKILL.md to match the latest upstream skill from vercel-labs/agent-browser, adding substantial new capabilities:

- Authentication (auth vault, profiles, session persistence, state files)
- Command chaining, annotated screenshots, diffing
- Security features (content boundaries, domain allowlist, action policy)
- iOS Simulator support, Lightpanda engine, downloads, clipboard
- JS eval improvements (--stdin, -b for shell safety)
- Timeout guidance, config files, session cleanup

Add 7 reference docs (commands, authentication, snapshot-refs, session-management, video-recording, profiling, proxy-support) and 3 ready-to-use shell templates.

Kept our YAML frontmatter, setup check section, and Playwright MCP comparison table, which are unique to our plugin context.

@@ -0,0 +1,194 @@
# Snapshot and Refs

Refs are compact element references that cut the context an AI agent needs per interaction from thousands of tokens to a few hundred.

**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.

## Contents

- [How Refs Work](#how-refs-work)
- [The Snapshot Command](#the-snapshot-command)
- [Using Refs](#using-refs)
- [Ref Lifecycle](#ref-lifecycle)
- [Best Practices](#best-practices)
- [Ref Notation Details](#ref-notation-details)
- [Troubleshooting](#troubleshooting)

## How Refs Work

Traditional approach:

```
Full DOM/HTML -> AI parses -> CSS selector -> Action (~3000-5000 tokens)
```

agent-browser approach:

```
Compact snapshot -> @refs assigned -> Direct interaction (~200-400 tokens)
```
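
To put the two estimates side by side, here is a rough calculation using the midpoints of the ranges above (4000 and 300 tokens). The midpoints are an illustrative assumption, not measured values.

```bash
# Midpoints of the ranges quoted above (illustrative, not measured).
dom_tokens=4000
ref_tokens=300

# Integer division: roughly how many times smaller the snapshot approach is.
ratio=$(( dom_tokens / ref_tokens ))
echo "~${ratio}x fewer tokens per interaction"
```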

## The Snapshot Command

```bash
# Basic snapshot (shows page structure)
agent-browser snapshot

# Interactive snapshot (-i flag) - RECOMMENDED
agent-browser snapshot -i
```

### Snapshot Output Format

```
Page: Example Site - Home
URL: https://example.com

@e1 [header]
  @e2 [nav]
    @e3 [a] "Home"
    @e4 [a] "Products"
    @e5 [a] "About"
  @e6 [button] "Sign In"

@e7 [main]
  @e8 [h1] "Welcome"
  @e9 [form]
    @e10 [input type="email"] placeholder="Email"
    @e11 [input type="password"] placeholder="Password"
    @e12 [button type="submit"] "Log In"

@e13 [footer]
  @e14 [a] "Privacy Policy"
```
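
When scripting against this output, it can help to look up a ref by its visible text rather than reading the snapshot by eye. The following is a minimal sketch, assuming the line layout shown above; `find_ref` is a hypothetical helper, not an agent-browser command, and it matches any quoted value on the line, including attribute values.

```bash
# find_ref LABEL
# Reads snapshot text on stdin and prints the ref (e.g. @e6) of the first
# line that contains LABEL in double quotes. Prints nothing if no line matches.
# Hypothetical helper, not part of agent-browser.
find_ref() {
  grep -F -- "\"$1\"" | grep -o '@e[0-9][0-9]*' | head -n 1
}

# Example against a saved snapshot fragment:
snapshot='@e5 [a] "About"
@e6 [button] "Sign In"
@e12 [button type="submit"] "Log In"'

printf '%s\n' "$snapshot" | find_ref "Sign In"
```

In a live session the same function could be fed from `agent-browser snapshot -i | find_ref "Sign In"`.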

## Using Refs

Once you have refs, interact directly:

```bash
# Click the "Sign In" button
agent-browser click @e6

# Fill email input
agent-browser fill @e10 "user@example.com"

# Fill password
agent-browser fill @e11 "password123"

# Submit the form
agent-browser click @e12
```
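
The example above hardcodes the credentials. Below is a minimal sketch of the same flow that reads them from environment variables instead. The `login` function, the `AB` override, and the `LOGIN_EMAIL` / `LOGIN_PASSWORD` variable names are all hypothetical; only the `fill` and `click` commands come from this document.

```bash
# Command to run; defaults to the real CLI, can be overridden for testing.
AB="${AB:-agent-browser}"

# login EMAIL_REF PASSWORD_REF SUBMIT_REF
# Fills the form using refs from the latest snapshot. Credentials come from
# LOGIN_EMAIL and LOGIN_PASSWORD (hypothetical names) so they are not
# hardcoded in the script. Stops at the first command that fails.
login() {
  : "${LOGIN_EMAIL:?set LOGIN_EMAIL}" "${LOGIN_PASSWORD:?set LOGIN_PASSWORD}"
  "$AB" fill "$1" "$LOGIN_EMAIL" &&
    "$AB" fill "$2" "$LOGIN_PASSWORD" &&
    "$AB" click "$3"
}

# Usage, with the refs from the example snapshot:
#   login @e10 @e11 @e12
```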

## Ref Lifecycle

**IMPORTANT**: Refs are invalidated when the page changes, whether through navigation or a dynamic DOM update!

```bash
# Get initial snapshot
agent-browser snapshot -i
# @e1 [button] "Next"

# Click triggers page change
agent-browser click @e1

# MUST re-snapshot to get new refs!
agent-browser snapshot -i
# @e1 [h1] "Page 2" <- Different element now!
```
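
One way to make stale refs harder to reuse is to pair every click with a fresh snapshot. This is a sketch only: `click_and_snapshot` and the `AB` override are hypothetical, and it assumes `agent-browser click` exits non-zero on failure so that `&&` skips the snapshot in that case.

```bash
# Command to run; defaults to the real CLI, can be overridden for testing.
AB="${AB:-agent-browser}"

# click_and_snapshot REF
# Clicks REF, then immediately takes a fresh interactive snapshot so the
# caller always sees refs that belong to the current page state.
click_and_snapshot() {
  "$AB" click "$1" && "$AB" snapshot -i
}

# Usage:
#   click_and_snapshot @e1
```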

## Best Practices

### 1. Always Snapshot Before Interacting

```bash
# CORRECT
agent-browser open https://example.com
agent-browser snapshot -i   # Get refs first
agent-browser click @e1     # Use ref

# WRONG
agent-browser open https://example.com
agent-browser click @e1     # Ref doesn't exist yet!
```

### 2. Re-Snapshot After Navigation

```bash
agent-browser click @e5     # Navigates to new page
agent-browser snapshot -i   # Get new refs
agent-browser click @e1     # Use new refs
```

### 3. Re-Snapshot After Dynamic Changes

```bash
agent-browser click @e1     # Opens dropdown
agent-browser snapshot -i   # See dropdown items
agent-browser click @e7     # Select item
```

### 4. Snapshot Specific Regions

For complex pages, snapshot specific areas:

```bash
# Snapshot just the form
agent-browser snapshot @e9
```

## Ref Notation Details

```
@e1 [tag type="value"] "text content" placeholder="hint"
|    |   |             |              |
|    |   |             |              +- Additional attributes
|    |   |             +- Visible text
|    |   +- Key attributes shown
|    +- HTML tag name
+- Unique ref ID
```

### Common Patterns

```
@e1 [button] "Submit"                  # Button with text
@e2 [input type="email"]               # Email input
@e3 [input type="password"]            # Password input
@e4 [a href="/page"] "Link Text"       # Anchor link
@e5 [select]                           # Dropdown
@e6 [textarea] placeholder="Message"   # Text area
@e7 [div class="modal"]                # Container (when relevant)
@e8 [img alt="Logo"]                   # Image
@e9 [checkbox] checked                 # Checked checkbox
@e10 [radio] selected                  # Selected radio
```
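
For scripts that need the individual parts of a line, the notation can be split with a single `sed` expression. This is a rough sketch that assumes the `@eN [tag attrs] "text"` layout shown above; `parse_ref_line` is a hypothetical helper, and it ignores trailing attributes such as `placeholder=` or `checked`.

```bash
# parse_ref_line
# Reads snapshot lines on stdin and prints "ref|tag|text" for each line that
# starts with a ref. The text field is empty when the line has no quoted text
# directly after the closing bracket.
parse_ref_line() {
  sed -n -E 's/^ *(@e[0-9]+) \[([^] ]+)[^]]*\]( "([^"]*)")?.*$/\1|\2|\4/p'
}

# Example:
printf '%s\n' '@e4 [a href="/page"] "Link Text"' | parse_ref_line
```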

## Troubleshooting

### "Ref not found" Error

```bash
# Ref may have changed - re-snapshot
agent-browser snapshot -i
```

### Element Not Visible in Snapshot

```bash
# Scroll down to reveal element
agent-browser scroll down 1000
agent-browser snapshot -i

# Or wait for dynamic content
agent-browser wait 1000
agent-browser snapshot -i
```
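
The scroll-then-snapshot step can be repeated until the element shows up. Here is a minimal sketch of that loop; `scroll_until_visible`, the `AB` override, and the default limit of 5 scrolls are assumptions made for illustration, while the `scroll down 1000` and `snapshot -i` commands are the ones used above.

```bash
# Command to run; defaults to the real CLI, can be overridden for testing.
AB="${AB:-agent-browser}"

# scroll_until_visible TEXT [MAX_SCROLLS]
# Takes a snapshot, and if TEXT is not in it, scrolls down and tries again,
# up to MAX_SCROLLS scrolls (default 5). Prints the matching snapshot lines
# and returns 0 on success; returns 1 if TEXT never appears.
scroll_until_visible() {
  want=$1
  max_scrolls=${2:-5}
  scrolls=0
  while :; do
    # grep prints the matching lines and succeeds only if TEXT was found.
    if "$AB" snapshot -i | grep -F -- "$want"; then
      return 0
    fi
    if [ "$scrolls" -ge "$max_scrolls" ]; then
      return 1
    fi
    "$AB" scroll down 1000
    scrolls=$((scrolls + 1))
  done
}

# Usage:
#   scroll_until_visible "Load more" 3
```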

### Too Many Elements

```bash
# Snapshot specific container
agent-browser snapshot @e5

# Or use get text for content-only extraction
agent-browser get text @e5
```
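
To judge whether a snapshot is large enough to be worth scoping, one option is to count its refs first. This is a sketch under the assumption that every element line begins with an `@eN` ref, as in the examples in this document; `count_refs` is a hypothetical helper, and any threshold for "too many" is left to the caller.

```bash
# count_refs
# Reads snapshot text on stdin and prints the number of lines that start
# with a ref (optionally indented).
count_refs() {
  grep -c '^ *@e[0-9]'
}

# Example:
printf '%s\n' 'URL: https://example.com' '@e1 [header]' '  @e2 [nav]' | count_refs
```

In a live session this could be used as `agent-browser snapshot -i | count_refs`.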