Merge upstream origin/main into local fork

163 upstream commits merged. All local skills, agents, and commands
preserved. Metadata recalculated: 30 agents, 56 skills, 7 commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
John Lamb
2026-03-16 10:46:52 -05:00
180 changed files with 18263 additions and 10848 deletions

View File

@@ -3,9 +3,9 @@ name: agent-browser
description: Browser automation using Vercel's agent-browser CLI. Use when you need to interact with web pages, fill forms, take screenshots, or scrape data. Alternative to Playwright MCP - uses Bash commands with ref-based element selection. Triggers on "browse website", "fill form", "click button", "take screenshot", "scrape page", "web automation".
---
# agent-browser: CLI Browser Automation
# Browser Automation with agent-browser
Vercel's headless browser automation CLI designed for AI agents. Uses ref-based selection (@e1, @e2) from accessibility snapshots.
The CLI uses Chrome/Chromium via CDP directly. Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome.
## Setup Check
@@ -23,185 +23,627 @@ agent-browser install # Downloads Chromium
## Core Workflow
**The snapshot + ref pattern is optimal for LLMs:**
Every browser automation follows this pattern:
1. **Navigate** to URL
2. **Snapshot** to get interactive elements with refs
3. **Interact** using refs (@e1, @e2, etc.)
4. **Re-snapshot** after navigation or DOM changes
1. **Navigate**: `agent-browser open <url>`
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
3. **Interact**: Use refs to click, fill, select
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
```bash
# Step 1: Open URL
agent-browser open https://example.com
# Step 2: Get interactive elements with refs
agent-browser snapshot -i --json
# Step 3: Interact using refs
agent-browser click @e1
agent-browser fill @e2 "search query"
# Step 4: Re-snapshot after changes
agent-browser open https://example.com/form
agent-browser snapshot -i
```
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
## Key Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
```
### Snapshots (Essential for AI)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -i --json # JSON output for parsing
agent-browser snapshot -c # Compact (remove empty elements)
agent-browser snapshot -d 3 # Limit depth
```
### Interactions
```bash
agent-browser click @e1 # Click element
agent-browser dblclick @e1 # Double-click
agent-browser fill @e1 "text" # Clear and fill input
agent-browser type @e1 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser hover @e1 # Hover element
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "option" # Select dropdown option
agent-browser scroll down 500 # Scroll (up/down/left/right)
agent-browser scrollintoview @e1 # Scroll element into view
```
### Get Information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get element HTML
agent-browser get value @e1 # Get input value
agent-browser get attr href @e1 # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count "button" # Count matching elements
```
### Screenshots & PDFs
```bash
agent-browser screenshot # Viewport screenshot
agent-browser screenshot --full # Full page
agent-browser screenshot output.png # Save to file
agent-browser screenshot --full output.png # Full page to file
agent-browser pdf output.pdf # Save as PDF
```
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait "text" # Wait for text to appear
```
## Semantic Locators (Alternative to Refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign up" click
agent-browser find label "Email" fill "user@example.com"
agent-browser find placeholder "Search..." fill "query"
```
## Sessions (Parallel Browsers)
```bash
# Run multiple independent browser sessions
agent-browser --session browser1 open https://site1.com
agent-browser --session browser2 open https://site2.com
# List active sessions
agent-browser session list
```
## Examples
### Login Flow
```bash
agent-browser open https://app.example.com/login
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Sign in" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait 2000
agent-browser snapshot -i # Verify logged in
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
### Search and Extract
## Command Chaining
Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
```bash
agent-browser open https://news.ycombinator.com
agent-browser snapshot -i --json
# Parse JSON to find story links
agent-browser get text @e12 # Get headline text
agent-browser click @e12 # Click to open story
# Chain open + wait + snapshot in one call
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
# Chain multiple interactions
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3
# Navigate and capture
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```
### Form Filling
**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
## Handling Authentication
When automating a site that requires login, choose the approach that fits:
**Option 1: Import auth from the user's browser (fastest for one-off tasks)**
```bash
agent-browser open https://forms.example.com
# Connect to the user's running Chrome (they're already logged in)
agent-browser --auto-connect state save ./auth.json
# Use that auth state
agent-browser --state ./auth.json open https://app.example.com/dashboard
```
State files contain session tokens in plaintext -- add to `.gitignore` and delete when no longer needed. Set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest.
**Option 2: Persistent profile (simplest for recurring tasks)**
```bash
# First run: login manually or via automation
agent-browser --profile ~/.myapp open https://app.example.com/login
# ... fill credentials, submit ...
# All future runs: already authenticated
agent-browser --profile ~/.myapp open https://app.example.com/dashboard
```
**Option 3: Session name (auto-save/restore cookies + localStorage)**
```bash
agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close # State auto-saved
# Next time: state auto-restored
agent-browser --session-name myapp open https://app.example.com/dashboard
```
**Option 4: Auth vault (credentials stored encrypted, login by name)**
```bash
echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin
agent-browser auth login myapp
```
**Option 5: State file (manual save/load)**
```bash
# After logging in:
agent-browser state save ./auth.json
# In a future session:
agent-browser state load ./auth.json
agent-browser open https://app.example.com/dashboard
```
See [references/authentication.md](references/authentication.md) for OAuth, 2FA, cookie-based auth, and token refresh patterns.
## Essential Commands
```bash
# Navigation
agent-browser open <url> # Navigate (aliases: goto, navigate)
agent-browser close # Close browser
# Snapshot
agent-browser snapshot -i # Interactive elements with refs (recommended)
agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, cursor:pointer)
agent-browser snapshot -s "#selector" # Scope to CSS selector
# Interaction (use @refs from snapshot)
agent-browser click @e1 # Click element
agent-browser click @e1 --new-tab # Click and open in new tab
agent-browser fill @e2 "text" # Clear and type text
agent-browser type @e2 "text" # Type without clearing
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1 # Check checkbox
agent-browser press Enter # Press key
agent-browser keyboard type "text" # Type at current focus (no selector)
agent-browser keyboard inserttext "text" # Insert without key events
agent-browser scroll down 500 # Scroll page
agent-browser scroll down 500 --selector "div.content" # Scroll within a specific container
# Get information
agent-browser get text @e1 # Get element text
agent-browser get url # Get current URL
agent-browser get title # Get page title
agent-browser get cdp-url # Get CDP WebSocket URL
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page" # Wait for URL pattern
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Welcome" # Wait for text to appear (substring match)
agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear
agent-browser wait "#spinner" --state hidden # Wait for element to disappear
# Downloads
agent-browser download @e1 ./file.pdf # Click element to trigger download
agent-browser wait --download ./output.zip # Wait for any download to complete
agent-browser --download-path ./downloads open <url> # Set default download directory
# Viewport & Device Emulation
agent-browser set viewport 1920 1080 # Set viewport size (default: 1280x720)
agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
agent-browser set device "iPhone 14" # Emulate device (viewport + user agent)
# Capture
agent-browser screenshot # Screenshot to temp dir
agent-browser screenshot --full # Full page screenshot
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
agent-browser screenshot --screenshot-dir ./shots # Save to custom directory
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
agent-browser pdf output.pdf # Save as PDF
# Clipboard
agent-browser clipboard read # Read text from clipboard
agent-browser clipboard write "Hello, World!" # Write text to clipboard
agent-browser clipboard copy # Copy current selection
agent-browser clipboard paste # Paste from clipboard
# Diff (compare page states)
agent-browser diff snapshot # Compare current vs last snapshot
agent-browser diff snapshot --baseline before.txt # Compare current vs saved file
agent-browser diff screenshot --baseline before.png # Visual pixel diff
agent-browser diff url <url1> <url2> # Compare two pages
agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait strategy
agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
```
## Common Patterns
### Form Submission
```bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser select @e3 "United States"
agent-browser check @e4 # Agree to terms
agent-browser click @e5 # Submit button
agent-browser screenshot confirmation.png
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
```
### Debug Mode
### Authentication with Auth Vault (Recommended)
```bash
# Run with visible browser window
agent-browser --headed open https://example.com
agent-browser --headed snapshot -i
agent-browser --headed click @e1
# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY)
# Recommended: pipe password via stdin to avoid shell history exposure
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
# Login using saved profile (LLM never sees password)
agent-browser auth login github
# List/show/delete profiles
agent-browser auth list
agent-browser auth show github
agent-browser auth delete github
```
## JSON Output
Add `--json` for structured output:
### Authentication with State Persistence
```bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
### Session Persistence
```bash
# Auto-save/restore cookies and localStorage across browser restarts
agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close # State auto-saved to ~/.agent-browser/sessions/
# Next time, state is auto-loaded
agent-browser --session-name myapp open https://app.example.com/dashboard
# Encrypt state at rest
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
agent-browser --session-name secure open https://app.example.com
# Manage saved states
agent-browser state list
agent-browser state show myapp-default.json
agent-browser state clear myapp
agent-browser state clean --older-than 7
```
### Data Extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5 # Get specific element text
agent-browser get text body > page.txt # Get all page text
# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
Returns:
### Parallel Sessions
```bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
agent-browser session list
```
### Connect to Existing Chrome
```bash
# Auto-discover running Chrome with remote debugging enabled
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect snapshot
# Or with explicit CDP port
agent-browser --cdp 9222 snapshot
```
### Color Scheme (Dark Mode)
```bash
# Persistent dark mode via flag (applies to all pages and new tabs)
agent-browser --color-scheme dark open https://example.com
# Or via environment variable
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
# Or set during session (persists for subsequent commands)
agent-browser set media dark
```
### Viewport & Responsive Testing
```bash
# Set a custom viewport size (default is 1280x720)
agent-browser set viewport 1920 1080
agent-browser screenshot desktop.png
# Test mobile-width layout
agent-browser set viewport 375 812
agent-browser screenshot mobile.png
# Retina/HiDPI: same CSS layout at 2x pixel density
# Screenshots stay at logical viewport size, but content renders at higher DPI
agent-browser set viewport 1920 1080 2
agent-browser screenshot retina.png
# Device emulation (sets viewport + user agent in one step)
agent-browser set device "iPhone 14"
agent-browser screenshot device.png
```
The `scale` parameter (3rd argument) sets `window.devicePixelRatio` without changing CSS layout. Use it when testing retina rendering or capturing higher-resolution screenshots.
### Visual Browser (Debugging)
```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1 # Highlight element
agent-browser inspect # Open Chrome DevTools for the active page
agent-browser record start demo.webm # Record session
agent-browser profiler start # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile (path optional)
```
Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode.
### Local Files (PDFs, HTML)
```bash
# Open local files with file:// URLs
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png
```
### iOS Simulator (Mobile Safari)
```bash
# List available iOS simulators
agent-browser device list
# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1 # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up # Mobile-specific gesture
# Take screenshot
agent-browser -p ios screenshot mobile.png
# Close session (shuts down simulator)
agent-browser -p ios close
```
**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)
**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.
## Security
All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output.
### Content Boundaries (Recommended for AI Agents)
Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content:
```bash
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
agent-browser snapshot
# Output:
# --- AGENT_BROWSER_PAGE_CONTENT nonce=<hex> origin=https://example.com ---
# [accessibility tree]
# --- END_AGENT_BROWSER_PAGE_CONTENT nonce=<hex> ---
```
### Domain Allowlist
Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. Include CDN domains your target pages depend on:
```bash
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
agent-browser open https://example.com # OK
agent-browser open https://malicious.com # Blocked
```
### Action Policy
Use a policy file to gate destructive actions:
```bash
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```
Example `policy.json`:
```json
{ "default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"] }
```
Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies.
### Output Limits
Prevent context flooding from large pages:
```bash
export AGENT_BROWSER_MAX_OUTPUT=50000
```
## Diffing (Verifying Changes)
Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session.
```bash
# Typical workflow: snapshot -> action -> diff
agent-browser snapshot -i # Take baseline snapshot
agent-browser click @e2 # Perform action
agent-browser diff snapshot # See what changed (auto-compares to last snapshot)
```
For visual regression testing or monitoring:
```bash
# Save a baseline screenshot, then compare later
agent-browser screenshot baseline.png
# ... time passes or changes are made ...
agent-browser diff screenshot --baseline baseline.png
# Compare staging vs production
agent-browser diff url https://staging.example.com https://prod.example.com --screenshot
```
`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage.
## Timeouts and Slow Pages
The default timeout is 25 seconds. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). For slow websites or large pages, use explicit waits instead of relying on the default timeout:
```bash
# Wait for network activity to settle (best for slow pages)
agent-browser wait --load networkidle
# Wait for a specific element to appear
agent-browser wait "#content"
agent-browser wait @e1
# Wait for a specific URL pattern (useful after redirects)
agent-browser wait --url "**/dashboard"
# Wait for a JavaScript condition
agent-browser wait --fn "document.readyState === 'complete'"
# Wait a fixed duration (milliseconds) as a last resort
agent-browser wait 5000
```
When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait <selector>` or `wait @ref`.
## Session Management and Cleanup
When running multiple agents or automations concurrently, always use named sessions to avoid conflicts:
```bash
# Each agent gets its own isolated session
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com
# Check active sessions
agent-browser session list
```
Always close your browser session when done to avoid leaked processes:
```bash
agent-browser close # Close default session
agent-browser --session agent1 close # Close specific session
```
If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments):
```bash
AGENT_BROWSER_IDLE_TIMEOUT_MS=60000 agent-browser open example.com
```
## Ref Lifecycle (Important)
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
```bash
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # MUST re-snapshot
agent-browser click @e1 # Use new refs
```
## Annotated Screenshots (Vision Mode)
Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot.
```bash
agent-browser screenshot --annotate
# Output includes the image path and a legend:
# [1] @e1 button "Submit"
# [2] @e2 link "Home"
# [3] @e3 textbox "Email"
agent-browser click @e2 # Click using ref from annotated screenshot
```
Use annotated screenshots when:
- The page has unlabeled icon buttons or visual-only elements
- You need to verify visual layout or styling
- Canvas or chart elements are present (invisible to text snapshots)
- You need spatial reasoning about element positions
## Semantic Locators (Alternative to Refs)
When refs are unavailable or unreliable, use semantic locators:
```bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
```
## JavaScript Evaluation (eval)
Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues.
```bash
# Simple expressions work with regular quoting
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'
# Complex JS: use --stdin with heredoc (RECOMMENDED)
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(
Array.from(document.querySelectorAll("img"))
.filter(i => !i.alt)
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
)
EVALEOF
# Alternative: base64 encoding (avoids all shell escaping issues)
agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
```
**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely.
**Rules of thumb:**
- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine
- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
- Programmatic/generated scripts -> use `eval -b` with base64
## Configuration File
Create `agent-browser.json` in the project root for persistent settings:
```json
{
"success": true,
"data": {
"refs": {
"e1": {"name": "Submit", "role": "button"},
"e2": {"name": "Email", "role": "textbox"}
},
"snapshot": "- button \"Submit\" [ref=e1]\n- textbox \"Email\" [ref=e2]"
}
"headed": true,
"proxy": "http://localhost:8080",
"profile": "./browser-data"
}
```
Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config <path>` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced.
## Browser Engine Selection
Use `--engine` to choose a local browser engine. The default is `chrome`.
```bash
# Use Lightpanda (fast headless browser, requires separate install)
agent-browser --engine lightpanda open example.com
# Via environment variable
export AGENT_BROWSER_ENGINE=lightpanda
agent-browser open example.com
# With custom binary path
agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com
```
Supported engines:
- `chrome` (default) -- Chrome/Chromium via CDP
- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome)
Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
## Deep-Dive Documentation
| Reference | When to Use |
| -------------------------------------------------------------------- | --------------------------------------------------------- |
| [references/commands.md](references/commands.md) | Full command reference with all options |
| [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting |
| [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping |
| [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse |
| [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation |
| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis |
| [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies |
## Ready-to-Use Templates
| Template | Description |
| ------------------------------------------------------------------------ | ----------------------------------- |
| [templates/form-automation.sh](templates/form-automation.sh) | Form filling with validation |
| [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state |
| [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots |
```bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
```
## vs Playwright MCP
| Feature | agent-browser (CLI) | Playwright MCP |

View File

@@ -0,0 +1,303 @@
# Authentication Patterns
Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Import Auth from Your Browser](#import-auth-from-your-browser)
- [Persistent Profiles](#persistent-profiles)
- [Session Persistence](#session-persistence)
- [Basic Login Flow](#basic-login-flow)
- [Saving Authentication State](#saving-authentication-state)
- [Restoring Authentication](#restoring-authentication)
- [OAuth / SSO Flows](#oauth--sso-flows)
- [Two-Factor Authentication](#two-factor-authentication)
- [HTTP Basic Auth](#http-basic-auth)
- [Cookie-Based Auth](#cookie-based-auth)
- [Token Refresh Handling](#token-refresh-handling)
- [Security Best Practices](#security-best-practices)
## Import Auth from Your Browser
The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into.
**Step 1: Start Chrome with remote debugging**
```bash
# macOS
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
# Linux
google-chrome --remote-debugging-port=9222
# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
```
Log in to your target site(s) in this Chrome window as you normally would.
> **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done.
**Step 2: Grab the auth state**
```bash
# Auto-discover the running Chrome and save its cookies + localStorage
agent-browser --auto-connect state save ./my-auth.json
```
**Step 3: Reuse in automation**
```bash
# Load auth at launch
agent-browser --state ./my-auth.json open https://app.example.com/dashboard
# Or load into an existing session
agent-browser state load ./my-auth.json
agent-browser open https://app.example.com/dashboard
```
This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies.
> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices).
**Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts:
```bash
agent-browser --session-name myapp state load ./my-auth.json
# From now on, state is auto-saved/restored for "myapp"
```
## Persistent Profiles
Use `--profile` to point agent-browser at a Chrome user data directory. This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load:
```bash
# First run: login once
agent-browser --profile ~/.myapp-profile open https://app.example.com/login
# ... complete login flow ...
# All subsequent runs: already authenticated
agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard
```
Use different paths for different projects or test users:
```bash
agent-browser --profile ~/.profiles/admin open https://app.example.com
agent-browser --profile ~/.profiles/viewer open https://app.example.com
```
Or set via environment variable:
```bash
export AGENT_BROWSER_PROFILE=~/.myapp-profile
agent-browser open https://app.example.com/dashboard
```
## Session Persistence
Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files:
```bash
# Auto-saves state on close, auto-restores on next launch
agent-browser --session-name twitter open https://twitter.com
# ... login flow ...
agent-browser close # state saved to ~/.agent-browser/sessions/
# Next time: state is automatically restored
agent-browser --session-name twitter open https://twitter.com
```
Encrypt state at rest:
```bash
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
agent-browser --session-name secure open https://app.example.com
```
## Basic Login Flow
```bash
# Navigate to login page
agent-browser open https://app.example.com/login
agent-browser wait --load networkidle
# Get form elements
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In"
# Fill credentials
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
# Submit
agent-browser click @e3
agent-browser wait --load networkidle
# Verify login succeeded
agent-browser get url # Should be dashboard, not login
```
## Saving Authentication State
After logging in, save state for reuse:
```bash
# Login first (see above)
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
# Save authenticated state
agent-browser state save ./auth-state.json
```
## Restoring Authentication
Skip login by loading saved state:
```bash
# Load saved auth state
agent-browser state load ./auth-state.json
# Navigate directly to protected page
agent-browser open https://app.example.com/dashboard
# Verify authenticated
agent-browser snapshot -i
```
## OAuth / SSO Flows
For OAuth redirects:
```bash
# Start OAuth flow
agent-browser open https://app.example.com/auth/google
# Handle redirects automatically
agent-browser wait --url "**/accounts.google.com**"
agent-browser snapshot -i
# Fill Google credentials
agent-browser fill @e1 "user@gmail.com"
agent-browser click @e2 # Next button
agent-browser wait 2000
agent-browser snapshot -i
agent-browser fill @e3 "password"
agent-browser click @e4 # Sign in
# Wait for redirect back
agent-browser wait --url "**/app.example.com**"
agent-browser state save ./oauth-state.json
```
## Two-Factor Authentication
Handle 2FA with manual intervention:
```bash
# Login with credentials
agent-browser open https://app.example.com/login --headed # Show browser
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
# Wait for user to complete 2FA manually
echo "Complete 2FA in the browser window..."
agent-browser wait --url "**/dashboard" --timeout 120000
# Save state after 2FA
agent-browser state save ./2fa-state.json
```
## HTTP Basic Auth
For sites using HTTP Basic Authentication:
```bash
# Set credentials before navigation
agent-browser set credentials username password
# Navigate to protected resource
agent-browser open https://protected.example.com/api
```
## Cookie-Based Auth
Manually set authentication cookies:
```bash
# Set auth cookie
agent-browser cookies set session_token "abc123xyz"
# Navigate to protected page
agent-browser open https://app.example.com/dashboard
```
## Token Refresh Handling
For sessions with expiring tokens:
```bash
#!/bin/bash
# Wrapper that handles token refresh
STATE_FILE="./auth-state.json"
# Try loading existing state
if [[ -f "$STATE_FILE" ]]; then
agent-browser state load "$STATE_FILE"
agent-browser open https://app.example.com/dashboard
# Check if session is still valid
URL=$(agent-browser get url)
if [[ "$URL" == *"/login"* ]]; then
echo "Session expired, re-authenticating..."
# Perform fresh login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save "$STATE_FILE"
fi
else
# First-time login
agent-browser open https://app.example.com/login
# ... login flow ...
fi
```
## Security Best Practices
1. **Never commit state files** - They contain session tokens
```bash
echo "*.auth-state.json" >> .gitignore
```
2. **Use environment variables for credentials**
```bash
agent-browser fill @e1 "$APP_USERNAME"
agent-browser fill @e2 "$APP_PASSWORD"
```
3. **Clean up after automation**
```bash
agent-browser cookies clear
rm -f ./auth-state.json
```
4. **Use short-lived sessions for CI/CD**
```bash
# Don't persist state in CI
agent-browser open https://app.example.com/login
# ... login and perform actions ...
agent-browser close # Session ends, nothing persisted
```

View File

@@ -0,0 +1,266 @@
# Command Reference
Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
## Navigation
```bash
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
# Supports: https://, http://, file://, about:, data://
# Auto-prepends https:// if no protocol given
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser (aliases: quit, exit)
agent-browser connect 9222 # Connect to browser via CDP port
```
## Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
## Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser click @e1 --new-tab # Click and open in new tab
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key (alias: key)
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown option
agent-browser select @e1 "a" "b" # Select multiple options
agent-browser scroll down 500 # Scroll page (default: down 300px)
agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
```
## Get Information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get cdp-url # Get CDP WebSocket URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
```
## Check State
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
```
## Screenshots and PDF
```bash
agent-browser screenshot # Save to temporary directory
agent-browser screenshot path.png # Save to specific path
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
## Video Recording
```bash
agent-browser record start ./demo.webm # Start recording
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new
```
## Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text (or -t)
agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
agent-browser wait --load networkidle # Wait for network idle (or -l)
agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
```
## Mouse Control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
## Semantic Locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find text "Sign In" click --exact # Exact match only
agent-browser find label "Email" fill "user@test.com"
agent-browser find placeholder "Search" type "query"
agent-browser find alt "Logo" click
agent-browser find title "Close" click
agent-browser find testid "submit-btn" click
agent-browser find first ".item" click
agent-browser find last ".item" click
agent-browser find nth 2 "a" hover
```
## Browser Settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth (alias: auth)
agent-browser set media dark # Emulate color scheme
agent-browser set media light reduced-motion # Light mode + reduced motion
```
## Cookies and Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
```
## Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
## Tabs and Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab by index
agent-browser tab close # Close current tab
agent-browser tab close 2 # Close tab by index
agent-browser window new # New window
```
## Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
## Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
## JavaScript
```bash
agent-browser eval "document.title" # Simple expressions only
agent-browser eval -b "<base64>" # Any JavaScript (base64 encoded)
agent-browser eval --stdin # Read script from stdin
```
Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
```bash
# Base64 encode your script, then:
agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
# Or use stdin with heredoc for multiline scripts:
cat <<'EOF' | agent-browser eval --stdin
const links = document.querySelectorAll('a');
Array.from(links).map(a => a.href);
EOF
```
## State Management
```bash
agent-browser state save auth.json # Save cookies, storage, auth state
agent-browser state load auth.json # Restore saved state
```
## Global Options
```bash
agent-browser --session <name> ... # Isolated browser session
agent-browser --json ... # JSON output for parsing
agent-browser --headed ... # Show browser window (not headless)
agent-browser --full ... # Full page screenshot (-f)
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
agent-browser -p <provider> ... # Cloud browser provider (--provider)
agent-browser --proxy <url> ... # Use proxy server
agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
agent-browser --executable-path <p> # Custom browser executable
agent-browser --extension <path> ... # Load browser extension (repeatable)
agent-browser --ignore-https-errors # Ignore SSL certificate errors
agent-browser --help # Show help (-h)
agent-browser --version # Show version (-V)
agent-browser <command> --help # Show detailed help for a command
```
## Debugging
```bash
agent-browser --headed open example.com # Show browser window
agent-browser --cdp 9222 snapshot # Connect via CDP port
agent-browser connect 9222 # Alternative: connect command
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser inspect # Open Chrome DevTools for this session
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
agent-browser profiler start # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile
```
## Environment Variables
```bash
AGENT_BROWSER_SESSION="mysession" # Default session name
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
```

View File

@@ -0,0 +1,120 @@
# Profiling
Capture Chrome DevTools performance profiles during browser automation for performance analysis.
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Profiling](#basic-profiling)
- [Profiler Commands](#profiler-commands)
- [Categories](#categories)
- [Use Cases](#use-cases)
- [Output Format](#output-format)
- [Viewing Profiles](#viewing-profiles)
- [Limitations](#limitations)
## Basic Profiling
```bash
# Start profiling
agent-browser profiler start
# Perform actions
agent-browser navigate https://example.com
agent-browser click "#button"
agent-browser wait 1000
# Stop and save
agent-browser profiler stop ./trace.json
```
## Profiler Commands
```bash
# Start profiling with default categories
agent-browser profiler start
# Start with custom trace categories
agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing"
# Stop profiling and save to file
agent-browser profiler stop ./trace.json
```
## Categories
The `--categories` flag accepts a comma-separated list of Chrome trace categories. Default categories include:
- `devtools.timeline` -- standard DevTools performance traces
- `v8.execute` -- time spent running JavaScript
- `blink` -- renderer events
- `blink.user_timing` -- `performance.mark()` / `performance.measure()` calls
- `latencyInfo` -- input-to-latency tracking
- `renderer.scheduler` -- task scheduling and execution
- `toplevel` -- broad-spectrum basic events
Several `disabled-by-default-*` categories are also included for detailed timeline, call stack, and V8 CPU profiling data.
## Use Cases
### Diagnosing Slow Page Loads
```bash
agent-browser profiler start
agent-browser navigate https://app.example.com
agent-browser wait --load networkidle
agent-browser profiler stop ./page-load-profile.json
```
### Profiling User Interactions
```bash
agent-browser navigate https://app.example.com
agent-browser profiler start
agent-browser click "#submit"
agent-browser wait 2000
agent-browser profiler stop ./interaction-profile.json
```
### CI Performance Regression Checks
```bash
#!/bin/bash
agent-browser profiler start
agent-browser navigate https://app.example.com
agent-browser wait --load networkidle
agent-browser profiler stop "./profiles/build-${BUILD_ID}.json"
```
## Output Format
The output is a JSON file in Chrome Trace Event format:
```json
{
"traceEvents": [
{ "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100 },
...
],
"metadata": {
"clock-domain": "LINUX_CLOCK_MONOTONIC"
}
}
```
The `metadata.clock-domain` field is set based on the host platform (Linux or macOS). On Windows it is omitted.
## Viewing Profiles
Load the output JSON file in any of these tools:
- **Chrome DevTools**: Performance panel > Load profile (Ctrl+Shift+I > Performance)
- **Perfetto UI**: https://ui.perfetto.dev/ -- drag and drop the JSON file
- **Trace Viewer**: `chrome://tracing` in any Chromium browser
## Limitations
- Only works with Chromium-based browsers (Chrome, Edge). Not supported on Firefox or WebKit.
- Trace data accumulates in memory while profiling is active (capped at 5 million events). Stop profiling promptly after the area of interest.
- Data collection on stop has a 30-second timeout. If the browser is unresponsive, the stop command may fail.

View File

@@ -0,0 +1,194 @@
# Proxy Support
Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments.
**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Proxy Configuration](#basic-proxy-configuration)
- [Authenticated Proxy](#authenticated-proxy)
- [SOCKS Proxy](#socks-proxy)
- [Proxy Bypass](#proxy-bypass)
- [Common Use Cases](#common-use-cases)
- [Verifying Proxy Connection](#verifying-proxy-connection)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
## Basic Proxy Configuration
Use the `--proxy` flag or set proxy via environment variable:
```bash
# Via CLI flag
agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
# Via environment variable
export HTTP_PROXY="http://proxy.example.com:8080"
agent-browser open https://example.com
# HTTPS proxy
export HTTPS_PROXY="https://proxy.example.com:8080"
agent-browser open https://example.com
# Both
export HTTP_PROXY="http://proxy.example.com:8080"
export HTTPS_PROXY="http://proxy.example.com:8080"
agent-browser open https://example.com
```
## Authenticated Proxy
For proxies requiring authentication:
```bash
# Include credentials in URL
export HTTP_PROXY="http://username:password@proxy.example.com:8080"
agent-browser open https://example.com
```
## SOCKS Proxy
```bash
# SOCKS5 proxy
export ALL_PROXY="socks5://proxy.example.com:1080"
agent-browser open https://example.com
# SOCKS5 with auth
export ALL_PROXY="socks5://user:pass@proxy.example.com:1080"
agent-browser open https://example.com
```
## Proxy Bypass
Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`:
```bash
# Via CLI flag
agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com
# Via environment variable
export NO_PROXY="localhost,127.0.0.1,.internal.company.com"
agent-browser open https://internal.company.com # Direct connection
agent-browser open https://external.com # Via proxy
```
## Common Use Cases
### Geo-Location Testing
```bash
#!/bin/bash
# Test site from different regions using geo-located proxies
PROXIES=(
"http://us-proxy.example.com:8080"
"http://eu-proxy.example.com:8080"
"http://asia-proxy.example.com:8080"
)
for proxy in "${PROXIES[@]}"; do
export HTTP_PROXY="$proxy"
export HTTPS_PROXY="$proxy"
region=$(echo "$proxy" | grep -oP '^\w+-\w+')
echo "Testing from: $region"
agent-browser --session "$region" open https://example.com
agent-browser --session "$region" screenshot "./screenshots/$region.png"
agent-browser --session "$region" close
done
```
### Rotating Proxies for Scraping
```bash
#!/bin/bash
# Rotate through proxy list to avoid rate limiting
PROXY_LIST=(
"http://proxy1.example.com:8080"
"http://proxy2.example.com:8080"
"http://proxy3.example.com:8080"
)
URLS=(
"https://site.com/page1"
"https://site.com/page2"
"https://site.com/page3"
)
for i in "${!URLS[@]}"; do
proxy_index=$((i % ${#PROXY_LIST[@]}))
export HTTP_PROXY="${PROXY_LIST[$proxy_index]}"
export HTTPS_PROXY="${PROXY_LIST[$proxy_index]}"
agent-browser open "${URLS[$i]}"
agent-browser get text body > "output-$i.txt"
agent-browser close
sleep 1 # Polite delay
done
```
### Corporate Network Access
```bash
#!/bin/bash
# Access internal sites via corporate proxy
export HTTP_PROXY="http://corpproxy.company.com:8080"
export HTTPS_PROXY="http://corpproxy.company.com:8080"
export NO_PROXY="localhost,127.0.0.1,.company.com"
# External sites go through proxy
agent-browser open https://external-vendor.com
# Internal sites bypass proxy
agent-browser open https://intranet.company.com
```
## Verifying Proxy Connection
```bash
# Check your apparent IP
agent-browser open https://httpbin.org/ip
agent-browser get text body
# Should show proxy's IP, not your real IP
```
## Troubleshooting
### Proxy Connection Failed
```bash
# Test proxy connectivity first
curl -x http://proxy.example.com:8080 https://httpbin.org/ip
# Check if proxy requires auth
export HTTP_PROXY="http://user:pass@proxy.example.com:8080"
```
### SSL/TLS Errors Through Proxy
Some proxies perform SSL inspection. If you encounter certificate errors:
```bash
# For testing only - not recommended for production
agent-browser open https://example.com --ignore-https-errors
```
### Slow Performance
```bash
# Use proxy only when necessary
export NO_PROXY="*.cdn.com,*.static.com" # Direct CDN access
```
## Best Practices
1. **Use environment variables** - Don't hardcode proxy credentials
2. **Set NO_PROXY appropriately** - Avoid routing local traffic through proxy
3. **Test proxy before automation** - Verify connectivity with simple requests
4. **Handle proxy failures gracefully** - Implement retry logic for unstable proxies
5. **Rotate proxies for large scraping jobs** - Distribute load and avoid bans

View File

@@ -0,0 +1,193 @@
# Session Management
Multiple isolated browser sessions with state persistence and concurrent browsing.
**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Named Sessions](#named-sessions)
- [Session Isolation Properties](#session-isolation-properties)
- [Session State Persistence](#session-state-persistence)
- [Common Patterns](#common-patterns)
- [Default Session](#default-session)
- [Session Cleanup](#session-cleanup)
- [Best Practices](#best-practices)
## Named Sessions
Use `--session` flag to isolate browser contexts:
```bash
# Session 1: Authentication flow
agent-browser --session auth open https://app.example.com/login
# Session 2: Public browsing (separate cookies, storage)
agent-browser --session public open https://example.com
# Commands are isolated by session
agent-browser --session auth fill @e1 "user@example.com"
agent-browser --session public get text body
```
## Session Isolation Properties
Each session has independent:
- Cookies
- LocalStorage / SessionStorage
- IndexedDB
- Cache
- Browsing history
- Open tabs
## Session State Persistence
### Save Session State
```bash
# Save cookies, storage, and auth state
agent-browser state save /path/to/auth-state.json
```
### Load Session State
```bash
# Restore saved state
agent-browser state load /path/to/auth-state.json
# Continue with authenticated session
agent-browser open https://app.example.com/dashboard
```
### State File Contents
```json
{
"cookies": [...],
"localStorage": {...},
"sessionStorage": {...},
"origins": [...]
}
```
## Common Patterns
### Authenticated Session Reuse
```bash
#!/bin/bash
# Save login state once, reuse many times
STATE_FILE="/tmp/auth-state.json"
# Check if we have saved state
if [[ -f "$STATE_FILE" ]]; then
agent-browser state load "$STATE_FILE"
agent-browser open https://app.example.com/dashboard
else
# Perform login
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --load networkidle
# Save for future use
agent-browser state save "$STATE_FILE"
fi
```
### Concurrent Scraping
```bash
#!/bin/bash
# Scrape multiple sites concurrently
# Start all sessions
agent-browser --session site1 open https://site1.com &
agent-browser --session site2 open https://site2.com &
agent-browser --session site3 open https://site3.com &
wait
# Extract from each
agent-browser --session site1 get text body > site1.txt
agent-browser --session site2 get text body > site2.txt
agent-browser --session site3 get text body > site3.txt
# Cleanup
agent-browser --session site1 close
agent-browser --session site2 close
agent-browser --session site3 close
```
### A/B Testing Sessions
```bash
# Test different user experiences
agent-browser --session variant-a open "https://app.com?variant=a"
agent-browser --session variant-b open "https://app.com?variant=b"
# Compare
agent-browser --session variant-a screenshot /tmp/variant-a.png
agent-browser --session variant-b screenshot /tmp/variant-b.png
```
## Default Session
When `--session` is omitted, commands use the default session:
```bash
# These use the same default session
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser close # Closes default session
```
## Session Cleanup
```bash
# Close specific session
agent-browser --session auth close
# List active sessions
agent-browser session list
```
## Best Practices
### 1. Name Sessions Semantically
```bash
# GOOD: Clear purpose
agent-browser --session github-auth open https://github.com
agent-browser --session docs-scrape open https://docs.example.com
# AVOID: Generic names
agent-browser --session s1 open https://github.com
```
### 2. Always Clean Up
```bash
# Close sessions when done
agent-browser --session auth close
agent-browser --session scrape close
```
### 3. Handle State Files Securely
```bash
# Don't commit state files (contain auth tokens!)
echo "*.auth-state.json" >> .gitignore
# Delete after use
rm /tmp/auth-state.json
```
### 4. Timeout Long Sessions
```bash
# Set timeout for automated scripts
timeout 60 agent-browser --session long-task get text body
```

View File

@@ -0,0 +1,194 @@
# Snapshot and Refs
Compact element references that reduce context usage dramatically for AI agents.
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [How Refs Work](#how-refs-work)
- [Snapshot Command](#the-snapshot-command)
- [Using Refs](#using-refs)
- [Ref Lifecycle](#ref-lifecycle)
- [Best Practices](#best-practices)
- [Ref Notation Details](#ref-notation-details)
- [Troubleshooting](#troubleshooting)
## How Refs Work
Traditional approach:
```
Full DOM/HTML -> AI parses -> CSS selector -> Action (~3000-5000 tokens)
```
agent-browser approach:
```
Compact snapshot -> @refs assigned -> Direct interaction (~200-400 tokens)
```
## The Snapshot Command
```bash
# Basic snapshot (shows page structure)
agent-browser snapshot
# Interactive snapshot (-i flag) - RECOMMENDED
agent-browser snapshot -i
```
### Snapshot Output Format
```
Page: Example Site - Home
URL: https://example.com
@e1 [header]
@e2 [nav]
@e3 [a] "Home"
@e4 [a] "Products"
@e5 [a] "About"
@e6 [button] "Sign In"
@e7 [main]
@e8 [h1] "Welcome"
@e9 [form]
@e10 [input type="email"] placeholder="Email"
@e11 [input type="password"] placeholder="Password"
@e12 [button type="submit"] "Log In"
@e13 [footer]
@e14 [a] "Privacy Policy"
```
## Using Refs
Once you have refs, interact directly:
```bash
# Click the "Sign In" button
agent-browser click @e6
# Fill email input
agent-browser fill @e10 "user@example.com"
# Fill password
agent-browser fill @e11 "password123"
# Submit the form
agent-browser click @e12
```
## Ref Lifecycle
**IMPORTANT**: Refs are invalidated when the page changes!
```bash
# Get initial snapshot
agent-browser snapshot -i
# @e1 [button] "Next"
# Click triggers page change
agent-browser click @e1
# MUST re-snapshot to get new refs!
agent-browser snapshot -i
# @e1 [h1] "Page 2" <- Different element now!
```
## Best Practices
### 1. Always Snapshot Before Interacting
```bash
# CORRECT
agent-browser open https://example.com
agent-browser snapshot -i # Get refs first
agent-browser click @e1 # Use ref
# WRONG
agent-browser open https://example.com
agent-browser click @e1 # Ref doesn't exist yet!
```
### 2. Re-Snapshot After Navigation
```bash
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # Get new refs
agent-browser click @e1 # Use new refs
```
### 3. Re-Snapshot After Dynamic Changes
```bash
agent-browser click @e1 # Opens dropdown
agent-browser snapshot -i # See dropdown items
agent-browser click @e7 # Select item
```
### 4. Snapshot Specific Regions
For complex pages, snapshot specific areas:
```bash
# Snapshot just the form
agent-browser snapshot @e9
```
## Ref Notation Details
```
@e1 [tag type="value"] "text content" placeholder="hint"
| | | | |
| | | | +- Additional attributes
| | | +- Visible text
| | +- Key attributes shown
| +- HTML tag name
+- Unique ref ID
```
### Common Patterns
```
@e1 [button] "Submit" # Button with text
@e2 [input type="email"] # Email input
@e3 [input type="password"] # Password input
@e4 [a href="/page"] "Link Text" # Anchor link
@e5 [select] # Dropdown
@e6 [textarea] placeholder="Message" # Text area
@e7 [div class="modal"] # Container (when relevant)
@e8 [img alt="Logo"] # Image
@e9 [checkbox] checked # Checked checkbox
@e10 [radio] selected # Selected radio
```
## Troubleshooting
### "Ref not found" Error
```bash
# Ref may have changed - re-snapshot
agent-browser snapshot -i
```
### Element Not Visible in Snapshot
```bash
# Scroll down to reveal element
agent-browser scroll down 1000
agent-browser snapshot -i
# Or wait for dynamic content
agent-browser wait 1000
agent-browser snapshot -i
```
### Too Many Elements
```bash
# Snapshot specific container
agent-browser snapshot @e5
# Or use get text for content-only extraction
agent-browser get text @e5
```

View File

@@ -0,0 +1,173 @@
# Video Recording
Capture browser automation as video for debugging, documentation, or verification.
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Recording](#basic-recording)
- [Recording Commands](#recording-commands)
- [Use Cases](#use-cases)
- [Best Practices](#best-practices)
- [Output Format](#output-format)
- [Limitations](#limitations)
## Basic Recording
```bash
# Start recording
agent-browser record start ./demo.webm
# Perform actions
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser click @e1
agent-browser fill @e2 "test input"
# Stop and save
agent-browser record stop
```
## Recording Commands
```bash
# Start recording to file
agent-browser record start ./output.webm
# Stop current recording
agent-browser record stop
# Restart with new file (stops current + starts new)
agent-browser record restart ./take2.webm
```
## Use Cases
### Debugging Failed Automation
```bash
#!/bin/bash
# Record automation for debugging
agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
# Run your automation
agent-browser open https://app.example.com
agent-browser snapshot -i
agent-browser click @e1 || {
echo "Click failed - check recording"
agent-browser record stop
exit 1
}
agent-browser record stop
```
### Documentation Generation
```bash
#!/bin/bash
# Record workflow for documentation
agent-browser record start ./docs/how-to-login.webm
agent-browser open https://app.example.com/login
agent-browser wait 1000 # Pause for visibility
agent-browser snapshot -i
agent-browser fill @e1 "demo@example.com"
agent-browser wait 500
agent-browser fill @e2 "password"
agent-browser wait 500
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser wait 1000 # Show result
agent-browser record stop
```
### CI/CD Test Evidence
```bash
#!/bin/bash
# Record E2E test runs for CI artifacts
TEST_NAME="${1:-e2e-test}"
RECORDING_DIR="./test-recordings"
mkdir -p "$RECORDING_DIR"
agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
# Run test
if run_e2e_test; then
echo "Test passed"
else
echo "Test failed - recording saved"
fi
agent-browser record stop
```
## Best Practices
### 1. Add Pauses for Clarity
```bash
# Slow down for human viewing
agent-browser click @e1
agent-browser wait 500 # Let viewer see result
```
### 2. Use Descriptive Filenames
```bash
# Include context in filename
agent-browser record start ./recordings/login-flow-2024-01-15.webm
agent-browser record start ./recordings/checkout-test-run-42.webm
```
### 3. Handle Recording in Error Cases
```bash
#!/bin/bash
set -e
cleanup() {
agent-browser record stop 2>/dev/null || true
agent-browser close 2>/dev/null || true
}
trap cleanup EXIT
agent-browser record start ./automation.webm
# ... automation steps ...
```
### 4. Combine with Screenshots
```bash
# Record video AND capture key frames
agent-browser record start ./flow.webm
agent-browser open https://example.com
agent-browser screenshot ./screenshots/step1-homepage.png
agent-browser click @e1
agent-browser screenshot ./screenshots/step2-after-click.png
agent-browser record stop
```
## Output Format
- Default format: WebM (VP8/VP9 codec)
- Compatible with all modern browsers and video players
- Compressed but high quality
## Limitations
- Recording adds slight overhead to automation
- Large recordings can consume significant disk space
- Some headless environments may have codec limitations

View File

@@ -0,0 +1,105 @@
#!/bin/bash
# Template: Authenticated Session Workflow
# Purpose: Login once, save state, reuse for subsequent runs
# Usage: ./authenticated-session.sh <login-url> [state-file]
#
# RECOMMENDED: Use the auth vault instead of this template:
# echo "<pass>" | agent-browser auth save myapp --url <login-url> --username <user> --password-stdin
# agent-browser auth login myapp
# The auth vault stores credentials securely and the LLM never sees passwords.
#
# Environment variables:
# APP_USERNAME - Login username/email
# APP_PASSWORD - Login password
#
# Two modes:
# 1. Discovery mode (default): Shows form structure so you can identify refs
# 2. Login mode: Performs actual login after you update the refs
#
# Setup steps:
# 1. Run once to see form structure (discovery mode)
# 2. Update refs in LOGIN FLOW section below
# 3. Set APP_USERNAME and APP_PASSWORD
# 4. Delete the DISCOVERY section
set -euo pipefail
LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}"
STATE_FILE="${2:-./auth-state.json}"
echo "Authentication workflow: $LOGIN_URL"
# ================================================================
# SAVED STATE: Skip login if valid saved state exists
# ================================================================
if [[ -f "$STATE_FILE" ]]; then
echo "Loading saved state from $STATE_FILE..."
if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then
agent-browser wait --load networkidle
CURRENT_URL=$(agent-browser get url)
if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
echo "Session restored successfully"
agent-browser snapshot -i
exit 0
fi
echo "Session expired, performing fresh login..."
agent-browser close 2>/dev/null || true
else
echo "Failed to load state, re-authenticating..."
fi
rm -f "$STATE_FILE"
fi
# ================================================================
# DISCOVERY MODE: Shows form structure (delete after setup)
# ================================================================
echo "Opening login page..."
agent-browser open "$LOGIN_URL"
agent-browser wait --load networkidle
echo ""
echo "Login form structure:"
echo "---"
agent-browser snapshot -i
echo "---"
echo ""
echo "Next steps:"
echo " 1. Note the refs: username=@e?, password=@e?, submit=@e?"
echo " 2. Update the LOGIN FLOW section below with your refs"
echo " 3. Set: export APP_USERNAME='...' APP_PASSWORD='...'"
echo " 4. Delete this DISCOVERY MODE section"
echo ""
agent-browser close
exit 0
# ================================================================
# LOGIN FLOW: Uncomment and customize after discovery
# ================================================================
# : "${APP_USERNAME:?Set APP_USERNAME environment variable}"
# : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}"
#
# agent-browser open "$LOGIN_URL"
# agent-browser wait --load networkidle
# agent-browser snapshot -i
#
# # Fill credentials (update refs to match your form)
# agent-browser fill @e1 "$APP_USERNAME"
# agent-browser fill @e2 "$APP_PASSWORD"
# agent-browser click @e3
# agent-browser wait --load networkidle
#
# # Verify login succeeded
# FINAL_URL=$(agent-browser get url)
# if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then
# echo "Login failed - still on login page"
# agent-browser screenshot /tmp/login-failed.png
# agent-browser close
# exit 1
# fi
#
# # Save state for future runs
# echo "Saving state to $STATE_FILE"
# agent-browser state save "$STATE_FILE"
# echo "Login successful"
# agent-browser snapshot -i

View File

@@ -0,0 +1,69 @@
#!/bin/bash
# Template: Content Capture Workflow
# Purpose: Extract content from web pages (text, screenshots, PDF)
# Usage: ./capture-workflow.sh <url> [output-dir]
#
# Outputs:
# - page-full.png: Full page screenshot
# - page-structure.txt: Page element structure with refs
# - page-text.txt: All text content
# - page.pdf: PDF version
#
# Optional: Load auth state for protected pages
set -euo pipefail
TARGET_URL="${1:?Usage: $0 <url> [output-dir]}"
OUTPUT_DIR="${2:-.}"
echo "Capturing: $TARGET_URL"
mkdir -p "$OUTPUT_DIR"
# Optional: Load authentication state
# if [[ -f "./auth-state.json" ]]; then
# echo "Loading authentication state..."
# agent-browser state load "./auth-state.json"
# fi
# Navigate to target
agent-browser open "$TARGET_URL"
agent-browser wait --load networkidle
# Get metadata
TITLE=$(agent-browser get title)
URL=$(agent-browser get url)
echo "Title: $TITLE"
echo "URL: $URL"
# Capture full page screenshot
agent-browser screenshot --full "$OUTPUT_DIR/page-full.png"
echo "Saved: $OUTPUT_DIR/page-full.png"
# Get page structure with refs
agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt"
echo "Saved: $OUTPUT_DIR/page-structure.txt"
# Extract all text content
agent-browser get text body > "$OUTPUT_DIR/page-text.txt"
echo "Saved: $OUTPUT_DIR/page-text.txt"
# Save as PDF
agent-browser pdf "$OUTPUT_DIR/page.pdf"
echo "Saved: $OUTPUT_DIR/page.pdf"
# Optional: Extract specific elements using refs from structure
# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt"
# Optional: Handle infinite scroll pages
# for i in {1..5}; do
# agent-browser scroll down 1000
# agent-browser wait 1000
# done
# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"
# Cleanup
agent-browser close
echo ""
echo "Capture complete:"
ls -la "$OUTPUT_DIR"

View File

@@ -0,0 +1,62 @@
#!/bin/bash
# Template: Form Automation Workflow
# Purpose: Fill and submit web forms with validation
# Usage: ./form-automation.sh <form-url>
#
# This template demonstrates the snapshot-interact-verify pattern:
# 1. Navigate to form
# 2. Snapshot to get element refs
# 3. Fill fields using refs
# 4. Submit and verify result
#
# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output
set -euo pipefail
FORM_URL="${1:?Usage: $0 <form-url>}"
echo "Form automation: $FORM_URL"
# Step 1: Navigate to form
agent-browser open "$FORM_URL"
agent-browser wait --load networkidle
# Step 2: Snapshot to discover form elements
echo ""
echo "Form structure:"
agent-browser snapshot -i
# Step 3: Fill form fields (customize these refs based on snapshot output)
#
# Common field types:
# agent-browser fill @e1 "John Doe" # Text input
# agent-browser fill @e2 "user@example.com" # Email input
# agent-browser fill @e3 "SecureP@ss123" # Password input
# agent-browser select @e4 "Option Value" # Dropdown
# agent-browser check @e5 # Checkbox
# agent-browser click @e6 # Radio button
# agent-browser fill @e7 "Multi-line text" # Textarea
# agent-browser upload @e8 /path/to/file.pdf # File upload
#
# Uncomment and modify:
# agent-browser fill @e1 "Test User"
# agent-browser fill @e2 "test@example.com"
# agent-browser click @e3 # Submit button
# Step 4: Wait for submission
# agent-browser wait --load networkidle
# agent-browser wait --url "**/success" # Or wait for redirect
# Step 5: Verify result
echo ""
echo "Result:"
agent-browser get url
agent-browser snapshot -i
# Optional: Capture evidence
agent-browser screenshot /tmp/form-result.png
echo "Screenshot saved: /tmp/form-result.png"
# Cleanup
agent-browser close
echo "Done"

View File

@@ -0,0 +1,278 @@
---
name: agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
disable-model-invocation: true
---
# Agent-Native Architecture Audit
Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.
## Core Principles to Audit
1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"
## Workflow
### Step 1: Load the Agent-Native Skill
First, invoke the agent-native-architecture skill to understand all principles:
```
/compound-engineering:agent-native-architecture
```
Select option 7 (action parity) to load the full reference material.
### Step 2: Launch Parallel Sub-Agents
Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:
1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations
<sub-agents>
**Agent 1: Action Parity**
```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."
Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
- Search for API service files, fetch calls, form handlers
- Check routes and components for user interactions
2. Check which have corresponding agent tools
- Search for agent tool definitions
- Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"
Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```
**Agent 2: Tools as Primitives**
```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."
Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
- PRIMITIVE (good): read, write, store, list - enables capability without business logic
- WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"
Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```
**Agent 3: Context Injection**
```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"
Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
- Available resources (files, drafts, documents)
- User preferences/settings
- Recent activity
- Available capabilities listed
- Session history
- Workspace state
Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```
**Agent 4: Shared Workspace**
```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"
Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)
Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```
**Agent 5: CRUD Completeness**
```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"
Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
- Create
- Read
- Update
- Delete
3. Score per entity and overall
Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```
**Agent 6: UI Integration**
```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"
Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
- Streaming updates (SSE, WebSocket)
- Polling mechanisms
- Shared state/services
- Event buses
- File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)
Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```
**Agent 7: Capability Discovery**
```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"
Tasks:
1. Check for these 7 discovery mechanisms:
- Onboarding flow showing agent capabilities
- Help documentation
- Capability hints in UI
- Agent self-describes in responses
- Suggested prompts/actions
- Empty state guidance
- Slash commands (/help, /tools)
2. Score against 7 mechanisms
Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```
**Agent 8: Prompt-Native Features**
```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"
Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
- PROMPT (good): outcomes defined in natural language
- CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change
Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```
</sub-agents>
### Step 3: Compile Summary Report
After all agents complete, compile a summary with:
```markdown
## Agent-Native Architecture Review: [Project Name]
### Overall Score Summary
| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |
**Overall Agent-Native Score: X%**
### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)
### Top 10 Recommendations by Impact
| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|
### What's Working Excellently
[List top 5 strengths]
```
## Success Criteria
- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps
## Optional: Single Principle Audit
If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.
Valid arguments:
- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`

View File

@@ -131,7 +131,7 @@ topic: <kebab-case-topic>
- [Any unresolved questions for the planning phase]
## Next Steps
`/workflows:plan` for implementation details
`/ce:plan` for implementation details
```
**Output Location:** `docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md`
@@ -140,7 +140,7 @@ topic: <kebab-case-topic>
Present clear options for what to do next:
1. **Proceed to planning** → Run `/workflows:plan`
1. **Proceed to planning** → Run `/ce:plan`
2. **Refine further** → Continue exploring the design
3. **Done for now** → User will return later
@@ -187,4 +187,4 @@ Planning answers **HOW** to build it:
- Technical details and code patterns
- Testing strategy and verification
When brainstorm output exists, `/workflows:plan` should detect it and use it as input, skipping its own idea refinement phase.
When brainstorm output exists, `/ce:plan` should detect it and use it as input, skipping its own idea refinement phase.

View File

@@ -0,0 +1,145 @@
---
name: ce:brainstorm
description: Explore requirements and approaches through collaborative dialogue before planning implementation
argument-hint: "[feature idea or problem to explore]"
---
# Brainstorm a Feature or Improvement
**Note: The current year is 2026.** Use this when dating brainstorm documents.
Brainstorming helps answer **WHAT** to build through collaborative dialogue. It precedes `/ce:plan`, which answers **HOW** to build it.
**Process knowledge:** Load the `brainstorming` skill for detailed question techniques, approach exploration patterns, and YAGNI principles.
## Feature Description
<feature_description> #$ARGUMENTS </feature_description>
**If the feature description above is empty, ask the user:** "What would you like to explore? Please describe the feature, problem, or improvement you're thinking about."
Do not proceed until you have a feature description from the user.
## Execution Flow
### Phase 0: Assess Requirements Clarity
Evaluate whether brainstorming is needed based on the feature description.
**Clear requirements indicators:**
- Specific acceptance criteria provided
- Referenced existing patterns to follow
- Described exact expected behavior
- Constrained, well-defined scope
**If requirements are already clear:**
Use **AskUserQuestion tool** to suggest: "Your requirements seem detailed enough to proceed directly to planning. Should I run `/ce:plan` instead, or would you like to explore the idea further?"
### Phase 1: Understand the Idea
#### 1.1 Repository Research (Lightweight)
Run a quick repo scan to understand existing patterns:
- Task compound-engineering:research:repo-research-analyst("Understand existing patterns related to: <feature_description>")
Focus on: similar features, established patterns, CLAUDE.md guidance.
#### 1.2 Collaborative Dialogue
Use the **AskUserQuestion tool** to ask questions **one at a time**.
**Guidelines (see `brainstorming` skill for detailed techniques):**
- Prefer multiple choice when natural options exist
- Start broad (purpose, users) then narrow (constraints, edge cases)
- Validate assumptions explicitly
- Ask about success criteria
**Exit condition:** Continue until the idea is clear OR user says "proceed"
### Phase 2: Explore Approaches
Propose **2-3 concrete approaches** based on research and conversation.
For each approach, provide:
- Brief description (2-3 sentences)
- Pros and cons
- When it's best suited
Lead with your recommendation and explain why. Apply YAGNI—prefer simpler solutions.
Use **AskUserQuestion tool** to ask which approach the user prefers.
### Phase 3: Capture the Design
Write a brainstorm document to `docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md`.
**Document structure:** See the `brainstorming` skill for the template format. Key sections: What We're Building, Why This Approach, Key Decisions, Open Questions.
Ensure `docs/brainstorms/` directory exists before writing.
**IMPORTANT:** Before proceeding to Phase 4, check if there are any Open Questions listed in the brainstorm document. If there are open questions, YOU MUST ask the user about each one using AskUserQuestion before offering to proceed to planning. Move resolved questions to a "Resolved Questions" section.
### Phase 4: Handoff
Use **AskUserQuestion tool** to present next steps:
**Question:** "Brainstorm captured. What would you like to do next?"
**Options:**
1. **Review and refine** - Improve the document through structured self-review
2. **Proceed to planning** - Run `/ce:plan` (will auto-detect this brainstorm)
3. **Share to Proof** - Upload to Proof for collaborative review and sharing
4. **Ask more questions** - I have more questions to clarify before moving on
5. **Done for now** - Return later
**If user selects "Share to Proof":**
```bash
CONTENT=$(cat docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md)
TITLE="Brainstorm: <topic title>"
RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
-H "Content-Type: application/json" \
-d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
```
Display the URL prominently: `View & collaborate in Proof: <PROOF_URL>`
If the curl fails, skip silently. Then return to the Phase 4 options.
**If user selects "Ask more questions":** YOU (Claude) return to Phase 1.2 (Collaborative Dialogue) and continue asking the USER questions one at a time to further refine the design. The user wants YOU to probe deeper - ask about edge cases, constraints, preferences, or areas not yet explored. Continue until the user is satisfied, then return to Phase 4.
**If user selects "Review and refine":**
Load the `document-review` skill and apply it to the brainstorm document.
When document-review returns "Review complete", present next steps:
1. **Move to planning** - Continue to `/ce:plan` with this document
2. **Done for now** - Brainstorming complete. To start planning later: `/ce:plan [document-path]`
## Output Summary
When complete, display:
```
Brainstorm complete!
Document: docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md
Key decisions:
- [Decision 1]
- [Decision 2]
Next: Run `/ce:plan` when ready to implement.
```
## Important Guidelines
- **Stay focused on WHAT, not HOW** - Implementation details belong in the plan
- **Ask one question at a time** - Don't overwhelm
- **Apply YAGNI** - Prefer simpler approaches
- **Keep outputs concise** - 200-300 words per section max
NEVER CODE! Just explore and document decisions.

View File

@@ -0,0 +1,286 @@
---
name: ce:compound
description: Document a recently solved problem to compound your team's knowledge
argument-hint: "[optional: brief context about the fix]"
---
# /compound
Coordinate multiple subagents working in parallel to document a recently solved problem.
## Purpose
Captures problem solutions while context is fresh, creating structured documentation in `docs/solutions/` with YAML frontmatter for searchability and future reference. Uses parallel subagents for maximum efficiency.
**Why "compound"?** Each documented solution compounds your team's knowledge. The first time you solve a problem takes research. Document it, and the next occurrence takes minutes. Knowledge compounds.
## Usage
```bash
/ce:compound # Document the most recent fix
/ce:compound [brief context] # Provide additional context hint
```
## Execution Strategy
**Always run full mode by default.** Proceed directly to Phase 1 unless the user explicitly requests compact-safe mode (e.g., `/ce:compound --compact` or "use compact mode").
Compact-safe mode exists as a lightweight alternative — see the **Compact-Safe Mode** section below. It's there if the user wants it, not something to push.
---
### Full Mode
<critical_requirement>
**Only ONE file gets written - the final documentation.**
Phase 1 subagents return TEXT DATA to the orchestrator. They must NOT use Write, Edit, or create any files. Only the orchestrator (Phase 2) writes the final documentation file.
</critical_requirement>
### Phase 1: Parallel Research
<parallel_tasks>
Launch these subagents IN PARALLEL. Each returns text data to the orchestrator.
#### 1. **Context Analyzer**
- Extracts conversation history
- Identifies problem type, component, symptoms
- Validates against schema
- Returns: YAML frontmatter skeleton
#### 2. **Solution Extractor**
- Analyzes all investigation steps
- Identifies root cause
- Extracts working solution with code examples
- Returns: Solution content block
#### 3. **Related Docs Finder**
- Searches `docs/solutions/` for related documentation
- Identifies cross-references and links
- Finds related GitHub issues
- Returns: Links and relationships
#### 4. **Prevention Strategist**
- Develops prevention strategies
- Creates best practices guidance
- Generates test cases if applicable
- Returns: Prevention/testing content
#### 5. **Category Classifier**
- Determines optimal `docs/solutions/` category
- Validates category against schema
- Suggests filename based on slug
- Returns: Final path and filename
</parallel_tasks>
### Phase 2: Assembly & Write
<sequential_tasks>
**WAIT for all Phase 1 subagents to complete before proceeding.**
The orchestrating agent (main conversation) performs these steps:
1. Collect all text results from Phase 1 subagents
2. Assemble complete markdown file from the collected pieces
3. Validate YAML frontmatter against schema
4. Create directory if needed: `mkdir -p docs/solutions/[category]/`
5. Write the SINGLE final file: `docs/solutions/[category]/[filename].md`
</sequential_tasks>
### Phase 3: Optional Enhancement
**WAIT for Phase 2 to complete before proceeding.**
<parallel_tasks>
Based on problem type, optionally invoke specialized agents to review the documentation:
- **performance_issue** → `performance-oracle`
- **security_issue** → `security-sentinel`
- **database_issue** → `data-integrity-guardian`
- **test_failure** → `cora-test-reviewer`
- Any code-heavy issue → `kieran-rails-reviewer` + `code-simplicity-reviewer`
</parallel_tasks>
---
### Compact-Safe Mode
<critical_requirement>
**Single-pass alternative for context-constrained sessions.**
When context budget is tight, this mode skips parallel subagents entirely. The orchestrator performs all work in a single pass, producing a minimal but complete solution document.
</critical_requirement>
The orchestrator (main conversation) performs ALL of the following in one sequential pass:
1. **Extract from conversation**: Identify the problem, root cause, and solution from conversation history
2. **Classify**: Determine category and filename (same categories as full mode)
3. **Write minimal doc**: Create `docs/solutions/[category]/[filename].md` with:
- YAML frontmatter (title, category, date, tags)
- Problem description (1-2 sentences)
- Root cause (1-2 sentences)
- Solution with key code snippets
- One prevention tip
4. **Skip specialized agent reviews** (Phase 3) to conserve context
**Compact-safe output:**
```
✓ Documentation complete (compact-safe mode)
File created:
- docs/solutions/[category]/[filename].md
Note: This was created in compact-safe mode. For richer documentation
(cross-references, detailed prevention strategies, specialized reviews),
re-run /compound in a fresh session.
```
**No subagents are launched. No parallel tasks. One file written.**
---
## What It Captures
- **Problem symptom**: Exact error messages, observable behavior
- **Investigation steps tried**: What didn't work and why
- **Root cause analysis**: Technical explanation
- **Working solution**: Step-by-step fix with code examples
- **Prevention strategies**: How to avoid in future
- **Cross-references**: Links to related issues and docs
## Preconditions
<preconditions enforcement="advisory">
<check condition="problem_solved">
Problem has been solved (not in-progress)
</check>
<check condition="solution_verified">
Solution has been verified working
</check>
<check condition="non_trivial">
Non-trivial problem (not simple typo or obvious error)
</check>
</preconditions>
## What It Creates
**Organized documentation:**
- File: `docs/solutions/[category]/[filename].md`
**Categories auto-detected from problem:**
- build-errors/
- test-failures/
- runtime-errors/
- performance-issues/
- database-issues/
- security-issues/
- ui-bugs/
- integration-issues/
- logic-errors/
## Common Mistakes to Avoid
| ❌ Wrong | ✅ Correct |
|----------|-----------|
| Subagents write files like `context-analysis.md`, `solution-draft.md` | Subagents return text data; orchestrator writes one final file |
| Research and assembly run in parallel | Research completes → then assembly runs |
| Multiple files created during workflow | Single file: `docs/solutions/[category]/[filename].md` |
## Success Output
```
✓ Documentation complete
Subagent Results:
✓ Context Analyzer: Identified performance_issue in brief_system
✓ Solution Extractor: 3 code fixes
✓ Related Docs Finder: 2 related issues
✓ Prevention Strategist: Prevention strategies, test suggestions
✓ Category Classifier: `performance-issues`
Specialized Agent Reviews (Auto-Triggered):
✓ performance-oracle: Validated query optimization approach
✓ kieran-rails-reviewer: Code examples meet Rails standards
✓ code-simplicity-reviewer: Solution is appropriately minimal
✓ every-style-editor: Documentation style verified
File created:
- docs/solutions/performance-issues/n-plus-one-brief-generation.md
This documentation will be searchable for future reference when similar
issues occur in the Email Processing or Brief System modules.
What's next?
1. Continue workflow (recommended)
2. Link related documentation
3. Update other references
4. View documentation
5. Other
```
## The Compounding Philosophy
This creates a compounding knowledge system:
1. First time you solve "N+1 query in brief generation" → Research (30 min)
2. Document the solution → docs/solutions/performance-issues/n-plus-one-briefs.md (5 min)
3. Next time similar issue occurs → Quick lookup (2 min)
4. Knowledge compounds → Team gets smarter
The feedback loop:
```
Build → Test → Find Issue → Research → Improve → Document → Validate → Deploy
↑ ↓
└──────────────────────────────────────────────────────────────────────┘
```
**Each unit of engineering work should make subsequent units of work easier—not harder.**
## Auto-Invoke
<auto_invoke> <trigger_phrases> - "that worked" - "it's fixed" - "working now" - "problem solved" </trigger_phrases>
<manual_override> Use /ce:compound [context] to document immediately without waiting for auto-detection. </manual_override> </auto_invoke>
## Routes To
`compound-docs` skill
## Applicable Specialized Agents
Based on problem type, these agents can enhance documentation:
### Code Quality & Review
- **kieran-rails-reviewer**: Reviews code examples for Rails best practices
- **code-simplicity-reviewer**: Ensures solution code is minimal and clear
- **pattern-recognition-specialist**: Identifies anti-patterns or repeating issues
### Specific Domain Experts
- **performance-oracle**: Analyzes performance_issue category solutions
- **security-sentinel**: Reviews security_issue solutions for vulnerabilities
- **cora-test-reviewer**: Creates test cases for prevention strategies
- **data-integrity-guardian**: Reviews database_issue migrations and queries
### Enhancement & Documentation
- **best-practices-researcher**: Enriches solution with industry best practices
- **every-style-editor**: Reviews documentation style and clarity
- **framework-docs-researcher**: Links to Rails/gem documentation references
### When to Invoke
- **Auto-triggered** (optional): Agents can run post-documentation for enhancement
- **Manual trigger**: User can invoke agents after /ce:compound completes for deeper review
- **Customize agents**: Edit `compound-engineering.local.md` or invoke the `setup` skill to configure which review agents are used across all workflows
## Related Commands
- `/research [topic]` - Deep investigation (searches docs/solutions/ for patterns)
- `/ce:plan` - Planning workflow (references documented solutions)

View File

@@ -0,0 +1,641 @@
---
name: ce:plan
description: Transform feature descriptions into well-structured project plans following conventions
argument-hint: "[feature description, bug report, or improvement idea]"
---
# Create a plan for a new feature or bug fix
## Introduction
**Note: The current year is 2026.** Use this when dating plans and searching for recent documentation.
Transform feature descriptions, bug reports, or improvement ideas into well-structured markdown files issues that follow project conventions and best practices. This command provides flexible detail levels to match your needs.
## Feature Description
<feature_description> #$ARGUMENTS </feature_description>
**If the feature description above is empty, ask the user:** "What would you like to plan? Please describe the feature, bug fix, or improvement you have in mind."
Do not proceed until you have a clear feature description from the user.
### 0. Idea Refinement
**Check for brainstorm output first:**
Before asking questions, look for recent brainstorm documents in `docs/brainstorms/` that match this feature:
```bash
ls -la docs/brainstorms/*.md 2>/dev/null | head -10
```
**Relevance criteria:** A brainstorm is relevant if:
- The topic (from filename or YAML frontmatter) semantically matches the feature description
- Created within the last 14 days
- If multiple candidates match, use the most recent one
**If a relevant brainstorm exists:**
1. Read the brainstorm document **thoroughly** — every section matters
2. Announce: "Found brainstorm from [date]: [topic]. Using as foundation for planning."
3. Extract and carry forward **ALL** of the following into the plan:
- Key decisions and their rationale
- Chosen approach and why alternatives were rejected
- Constraints and requirements discovered during brainstorming
- Open questions (flag these for resolution during planning)
- Success criteria and scope boundaries
- Any specific technical choices or patterns discussed
4. **Skip the idea refinement questions below** — the brainstorm already answered WHAT to build
5. Use brainstorm content as the **primary input** to research and planning phases
6. **Critical: The brainstorm is the origin document.** Throughout the plan, reference specific decisions with `(see brainstorm: docs/brainstorms/<filename>)` when carrying forward conclusions. Do not paraphrase decisions in a way that loses their original context — link back to the source.
7. **Do not omit brainstorm content** — if the brainstorm discussed it, the plan must address it (even if briefly). Scan each brainstorm section before finalizing the plan to verify nothing was dropped.
**If multiple brainstorms could match:**
Use **AskUserQuestion tool** to ask which brainstorm to use, or whether to proceed without one.
**If no brainstorm found (or not relevant), run idea refinement:**
Refine the idea through collaborative dialogue using the **AskUserQuestion tool**:
- Ask questions one at a time to understand the idea fully
- Prefer multiple choice questions when natural options exist
- Focus on understanding: purpose, constraints and success criteria
- Continue until the idea is clear OR user says "proceed"
**Gather signals for research decision.** During refinement, note:
- **User's familiarity**: Do they know the codebase patterns? Are they pointing to examples?
- **User's intent**: Speed vs thoroughness? Exploration vs execution?
- **Topic risk**: Security, payments, external APIs warrant more caution
- **Uncertainty level**: Is the approach clear or open-ended?
**Skip option:** If the feature description is already detailed, offer:
"Your description is clear. Should I proceed with research, or would you like to refine it further?"
## Main Tasks
### 1. Local Research (Always Runs - Parallel)
<thinking>
First, I need to understand the project's conventions, existing patterns, and any documented learnings. This is fast and local - it informs whether external research is needed.
</thinking>
Run these agents **in parallel** to gather local context:
- Task compound-engineering:research:repo-research-analyst(feature_description)
- Task compound-engineering:research:learnings-researcher(feature_description)
**What to look for:**
- **Repo research:** existing patterns, CLAUDE.md guidance, technology familiarity, pattern consistency
- **Learnings:** documented solutions in `docs/solutions/` that might apply (gotchas, patterns, lessons learned)
These findings inform the next step.
### 1.5. Research Decision
Based on signals from Step 0 and findings from Step 1, decide on external research.
**High-risk topics → always research.** Security, payments, external APIs, data privacy. The cost of missing something is too high. This takes precedence over speed signals.
**Strong local context → skip external research.** Codebase has good patterns, CLAUDE.md has guidance, user knows what they want. External research adds little value.
**Uncertainty or unfamiliar territory → research.** User is exploring, codebase has no examples, new technology. External perspective is valuable.
**Announce the decision and proceed.** Brief explanation, then continue. User can redirect if needed.
Examples:
- "Your codebase has solid patterns for this. Proceeding without external research."
- "This involves payment processing, so I'll research current best practices first."
### 1.5b. External Research (Conditional)
**Only run if Step 1.5 indicates external research is valuable.**
Run these agents in parallel:
- Task compound-engineering:research:best-practices-researcher(feature_description)
- Task compound-engineering:research:framework-docs-researcher(feature_description)
### 1.6. Consolidate Research
After all research steps complete, consolidate findings:
- Document relevant file paths from repo research (e.g., `app/services/example_service.rb:42`)
- **Include relevant institutional learnings** from `docs/solutions/` (key insights, gotchas to avoid)
- Note external documentation URLs and best practices (if external research was done)
- List related issues or PRs discovered
- Capture CLAUDE.md conventions
**Optional validation:** Briefly summarize findings and ask if anything looks off or missing before proceeding to planning.
### 2. Issue Planning & Structure
<thinking>
Think like a product manager - what would make this issue clear and actionable? Consider multiple perspectives
</thinking>
**Title & Categorization:**
- [ ] Draft clear, searchable issue title using conventional format (e.g., `feat: Add user authentication`, `fix: Cart total calculation`)
- [ ] Determine issue type: enhancement, bug, refactor
- [ ] Convert title to filename: add today's date prefix, determine daily sequence number, strip prefix colon, kebab-case, add `-plan` suffix
- Scan `docs/plans/` for files matching today's date pattern `YYYY-MM-DD-\d{3}-`
- Find the highest existing sequence number for today
- Increment by 1, zero-padded to 3 digits (001, 002, etc.)
- Example: `feat: Add User Authentication``2026-01-21-001-feat-add-user-authentication-plan.md`
- Keep it descriptive (3-5 words after prefix) so plans are findable by context
**Stakeholder Analysis:**
- [ ] Identify who will be affected by this issue (end users, developers, operations)
- [ ] Consider implementation complexity and required expertise
**Content Planning:**
- [ ] Choose appropriate detail level based on issue complexity and audience
- [ ] List all necessary sections for the chosen template
- [ ] Gather supporting materials (error logs, screenshots, design mockups)
- [ ] Prepare code examples or reproduction steps if applicable, name the mock filenames in the lists
### 3. SpecFlow Analysis
After planning the issue structure, run SpecFlow Analyzer to validate and refine the feature specification:
- Task compound-engineering:workflow:spec-flow-analyzer(feature_description, research_findings)
**SpecFlow Analyzer Output:**
- [ ] Review SpecFlow analysis results
- [ ] Incorporate any identified gaps or edge cases into the issue
- [ ] Update acceptance criteria based on SpecFlow findings
### 4. Choose Implementation Detail Level
Select how comprehensive you want the issue to be, simpler is mostly better.
#### 📄 MINIMAL (Quick Issue)
**Best for:** Simple bugs, small improvements, clear features
**Includes:**
- Problem statement or feature description
- Basic acceptance criteria
- Essential context only
**Structure:**
````markdown
---
title: [Issue Title]
type: [feat|fix|refactor]
status: active
date: YYYY-MM-DD
origin: docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md # if originated from brainstorm, otherwise omit
---
# [Issue Title]
[Brief problem/feature description]
## Acceptance Criteria
- [ ] Core requirement 1
- [ ] Core requirement 2
## Context
[Any critical information]
## MVP
### test.rb
```ruby
class Test
def initialize
@name = "test"
end
end
```
## Sources
- **Origin brainstorm:** [docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md](path) — include if plan originated from a brainstorm
- Related issue: #[issue_number]
- Documentation: [relevant_docs_url]
````
#### 📋 MORE (Standard Issue)
**Best for:** Most features, complex bugs, team collaboration
**Includes everything from MINIMAL plus:**
- Detailed background and motivation
- Technical considerations
- Success metrics
- Dependencies and risks
- Basic implementation suggestions
**Structure:**
```markdown
---
title: [Issue Title]
type: [feat|fix|refactor]
status: active
date: YYYY-MM-DD
origin: docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md # if originated from brainstorm, otherwise omit
---
# [Issue Title]
## Overview
[Comprehensive description]
## Problem Statement / Motivation
[Why this matters]
## Proposed Solution
[High-level approach]
## Technical Considerations
- Architecture impacts
- Performance implications
- Security considerations
## System-Wide Impact
- **Interaction graph**: [What callbacks/middleware/observers fire when this runs?]
- **Error propagation**: [How do errors flow across layers? Do retry strategies align?]
- **State lifecycle risks**: [Can partial failure leave orphaned/inconsistent state?]
- **API surface parity**: [What other interfaces expose similar functionality and need the same change?]
- **Integration test scenarios**: [Cross-layer scenarios that unit tests won't catch]
## Acceptance Criteria
- [ ] Detailed requirement 1
- [ ] Detailed requirement 2
- [ ] Testing requirements
## Success Metrics
[How we measure success]
## Dependencies & Risks
[What could block or complicate this]
## Sources & References
- **Origin brainstorm:** [docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md](path) — include if plan originated from a brainstorm
- Similar implementations: [file_path:line_number]
- Best practices: [documentation_url]
- Related PRs: #[pr_number]
```
#### 📚 A LOT (Comprehensive Issue)
**Best for:** Major features, architectural changes, complex integrations
**Includes everything from MORE plus:**
- Detailed implementation plan with phases
- Alternative approaches considered
- Extensive technical specifications
- Resource requirements and timeline
- Future considerations and extensibility
- Risk mitigation strategies
- Documentation requirements
**Structure:**
```markdown
---
title: [Issue Title]
type: [feat|fix|refactor]
status: active
date: YYYY-MM-DD
origin: docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md # if originated from brainstorm, otherwise omit
---
# [Issue Title]
## Overview
[Executive summary]
## Problem Statement
[Detailed problem analysis]
## Proposed Solution
[Comprehensive solution design]
## Technical Approach
### Architecture
[Detailed technical design]
### Implementation Phases
#### Phase 1: [Foundation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 2: [Core Implementation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 3: [Polish & Optimization]
- Tasks and deliverables
- Success criteria
- Estimated effort
## Alternative Approaches Considered
[Other solutions evaluated and why rejected]
## System-Wide Impact
### Interaction Graph
[Map the chain reaction: what callbacks, middleware, observers, and event handlers fire when this code runs? Trace at least two levels deep. Document: "Action X triggers Y, which calls Z, which persists W."]
### Error & Failure Propagation
[Trace errors from lowest layer up. List specific error classes and where they're handled. Identify retry conflicts, unhandled error types, and silent failure swallowing.]
### State Lifecycle Risks
[Walk through each step that persists state. Can partial failure orphan rows, duplicate records, or leave caches stale? Document cleanup mechanisms or their absence.]
### API Surface Parity
[List all interfaces (classes, DSLs, endpoints) that expose equivalent functionality. Note which need updating and which share the code path.]
### Integration Test Scenarios
[3-5 cross-layer test scenarios that unit tests with mocks would never catch. Include expected behavior for each.]
## Acceptance Criteria
### Functional Requirements
- [ ] Detailed functional criteria
### Non-Functional Requirements
- [ ] Performance targets
- [ ] Security requirements
- [ ] Accessibility standards
### Quality Gates
- [ ] Test coverage requirements
- [ ] Documentation completeness
- [ ] Code review approval
## Success Metrics
[Detailed KPIs and measurement methods]
## Dependencies & Prerequisites
[Detailed dependency analysis]
## Risk Analysis & Mitigation
[Comprehensive risk assessment]
## Resource Requirements
[Team, time, infrastructure needs]
## Future Considerations
[Extensibility and long-term vision]
## Documentation Plan
[What docs need updating]
## Sources & References
### Origin
- **Brainstorm document:** [docs/brainstorms/YYYY-MM-DD-<topic>-brainstorm.md](path) — include if plan originated from a brainstorm. Key decisions carried forward: [list 2-3 major decisions from brainstorm]
### Internal References
- Architecture decisions: [file_path:line_number]
- Similar features: [file_path:line_number]
- Configuration: [file_path:line_number]
### External References
- Framework documentation: [url]
- Best practices guide: [url]
- Industry standards: [url]
### Related Work
- Previous PRs: #[pr_numbers]
- Related issues: #[issue_numbers]
- Design documents: [links]
```
### 5. Issue Creation & Formatting
<thinking>
Apply best practices for clarity and actionability, making the issue easy to scan and understand
</thinking>
**Content Formatting:**
- [ ] Use clear, descriptive headings with proper hierarchy (##, ###)
- [ ] Include code examples in triple backticks with language syntax highlighting
- [ ] Add screenshots/mockups if UI-related (drag & drop or use image hosting)
- [ ] Use task lists (- [ ]) for trackable items that can be checked off
- [ ] Add collapsible sections for lengthy logs or optional details using `<details>` tags
- [ ] Apply appropriate emoji for visual scanning (🐛 bug, ✨ feature, 📚 docs, ♻️ refactor)
**Cross-Referencing:**
- [ ] Link to related issues/PRs using #number format
- [ ] Reference specific commits with SHA hashes when relevant
- [ ] Link to code using GitHub's permalink feature (press 'y' for permanent link)
- [ ] Mention relevant team members with @username if needed
- [ ] Add links to external resources with descriptive text
**Code & Examples:**
````markdown
# Good example with syntax highlighting and line references
```ruby
# app/services/user_service.rb:42
def process_user(user)
# Implementation here
end
```
# Collapsible error logs
<details>
<summary>Full error stacktrace</summary>
`Error details here...`
</details>
````
**AI-Era Considerations:**
- [ ] Account for accelerated development with AI pair programming
- [ ] Include prompts or instructions that worked well during research
- [ ] Note which AI tools were used for initial exploration (Claude, Copilot, etc.)
- [ ] Emphasize comprehensive testing given rapid implementation
- [ ] Document any AI-generated code that needs human review
### 6. Final Review & Submission
**Brainstorm cross-check (if plan originated from a brainstorm):**
Before finalizing, re-read the brainstorm document and verify:
- [ ] Every key decision from the brainstorm is reflected in the plan
- [ ] The chosen approach matches what was decided in the brainstorm
- [ ] Constraints and requirements from the brainstorm are captured in acceptance criteria
- [ ] Open questions from the brainstorm are either resolved or flagged
- [ ] The `origin:` frontmatter field points to the brainstorm file
- [ ] The Sources section includes the brainstorm with a summary of carried-forward decisions
**Pre-submission Checklist:**
- [ ] Title is searchable and descriptive
- [ ] Labels accurately categorize the issue
- [ ] All template sections are complete
- [ ] Links and references are working
- [ ] Acceptance criteria are measurable
- [ ] Add names of files in pseudo code examples and todo lists
- [ ] Add an ERD mermaid diagram if applicable for new model changes
## Write Plan File
**REQUIRED: Write the plan file to disk before presenting any options.**
```bash
mkdir -p docs/plans/
# Determine daily sequence number
today=$(date +%Y-%m-%d)
last_seq=$(ls docs/plans/${today}-*-plan.md 2>/dev/null | grep -oP "${today}-\K\d{3}" | sort -n | tail -1)
next_seq=$(printf "%03d" $(( ${last_seq:-0} + 1 )))
```
Use the Write tool to save the complete plan to `docs/plans/YYYY-MM-DD-NNN-<type>-<descriptive-name>-plan.md` (where NNN is `$next_seq` from the bash command above). This step is mandatory and cannot be skipped — even when running as part of LFG/SLFG or other automated pipelines.
Confirm: "Plan written to docs/plans/[filename]"
**Pipeline mode:** If invoked from an automated workflow (LFG, SLFG, or any `disable-model-invocation` context), skip all AskUserQuestion calls. Make decisions automatically and proceed to writing the plan without interactive prompts.
## Output Format
**Filename:** Use the date, daily sequence number, and kebab-case filename from Step 2 Title & Categorization.
```
docs/plans/YYYY-MM-DD-NNN-<type>-<descriptive-name>-plan.md
```
Examples:
- ✅ `docs/plans/2026-01-15-001-feat-user-authentication-flow-plan.md`
- ✅ `docs/plans/2026-02-03-001-fix-checkout-race-condition-plan.md`
- ✅ `docs/plans/2026-03-10-002-refactor-api-client-extraction-plan.md`
- ❌ `docs/plans/2026-01-15-feat-thing-plan.md` (missing sequence number, not descriptive)
- ❌ `docs/plans/2026-01-15-001-feat-new-feature-plan.md` (too vague - what feature?)
- ❌ `docs/plans/2026-01-15-001-feat: user auth-plan.md` (invalid characters - colon and space)
- ❌ `docs/plans/feat-user-auth-plan.md` (missing date prefix and sequence number)
## Post-Generation Options
After writing the plan file, use the **AskUserQuestion tool** to present these options:
**Question:** "Plan ready at `docs/plans/YYYY-MM-DD-NNN-<type>-<name>-plan.md`. What would you like to do next?"
**Options:**
1. **Open plan in editor** - Open the plan file for review
2. **Run `/deepen-plan`** - Enhance each section with parallel research agents (best practices, performance, UI)
3. **Review and refine** - Improve the document through structured self-review
4. **Share to Proof** - Upload to Proof for collaborative review and sharing
5. **Start `/ce:work`** - Begin implementing this plan locally
6. **Start `/ce:work` on remote** - Begin implementing in Claude Code on the web (use `&` to run in background)
7. **Create Issue** - Create issue in project tracker (GitHub/Linear)
Based on selection:
- **Open plan in editor** → Run `open docs/plans/<plan_filename>.md` to open the file in the user's default editor
- **`/deepen-plan`** → Call the /deepen-plan command with the plan file path to enhance with research
- **Review and refine** → Load `document-review` skill.
- **Share to Proof** → Upload the plan to Proof:
```bash
CONTENT=$(cat docs/plans/<plan_filename>.md)
TITLE="Plan: <plan title from frontmatter>"
RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
-H "Content-Type: application/json" \
-d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
```
Display: `View & collaborate in Proof: <PROOF_URL>` — skip silently if curl fails. Then return to options.
- **`/ce:work`** → Call the /ce:work command with the plan file path
- **`/ce:work` on remote** → Run `/ce:work docs/plans/<plan_filename>.md &` to start work in background for Claude Code web
- **Create Issue** → See "Issue Creation" section below
- **Other** (automatically provided) → Accept free text for rework or specific changes
**Note:** If running `/ce:plan` with ultrathink enabled, automatically run `/deepen-plan` after plan creation for maximum depth and grounding.
Loop back to options after Simplify or Other changes until user selects `/ce:work` or another action.
## Issue Creation
When user selects "Create Issue", detect their project tracker from CLAUDE.md:
1. **Check for tracker preference** in user's CLAUDE.md (global or project):
- Look for `project_tracker: github` or `project_tracker: linear`
- Or look for mentions of "GitHub Issues" or "Linear" in their workflow section
2. **If GitHub:**
Use the title and type from Step 2 (already in context - no need to re-read the file):
```bash
gh issue create --title "<type>: <title>" --body-file <plan_path>
```
3. **If Linear:**
```bash
linear issue create --title "<title>" --description "$(cat <plan_path>)"
```
4. **If no tracker configured:**
Ask user: "Which project tracker do you use? (GitHub/Linear/Other)"
- Suggest adding `project_tracker: github` or `project_tracker: linear` to their CLAUDE.md
5. **After creation:**
- Display the issue URL
- Ask if they want to proceed to `/ce:work`
NEVER CODE! Just research and write the plan.

View File

@@ -0,0 +1,558 @@
---
name: ce:review
description: Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and worktrees
argument-hint: "[PR number, GitHub URL, branch name, or latest] [--serial]"
---
# Review Command
<command_purpose> Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection. </command_purpose>
## Introduction
<role>Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance</role>
## Prerequisites
<requirements>
- Git repository with GitHub CLI (`gh`) installed and authenticated
- Clean main/master branch
- Proper permissions to create worktrees and access the repository
- For document reviews: Path to a markdown file or document
</requirements>
## Main Tasks
### 1. Determine Review Target & Setup (ALWAYS FIRST)
<review_target> #$ARGUMENTS </review_target>
<thinking>
First, I need to determine the review target type and set up the code for analysis.
</thinking>
#### Immediate Actions:
<task_list>
- [ ] Determine review type: PR number (numeric), GitHub URL, file path (.md), or empty (current branch)
- [ ] Check current git branch
- [ ] If ALREADY on the target branch (PR branch, requested branch name, or the branch already checked out for review) → proceed with analysis on current branch
- [ ] If DIFFERENT branch than the review target → offer to use worktree: "Use git-worktree skill for isolated Call `skill: git-worktree` with branch name"
- [ ] Fetch PR metadata using `gh pr view --json` for title, body, files, linked issues
- [ ] Set up language-specific analysis tools
- [ ] Prepare security scanning environment
- [ ] Make sure we are on the branch we are reviewing. Use gh pr checkout to switch to the branch or manually checkout the branch.
Ensure that the code is ready for analysis (either in worktree or on current branch). ONLY then proceed to the next step.
</task_list>
#### Protected Artifacts
<protected_artifacts>
The following paths are compound-engineering pipeline artifacts and must never be flagged for deletion, removal, or gitignore by any review agent:
- `docs/plans/*.md` — Plan files created by `/ce:plan`. These are living documents that track implementation progress (checkboxes are checked off by `/ce:work`).
- `docs/solutions/*.md` — Solution documents created during the pipeline.
If a review agent flags any file in these directories for cleanup or removal, discard that finding during synthesis. Do not create a todo for it.
</protected_artifacts>
#### Load Review Agents
Read `compound-engineering.local.md` in the project root. If found, use `review_agents` from YAML frontmatter. If the markdown body contains review context, pass it to each agent as additional instructions.
If no settings file exists, invoke the `setup` skill to create one. Then read the newly created file and continue.
#### Choose Execution Mode
<execution_mode>
Before launching review agents, check for context constraints:
**If `--serial` flag is passed OR conversation is in a long session:**
Run agents ONE AT A TIME in sequence. Wait for each agent to complete before starting the next. This uses less context but takes longer.
**Default (parallel):**
Run all agents simultaneously for speed. If you hit context limits, retry with `--serial` flag.
**Auto-detect:** If more than 5 review agents are configured, automatically switch to serial mode and inform the user:
"Running review agents in serial mode (6+ agents configured). Use --parallel to override."
</execution_mode>
#### Parallel Agents to review the PR:
<parallel_tasks>
**Parallel mode (default for ≤5 agents):**
Run all configured review agents in parallel using Task tool. For each agent in the `review_agents` list:
```
Task {agent-name}(PR content + review context from settings body)
```
**Serial mode (--serial flag, or auto for 6+ agents):**
Run configured review agents ONE AT A TIME. For each agent in the `review_agents` list, wait for it to complete before starting the next:
```
For each agent in review_agents:
1. Task {agent-name}(PR content + review context)
2. Wait for completion
3. Collect findings
4. Proceed to next agent
```
Always run these last regardless of mode:
- Task compound-engineering:review:agent-native-reviewer(PR content) - Verify new features are agent-accessible
- Task compound-engineering:research:learnings-researcher(PR content) - Search docs/solutions/ for past issues related to this PR's modules and patterns
</parallel_tasks>
#### Conditional Agents (Run if applicable):
<conditional_agents>
These agents are run ONLY when the PR matches specific criteria. Check the PR files list to determine if they apply:
**MIGRATIONS: If PR contains database migrations, schema.rb, or data backfills:**
- Task compound-engineering:review:schema-drift-detector(PR content) - Detects unrelated schema.rb changes by cross-referencing against included migrations (run FIRST)
- Task compound-engineering:review:data-migration-expert(PR content) - Validates ID mappings match production, checks for swapped values, verifies rollback safety
- Task compound-engineering:review:deployment-verification-agent(PR content) - Creates Go/No-Go deployment checklist with SQL verification queries
**When to run:**
- PR includes files matching `db/migrate/*.rb` or `db/schema.rb`
- PR modifies columns that store IDs, enums, or mappings
- PR includes data backfill scripts or rake tasks
- PR title/body mentions: migration, backfill, data transformation, ID mapping
**What these agents check:**
- `schema-drift-detector`: Cross-references schema.rb changes against PR migrations to catch unrelated columns/indexes from local database state
- `data-migration-expert`: Verifies hard-coded mappings match production reality (prevents swapped IDs), checks for orphaned associations, validates dual-write patterns
- `deployment-verification-agent`: Produces executable pre/post-deploy checklists with SQL queries, rollback procedures, and monitoring plans
</conditional_agents>
### 2. Ultra-Thinking Deep Dive Phases
<ultrathink_instruction> For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. And bring all reviews in a synthesis to the user.</ultrathink_instruction>
<deliverable>
Complete system context map with component interactions
</deliverable>
#### Phase 1: Stakeholder Perspective Analysis
<thinking_prompt> ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points? </thinking_prompt>
<stakeholder_perspectives>
1. **Developer Perspective** <questions>
- How easy is this to understand and modify?
- Are the APIs intuitive?
- Is debugging straightforward?
- Can I test this easily? </questions>
2. **Operations Perspective** <questions>
- How do I deploy this safely?
- What metrics and logs are available?
- How do I troubleshoot issues?
- What are the resource requirements? </questions>
3. **End User Perspective** <questions>
- Is the feature intuitive?
- Are error messages helpful?
- Is performance acceptable?
- Does it solve my problem? </questions>
4. **Security Team Perspective** <questions>
- What's the attack surface?
- Are there compliance requirements?
- How is data protected?
- What are the audit capabilities? </questions>
5. **Business Perspective** <questions>
- What's the ROI?
- Are there legal/compliance risks?
- How does this affect time-to-market?
- What's the total cost of ownership? </questions> </stakeholder_perspectives>
#### Phase 2: Scenario Exploration
<thinking_prompt> ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress? </thinking_prompt>
<scenario_checklist>
- [ ] **Happy Path**: Normal operation with valid inputs
- [ ] **Invalid Inputs**: Null, empty, malformed data
- [ ] **Boundary Conditions**: Min/max values, empty collections
- [ ] **Concurrent Access**: Race conditions, deadlocks
- [ ] **Scale Testing**: 10x, 100x, 1000x normal load
- [ ] **Network Issues**: Timeouts, partial failures
- [ ] **Resource Exhaustion**: Memory, disk, connections
- [ ] **Security Attacks**: Injection, overflow, DoS
- [ ] **Data Corruption**: Partial writes, inconsistency
- [ ] **Cascading Failures**: Downstream service issues </scenario_checklist>
### 3. Multi-Angle Review Perspectives
#### Technical Excellence Angle
- Code craftsmanship evaluation
- Engineering best practices
- Technical documentation quality
- Tooling and automation assessment
#### Business Value Angle
- Feature completeness validation
- Performance impact on users
- Cost-benefit analysis
- Time-to-market considerations
#### Risk Management Angle
- Security risk assessment
- Operational risk evaluation
- Compliance risk verification
- Technical debt accumulation
#### Team Dynamics Angle
- Code review etiquette
- Knowledge sharing effectiveness
- Collaboration patterns
- Mentoring opportunities
### 4. Simplification and Minimalism Review
Run the Task compound-engineering:review:code-simplicity-reviewer() to see if we can simplify the code.
### 5. Findings Synthesis and Todo Creation Using file-todos Skill
<critical_requirement> ALL findings MUST be stored in the todos/ directory using the file-todos skill. Create todo files immediately after synthesis - do NOT present findings for user approval first. Use the skill for structured todo management. </critical_requirement>
#### Step 1: Synthesize All Findings
<thinking>
Consolidate all agent reports into a categorized list of findings.
Remove duplicates, prioritize by severity and impact.
</thinking>
<synthesis_tasks>
- [ ] Collect findings from all parallel agents
- [ ] Surface learnings-researcher results: if past solutions are relevant, flag them as "Known Pattern" with links to docs/solutions/ files
- [ ] Discard any findings that recommend deleting or gitignoring files in `docs/plans/` or `docs/solutions/` (see Protected Artifacts above)
- [ ] Categorize by type: security, performance, architecture, quality, etc.
- [ ] Assign severity levels: 🔴 CRITICAL (P1), 🟡 IMPORTANT (P2), 🔵 NICE-TO-HAVE (P3)
- [ ] Remove duplicate or overlapping findings
- [ ] Estimate effort for each finding (Small/Medium/Large)
</synthesis_tasks>
#### Step 2: Create Todo Files Using file-todos Skill
<critical_instruction> Use the file-todos skill to create todo files for ALL findings immediately. Do NOT present findings one-by-one asking for user approval. Create all todo files in parallel using the skill, then summarize results to user. </critical_instruction>
**Implementation Options:**
**Option A: Direct File Creation (Fast)**
- Create todo files directly using Write tool
- All findings in parallel for speed
- Use standard template from `.claude/skills/file-todos/assets/todo-template.md`
- Follow naming convention: `{issue_id}-pending-{priority}-{description}.md`
**Option B: Sub-Agents in Parallel (Recommended for Scale)** For large PRs with 15+ findings, use sub-agents to create finding files in parallel:
```bash
# Launch multiple finding-creator agents in parallel
Task() - Create todos for first finding
Task() - Create todos for second finding
Task() - Create todos for third finding
etc. for each finding.
```
Sub-agents can:
- Process multiple findings simultaneously
- Write detailed todo files with all sections filled
- Organize findings by severity
- Create comprehensive Proposed Solutions
- Add acceptance criteria and work logs
- Complete much faster than sequential processing
**Execution Strategy:**
1. Synthesize all findings into categories (P1/P2/P3)
2. Group findings by severity
3. Launch 3 parallel sub-agents (one per severity level)
4. Each sub-agent creates its batch of todos using the file-todos skill
5. Consolidate results and present summary
**Process (Using file-todos Skill):**
1. For each finding:
- Determine severity (P1/P2/P3)
- Write detailed Problem Statement and Findings
- Create 2-3 Proposed Solutions with pros/cons/effort/risk
- Estimate effort (Small/Medium/Large)
- Add acceptance criteria and work log
2. Use file-todos skill for structured todo management:
```bash
skill: file-todos
```
The skill provides:
- Template location: `.claude/skills/file-todos/assets/todo-template.md`
- Naming convention: `{issue_id}-{status}-{priority}-{description}.md`
- YAML frontmatter structure: status, priority, issue_id, tags, dependencies
- All required sections: Problem Statement, Findings, Solutions, etc.
3. Create todo files in parallel:
```bash
{next_id}-pending-{priority}-{description}.md
```
4. Examples:
```
001-pending-p1-path-traversal-vulnerability.md
002-pending-p1-api-response-validation.md
003-pending-p2-concurrency-limit.md
004-pending-p3-unused-parameter.md
```
5. Follow template structure from file-todos skill: `.claude/skills/file-todos/assets/todo-template.md`
**Todo File Structure (from template):**
Each todo must include:
- **YAML frontmatter**: status, priority, issue_id, tags, dependencies
- **Problem Statement**: What's broken/missing, why it matters
- **Findings**: Discoveries from agents with evidence/location
- **Proposed Solutions**: 2-3 options, each with pros/cons/effort/risk
- **Recommended Action**: (Filled during triage, leave blank initially)
- **Technical Details**: Affected files, components, database changes
- **Acceptance Criteria**: Testable checklist items
- **Work Log**: Dated record with actions and learnings
- **Resources**: Links to PR, issues, documentation, similar patterns
**File naming convention:**
```
{issue_id}-{status}-{priority}-{description}.md
Examples:
- 001-pending-p1-security-vulnerability.md
- 002-pending-p2-performance-optimization.md
- 003-pending-p3-code-cleanup.md
```
**Status values:**
- `pending` - New findings, needs triage/decision
- `ready` - Approved by manager, ready to work
- `complete` - Work finished
**Priority values:**
- `p1` - Critical (blocks merge, security/data issues)
- `p2` - Important (should fix, architectural/performance)
- `p3` - Nice-to-have (enhancements, cleanup)
**Tagging:** Always add `code-review` tag, plus: `security`, `performance`, `architecture`, `rails`, `quality`, etc.
#### Step 3: Summary Report
After creating all todo files, present comprehensive summary:
````markdown
## ✅ Code Review Complete
**Review Target:** PR #XXXX - [PR Title] **Branch:** [branch-name]
### Findings Summary:
- **Total Findings:** [X]
- **🔴 CRITICAL (P1):** [count] - BLOCKS MERGE
- **🟡 IMPORTANT (P2):** [count] - Should Fix
- **🔵 NICE-TO-HAVE (P3):** [count] - Enhancements
### Created Todo Files:
**P1 - Critical (BLOCKS MERGE):**
- `001-pending-p1-{finding}.md` - {description}
- `002-pending-p1-{finding}.md` - {description}
**P2 - Important:**
- `003-pending-p2-{finding}.md` - {description}
- `004-pending-p2-{finding}.md` - {description}
**P3 - Nice-to-Have:**
- `005-pending-p3-{finding}.md` - {description}
### Review Agents Used:
- kieran-rails-reviewer
- security-sentinel
- performance-oracle
- architecture-strategist
- agent-native-reviewer
- [other agents]
### Next Steps:
1. **Address P1 Findings**: CRITICAL - must be fixed before merge
- Review each P1 todo in detail
- Implement fixes or request exemption
- Verify fixes before merging PR
2. **Triage All Todos**:
```bash
ls todos/*-pending-*.md # View all pending todos
/triage # Use slash command for interactive triage
```
3. **Work on Approved Todos**:
```bash
/resolve_todo_parallel # Fix all approved items efficiently
```
4. **Track Progress**:
- Rename file when status changes: pending → ready → complete
- Update Work Log as you work
- Commit todos: `git add todos/ && git commit -m "refactor: add code review findings"`
### Severity Breakdown:
**🔴 P1 (Critical - Blocks Merge):**
- Security vulnerabilities
- Data corruption risks
- Breaking changes
- Critical architectural issues
**🟡 P2 (Important - Should Fix):**
- Performance issues
- Significant architectural concerns
- Major code quality problems
- Reliability issues
**🔵 P3 (Nice-to-Have):**
- Minor improvements
- Code cleanup
- Optimization opportunities
- Documentation updates
````
### 6. End-to-End Testing (Optional)
<detect_project_type>
**First, detect the project type from PR files:**
| Indicator | Project Type |
|-----------|--------------|
| `*.xcodeproj`, `*.xcworkspace`, `Package.swift` (iOS) | iOS/macOS |
| `Gemfile`, `package.json`, `app/views/*`, `*.html.*` | Web |
| Both iOS files AND web files | Hybrid (test both) |
</detect_project_type>
<offer_testing>
After presenting the Summary Report, offer appropriate testing based on project type:
**For Web Projects:**
```markdown
**"Want to run browser tests on the affected pages?"**
1. Yes - run `/test-browser`
2. No - skip
```
**For iOS Projects:**
```markdown
**"Want to run Xcode simulator tests on the app?"**
1. Yes - run `/xcode-test`
2. No - skip
```
**For Hybrid Projects (e.g., Rails + Hotwire Native):**
```markdown
**"Want to run end-to-end tests?"**
1. Web only - run `/test-browser`
2. iOS only - run `/xcode-test`
3. Both - run both commands
4. No - skip
```
</offer_testing>
#### If User Accepts Web Testing:
Spawn a subagent to run browser tests (preserves main context):
```
Task general-purpose("Run /test-browser for PR #[number]. Test all affected pages, check for console errors, handle failures by creating todos and fixing.")
```
The subagent will:
1. Identify pages affected by the PR
2. Navigate to each page and capture snapshots (using Playwright MCP or agent-browser CLI)
3. Check for console errors
4. Test critical interactions
5. Pause for human verification on OAuth/email/payment flows
6. Create P1 todos for any failures
7. Fix and retry until all tests pass
**Standalone:** `/test-browser [PR number]`
#### If User Accepts iOS Testing:
Spawn a subagent to run Xcode tests (preserves main context):
```
Task general-purpose("Run /xcode-test for scheme [name]. Build for simulator, install, launch, take screenshots, check for crashes.")
```
The subagent will:
1. Verify XcodeBuildMCP is installed
2. Discover project and schemes
3. Build for iOS Simulator
4. Install and launch app
5. Take screenshots of key screens
6. Capture console logs for errors
7. Pause for human verification (Sign in with Apple, push, IAP)
8. Create P1 todos for any failures
9. Fix and retry until all tests pass
**Standalone:** `/xcode-test [scheme]`
### Important: P1 Findings Block Merge
Any **🔴 P1 (CRITICAL)** findings must be addressed before merging the PR. Present these prominently and ensure they're resolved before accepting the PR.

View File

@@ -0,0 +1,470 @@
---
name: ce:work
description: Execute work plans efficiently while maintaining quality and finishing features
argument-hint: "[plan file, specification, or todo file path]"
---
# Work Plan Execution Command
Execute a work plan efficiently while maintaining quality and finishing features.
## Introduction
This command takes a work document (plan, specification, or todo file) and executes it systematically. The focus is on **shipping complete features** by understanding requirements quickly, following existing patterns, and maintaining quality throughout.
## Input Document
<input_document> #$ARGUMENTS </input_document>
## Execution Workflow
### Phase 1: Quick Start
1. **Read Plan and Clarify**
- Read the work document completely
- Review any references or links provided in the plan
- If anything is unclear or ambiguous, ask clarifying questions now
- Get user approval to proceed
- **Do not skip this** - better to ask questions now than build the wrong thing
2. **Setup Environment**
First, check the current branch:
```bash
current_branch=$(git branch --show-current)
default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@')
# Fallback if remote HEAD isn't set
if [ -z "$default_branch" ]; then
default_branch=$(git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master")
fi
```
**If already on a feature branch** (not the default branch):
- Ask: "Continue working on `[current_branch]`, or create a new branch?"
- If continuing, proceed to step 3
- If creating new, follow Option A or B below
**If on the default branch**, choose how to proceed:
**Option A: Create a new branch**
```bash
git pull origin [default_branch]
git checkout -b feature-branch-name
```
Use a meaningful name based on the work (e.g., `feat/user-authentication`, `fix/email-validation`).
**Option B: Use a worktree (recommended for parallel development)**
```bash
skill: git-worktree
# The skill will create a new branch from the default branch in an isolated worktree
```
**Option C: Continue on the default branch**
- Requires explicit user confirmation
- Only proceed after user explicitly says "yes, commit to [default_branch]"
- Never commit directly to the default branch without explicit permission
**Recommendation**: Use worktree if:
- You want to work on multiple features simultaneously
- You want to keep the default branch clean while experimenting
- You plan to switch between branches frequently
3. **Create Todo List**
- Use TodoWrite to break plan into actionable tasks
- Include dependencies between tasks
- Prioritize based on what needs to be done first
- Include testing and quality check tasks
- Keep tasks specific and completable
### Phase 2: Execute
1. **Task Execution Loop**
For each task in priority order:
```
while (tasks remain):
- Mark task as in_progress in TodoWrite
- Read any referenced files from the plan
- Look for similar patterns in codebase
- Implement following existing conventions
- Write tests for new functionality
- Run System-Wide Test Check (see below)
- Run tests after changes
- Mark task as completed in TodoWrite
- Mark off the corresponding checkbox in the plan file ([ ] → [x])
- Evaluate for incremental commit (see below)
```
**System-Wide Test Check** — Before marking a task done, pause and ask:
| Question | What to do |
|----------|------------|
| **What fires when this runs?** Callbacks, middleware, observers, event handlers — trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, `after_*` hooks. |
| **Do my tests exercise the real chain?** If every dependency is mocked, the test proves your logic works *in isolation* — it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
| **Can failure leave orphaned state?** If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
| **What other interfaces expose this?** Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now — not as a follow-up. |
| **Do error strategies align across layers?** Retry middleware + application fallback + framework error handling — do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |
**When to skip:** Leaf-node changes with no callbacks, no state persistence, no parallel interfaces. If the change is purely additive (new helper method, new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."
**When this matters most:** Any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.
**IMPORTANT**: Always update the original plan document by checking off completed items. Use the Edit tool to change `- [ ]` to `- [x]` for each task you finish. This keeps the plan as a living document showing progress and ensures no checkboxes are left unchecked.
2. **Incremental Commits**
After completing each task, evaluate whether to create an incremental commit:
| Commit when... | Don't commit when... |
|----------------|---------------------|
| Logical unit complete (model, service, component) | Small part of a larger unit |
| Tests pass + meaningful progress | Tests failing |
| About to switch contexts (backend → frontend) | Purely scaffolding with no behavior |
| About to attempt risky/uncertain changes | Would need a "WIP" commit message |
**Heuristic:** "Can I write a commit message that describes a complete, valuable change? If yes, commit. If the message would be 'WIP' or 'partial X', wait."
**Commit workflow:**
```bash
# 1. Verify tests pass (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# 2. Stage only files related to this logical unit (not `git add .`)
git add <files related to this logical unit>
# 3. Commit with conventional message
git commit -m "feat(scope): description of this unit"
```
**Handling merge conflicts:** If conflicts arise during rebasing or merging, resolve them immediately. Incremental commits make conflict resolution easier since each commit is small and focused.
**Note:** Incremental commits use clean conventional messages without attribution footers. The final Phase 4 commit/PR includes the full attribution.
3. **Follow Existing Patterns**
- The plan should reference similar code - read those files first
- Match naming conventions exactly
- Reuse existing components where possible
- Follow project coding standards (see CLAUDE.md)
- When in doubt, grep for similar implementations
4. **Test Continuously**
- Run relevant tests after each significant change
- Don't wait until the end to test
- Fix failures immediately
- Add new tests for new functionality
- **Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together.** If your change touches callbacks, middleware, or error handling — you need both.
5. **Figma Design Sync** (if applicable)
For UI work with Figma designs:
- Implement components following design specs
- Use figma-design-sync agent iteratively to compare
- Fix visual differences identified
- Repeat until implementation matches design
6. **Track Progress**
- Keep TodoWrite updated as you complete tasks
- Note any blockers or unexpected discoveries
- Create new tasks if scope expands
- Keep user informed of major milestones
### Phase 3: Quality Check
1. **Run Core Quality Checks**
Always run before submitting:
```bash
# Run full test suite (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# Run linting (per CLAUDE.md)
# Use linting-agent before pushing to origin
```
2. **Consider Reviewer Agents** (Optional)
Use for complex, risky, or large changes. Read agents from `compound-engineering.local.md` frontmatter (`review_agents`). If no settings file, invoke the `setup` skill to create one.
Run configured agents in parallel with Task tool. Present findings and address critical issues.
3. **Final Validation**
- All TodoWrite tasks marked completed
- All tests pass
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
- No console errors or warnings
4. **Prepare Operational Validation Plan** (REQUIRED)
- Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
- Include concrete:
- Log queries/search terms
- Metrics or dashboards to watch
- Expected healthy signals
- Failure signals and rollback/mitigation trigger
- Validation window and owner
- If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
### Phase 4: Ship It
1. **Create Commit**
```bash
git add .
git status # Review what's being committed
git diff --staged # Check the changes
# Commit with conventional format
git commit -m "$(cat <<'EOF'
feat(scope): description of what and why
Brief explanation if needed.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
EOF
)"
```
2. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
For **any** design changes, new views, or UI modifications, you MUST capture and upload screenshots:
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
**IMPORTANT**: Always include uploaded image URLs in PR description. This provides visual context for reviewers and documents the change.
3. **Create Pull Request**
```bash
git push -u origin feature-branch-name
gh pr create --title "Feature: [Description]" --body "$(cat <<'EOF'
## Summary
- What was built
- Why it was needed
- Key decisions made
## Testing
- Tests added/modified
- Manual testing performed
## Post-Deploy Monitoring & Validation
- **What to monitor/search**
- Logs:
- Metrics/Dashboards:
- **Validation checks (queries/commands)**
- `command or query here`
- **Expected healthy behavior**
- Expected signal(s)
- **Failure signal(s) / rollback trigger**
- Trigger + immediate action
- **Validation window & owner**
- Window:
- Owner:
- **If no operational impact**
- `No additional operational monitoring required: <reason>`
## Before / After Screenshots
| Before | After |
|--------|-------|
| ![before](URL) | ![after](URL) |
## Figma Design
[Link if applicable]
---
[![Compound Engineered](https://img.shields.io/badge/Compound-Engineered-6366f1)](https://github.com/EveryInc/compound-engineering-plugin) 🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
```
4. **Update Plan Status**
If the input document has YAML frontmatter with a `status` field, update it to `completed`:
```
status: active → status: completed
```
5. **Notify User**
- Summarize what was completed
- Link to PR
- Note any follow-up work needed
- Suggest next steps if applicable
---
## Swarm Mode (Optional)
For complex plans with multiple independent workstreams, enable swarm mode for parallel execution with coordinated agents.
### When to Use Swarm Mode
| Use Swarm Mode when... | Use Standard Mode when... |
|------------------------|---------------------------|
| Plan has 5+ independent tasks | Plan is linear/sequential |
| Multiple specialists needed (review + test + implement) | Single-focus work |
| Want maximum parallelism | Simpler mental model preferred |
| Large feature with clear phases | Small feature or bug fix |
### Enabling Swarm Mode
To trigger swarm execution, say:
> "Make a Task list and launch an army of agent swarm subagents to build the plan"
Or explicitly request: "Use swarm mode for this work"
### Swarm Workflow
When swarm mode is enabled, the workflow changes:
1. **Create Team**
```
Teammate({ operation: "spawnTeam", team_name: "work-{timestamp}" })
```
2. **Create Task List with Dependencies**
- Parse plan into TaskCreate items
- Set up blockedBy relationships for sequential dependencies
- Independent tasks have no blockers (can run in parallel)
3. **Spawn Specialized Teammates**
```
Task({
team_name: "work-{timestamp}",
name: "implementer",
subagent_type: "general-purpose",
prompt: "Claim implementation tasks, execute, mark complete",
run_in_background: true
})
Task({
team_name: "work-{timestamp}",
name: "tester",
subagent_type: "general-purpose",
prompt: "Claim testing tasks, run tests, mark complete",
run_in_background: true
})
```
4. **Coordinate and Monitor**
- Team lead monitors task completion
- Spawn additional workers as phases unblock
- Handle plan approval if required
5. **Cleanup**
```
Teammate({ operation: "requestShutdown", target_agent_id: "implementer" })
Teammate({ operation: "requestShutdown", target_agent_id: "tester" })
Teammate({ operation: "cleanup" })
```
See the `orchestrating-swarms` skill for detailed swarm patterns and best practices.
---
## Key Principles
### Start Fast, Execute Faster
- Get clarification once at the start, then execute
- Don't wait for perfect understanding - ask questions and move
- The goal is to **finish the feature**, not create perfect process
### The Plan is Your Guide
- Work documents should reference similar code and patterns
- Load those references and follow them
- Don't reinvent - match what exists
### Test As You Go
- Run tests after each change, not at the end
- Fix failures immediately
- Continuous testing prevents big surprises
### Quality is Built In
- Follow existing patterns
- Write tests for new code
- Run linting before pushing
- Use reviewer agents for complex/risky changes only
### Ship Complete Features
- Mark all tasks completed before moving on
- Don't leave features 80% done
- A finished feature that ships beats a perfect feature that doesn't
## Quality Checklist
Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All TodoWrite tasks marked completed
- [ ] Tests pass (run project's test command)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes Compound Engineered badge
## When to Use Reviewer Agents
**Don't use by default.** Use reviewer agents only when:
- Large refactor affecting many files (10+)
- Security-sensitive changes (authentication, permissions, data access)
- Performance-critical code paths
- Complex algorithms or business logic
- User explicitly requests thorough review
For most features: tests + linting + following patterns is sufficient.
## Common Pitfalls to Avoid
- **Analysis paralysis** - Don't overthink, read the plan and execute
- **Skipping clarifying questions** - Ask now, not after building wrong thing
- **Ignoring plan references** - The plan has links for a reason
- **Testing at the end** - Test continuously or suffer later
- **Forgetting TodoWrite** - Track progress or lose track of what's done
- **80% done syndrome** - Finish the feature, don't move on early
- **Over-reviewing simple changes** - Save reviewer agents for complex work

View File

@@ -0,0 +1,138 @@
---
name: changelog
description: Create engaging changelogs for recent merges to main branch
argument-hint: "[optional: daily|weekly, or time period in days]"
disable-model-invocation: true
---
You are a witty and enthusiastic product marketer tasked with creating a fun, engaging change log for an internal development team. Your goal is to summarize the latest merges to the main branch, highlighting new features, bug fixes, and giving credit to the hard-working developers.
## Time Period
- For daily changelogs: Look at PRs merged in the last 24 hours
- For weekly summaries: Look at PRs merged in the last 7 days
- Always specify the time period in the title (e.g., "Daily" vs "Weekly")
- Default: Get the latest changes from the last day from the main branch of the repository
## PR Analysis
Analyze the provided GitHub changes and related issues. Look for:
1. New features that have been added
2. Bug fixes that have been implemented
3. Any other significant changes or improvements
4. References to specific issues and their details
5. Names of contributors who made the changes
6. Use gh cli to lookup the PRs as well and the description of the PRs
7. Check PR labels to identify feature type (feature, bug, chore, etc.)
8. Look for breaking changes and highlight them prominently
9. Include PR numbers for traceability
10. Check if PRs are linked to issues and include issue context
## Content Priorities
1. Breaking changes (if any) - MUST be at the top
2. User-facing features
3. Critical bug fixes
4. Performance improvements
5. Developer experience improvements
6. Documentation updates
## Formatting Guidelines
Now, create a change log summary with the following guidelines:
1. Keep it concise and to the point
2. Highlight the most important changes first
3. Group similar changes together (e.g., all new features, all bug fixes)
4. Include issue references where applicable
5. Mention the names of contributors, giving them credit for their work
6. Add a touch of humor or playfulness to make it engaging
7. Use emojis sparingly to add visual interest
8. Keep total message under 2000 characters for Discord
9. Use consistent emoji for each section
10. Format code/technical terms in backticks
11. Include PR numbers in parentheses (e.g., "Fixed login bug (#123)")
## Deployment Notes
When relevant, include:
- Database migrations required
- Environment variable updates needed
- Manual intervention steps post-deploy
- Dependencies that need updating
Your final output should be formatted as follows:
<change_log>
# 🚀 [Daily/Weekly] Change Log: [Current Date]
## 🚨 Breaking Changes (if any)
[List any breaking changes that require immediate attention]
## 🌟 New Features
[List new features here with PR numbers]
## 🐛 Bug Fixes
[List bug fixes here with PR numbers]
## 🛠️ Other Improvements
[List other significant changes or improvements]
## 🙌 Shoutouts
[Mention contributors and their contributions]
## 🎉 Fun Fact of the Day
[Include a brief, work-related fun fact or joke]
</change_log>
## Style Guide Review
Now review the changelog using the EVERY_WRITE_STYLE.md file and go one by one to make sure you are following the style guide. Use multiple agents, run in parallel to make it faster.
Remember, your final output should only include the content within the <change_log> tags. Do not include any of your thought process or the original data in the output.
## Discord Posting (Optional)
You can post changelogs to Discord by adding your own webhook URL:
```
# Set your Discord webhook URL
DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_WEBHOOK_TOKEN"
# Post using curl
curl -H "Content-Type: application/json" \
-d "{\"content\": \"{{CHANGELOG}}\"}" \
$DISCORD_WEBHOOK_URL
```
To get a webhook URL, go to your Discord server → Server Settings → Integrations → Webhooks → New Webhook.
## Error Handling
- If no changes in the time period, post a "quiet day" message: "🌤️ Quiet day! No new changes merged."
- If unable to fetch PR details, list the PR numbers for manual review
- Always validate message length before posting to Discord (max 2000 chars)
## Schedule Recommendations
- Run daily at 6 AM NY time for previous day's changes
- Run weekly summary on Mondays for the previous week
- Special runs after major releases or deployments
## Audience Considerations
Adjust the tone and detail level based on the channel:
- **Dev team channels**: Include technical details, performance metrics, code snippets
- **Product team channels**: Focus on user-facing changes and business impact
- **Leadership channels**: Highlight progress on key initiatives and blockers

View File

@@ -0,0 +1,9 @@
---
name: create-agent-skill
description: Create or edit Claude Code skills with expert guidance on structure and best practices
allowed-tools: Skill(create-agent-skills)
argument-hint: "[skill description or requirements]"
disable-model-invocation: true
---
Invoke the create-agent-skills skill for: $ARGUMENTS

View File

@@ -97,22 +97,11 @@ Access individual args: `$ARGUMENTS[0]` or shorthand `$0`, `$1`, `$2`.
### Dynamic Context Injection
The `` !`command` `` syntax runs shell commands before content is sent to Claude:
Skills support dynamic context injection: prefix a backtick-wrapped shell command with an exclamation mark, and the preprocessor executes it at load time, replacing the directive with stdout. Write an exclamation mark immediately before the opening backtick of the command you want executed (for example, to inject the current git branch, write the exclamation mark followed by `git branch --show-current` wrapped in backticks).
```yaml
---
name: pr-summary
description: Summarize changes in a pull request
context: fork
agent: Explore
---
**Important:** The preprocessor scans the entire SKILL.md as plain text — it does not parse markdown. Directives inside fenced code blocks or inline code spans are still executed. If a skill documents this syntax with literal examples, the preprocessor will attempt to run them, causing load failures. To safely document this feature, describe it in prose (as done here) or place examples in a reference file, which is loaded on-demand by Claude and not preprocessed.
## Context
- PR diff: !`gh pr diff`
- Changed files: !`gh pr diff --name-only`
Summarize this pull request...
```
For a concrete example of dynamic context injection in a skill, see [official-spec.md](references/official-spec.md) § "Dynamic Context Injection".
### Running in a Subagent

View File

@@ -1,5 +1,11 @@
# Workflow: Add a Workflow to Existing Skill
## Interaction Method
If `AskUserQuestion` is available, use it for all prompts below.
If not, present each question as a numbered list and wait for a reply before proceeding to the next step. Never skip or auto-configure.
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md

View File

@@ -1,5 +1,11 @@
# Workflow: Create a New Skill
## Interaction Method
If `AskUserQuestion` is available, use it for all prompts below.
If not, present each question as a numbered list and wait for a reply before proceeding to the next step. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure.
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md

View File

@@ -0,0 +1,544 @@
---
name: deepen-plan
description: Enhance a plan with parallel research agents for each section to add depth, best practices, and implementation details
argument-hint: "[path to plan file]"
---
# Deepen Plan - Power Enhancement Mode
## Introduction
**Note: The current year is 2026.** Use this when searching for recent documentation and best practices.
This command takes an existing plan (from `/ce:plan`) and enhances each section with parallel research agents. Each major element gets its own dedicated research sub-agent to find:
- Best practices and industry patterns
- Performance optimizations
- UI/UX improvements (if applicable)
- Quality enhancements and edge cases
- Real-world implementation examples
The result is a deeply grounded, production-ready plan with concrete implementation details.
## Plan File
<plan_path> #$ARGUMENTS </plan_path>
**If the plan path above is empty:**
1. Check for recent plans: `ls -la docs/plans/`
2. Ask the user: "Which plan would you like to deepen? Please provide the path (e.g., `docs/plans/2026-01-15-feat-my-feature-plan.md`)."
Do not proceed until you have a valid plan file path.
## Main Tasks
### 1. Parse and Analyze Plan Structure
<thinking>
First, read and parse the plan to identify each major section that can be enhanced with research.
</thinking>
**Read the plan file and extract:**
- [ ] Overview/Problem Statement
- [ ] Proposed Solution sections
- [ ] Technical Approach/Architecture
- [ ] Implementation phases/steps
- [ ] Code examples and file references
- [ ] Acceptance criteria
- [ ] Any UI/UX components mentioned
- [ ] Technologies/frameworks mentioned (Rails, React, Python, TypeScript, etc.)
- [ ] Domain areas (data models, APIs, UI, security, performance, etc.)
**Create a section manifest:**
```
Section 1: [Title] - [Brief description of what to research]
Section 2: [Title] - [Brief description of what to research]
...
```
### 2. Discover and Apply Available Skills
<thinking>
Dynamically discover all available skills and match them to plan sections. Don't assume what skills exist - discover them at runtime.
</thinking>
**Step 1: Discover ALL available skills from ALL sources**
```bash
# 1. Project-local skills (highest priority - project-specific)
ls .claude/skills/
# 2. User's global skills (~/.claude/)
ls ~/.claude/skills/
# 3. compound-engineering plugin skills
ls ~/.claude/plugins/cache/*/compound-engineering/*/skills/
# 4. ALL other installed plugins - check every plugin for skills
find ~/.claude/plugins/cache -type d -name "skills" 2>/dev/null
# 5. Also check installed_plugins.json for all plugin locations
cat ~/.claude/plugins/installed_plugins.json
```
**Important:** Check EVERY source. Don't assume compound-engineering is the only plugin. Use skills from ANY installed plugin that's relevant.
**Step 2: For each discovered skill, read its SKILL.md to understand what it does**
```bash
# For each skill directory found, read its documentation
cat [skill-path]/SKILL.md
```
**Step 3: Match skills to plan content**
For each skill discovered:
- Read its SKILL.md description
- Check if any plan sections match the skill's domain
- If there's a match, spawn a sub-agent to apply that skill's knowledge
**Step 4: Spawn a sub-agent for EVERY matched skill**
**CRITICAL: For EACH skill that matches, spawn a separate sub-agent and instruct it to USE that skill.**
For each matched skill:
```
Task general-purpose: "You have the [skill-name] skill available at [skill-path].
YOUR JOB: Use this skill on the plan.
1. Read the skill: cat [skill-path]/SKILL.md
2. Follow the skill's instructions exactly
3. Apply the skill to this content:
[relevant plan section or full plan]
4. Return the skill's full output
The skill tells you what to do - follow it. Execute the skill completely."
```
**Spawn ALL skill sub-agents in PARALLEL:**
- 1 sub-agent per matched skill
- Each sub-agent reads and uses its assigned skill
- All run simultaneously
- 10, 20, 30 skill sub-agents is fine
**Each sub-agent:**
1. Reads its skill's SKILL.md
2. Follows the skill's workflow/instructions
3. Applies the skill to the plan
4. Returns whatever the skill produces (code, recommendations, patterns, reviews, etc.)
**Example spawns:**
```
Task general-purpose: "Use the dhh-rails-style skill at ~/.claude/plugins/.../dhh-rails-style. Read SKILL.md and apply it to: [Rails sections of plan]"
Task general-purpose: "Use the frontend-design skill at ~/.claude/plugins/.../frontend-design. Read SKILL.md and apply it to: [UI sections of plan]"
Task general-purpose: "Use the agent-native-architecture skill at ~/.claude/plugins/.../agent-native-architecture. Read SKILL.md and apply it to: [agent/tool sections of plan]"
Task general-purpose: "Use the security-patterns skill at ~/.claude/skills/security-patterns. Read SKILL.md and apply it to: [full plan]"
```
**No limit on skill sub-agents. Spawn one for every skill that could possibly be relevant.**
### 3. Discover and Apply Learnings/Solutions
<thinking>
Check for documented learnings from /ce:compound. These are solved problems stored as markdown files. Spawn a sub-agent for each learning to check if it's relevant.
</thinking>
**LEARNINGS LOCATION - Check these exact folders:**
```
docs/solutions/ <-- PRIMARY: Project-level learnings (created by /ce:compound)
├── performance-issues/
│ └── *.md
├── debugging-patterns/
│ └── *.md
├── configuration-fixes/
│ └── *.md
├── integration-issues/
│ └── *.md
├── deployment-issues/
│ └── *.md
└── [other-categories]/
└── *.md
```
**Step 1: Find ALL learning markdown files**
Run these commands to get every learning file:
```bash
# PRIMARY LOCATION - Project learnings
find docs/solutions -name "*.md" -type f 2>/dev/null
# If docs/solutions doesn't exist, check alternate locations:
find .claude/docs -name "*.md" -type f 2>/dev/null
find ~/.claude/docs -name "*.md" -type f 2>/dev/null
```
**Step 2: Read frontmatter of each learning to filter**
Each learning file has YAML frontmatter with metadata. Read the first ~20 lines of each file to get:
```yaml
---
title: "N+1 Query Fix for Briefs"
category: performance-issues
tags: [activerecord, n-plus-one, includes, eager-loading]
module: Briefs
symptom: "Slow page load, multiple queries in logs"
root_cause: "Missing includes on association"
---
```
**For each .md file, quickly scan its frontmatter:**
```bash
# Read first 20 lines of each learning (frontmatter + summary)
head -20 docs/solutions/**/*.md
```
**Step 3: Filter - only spawn sub-agents for LIKELY relevant learnings**
Compare each learning's frontmatter against the plan:
- `tags:` - Do any tags match technologies/patterns in the plan?
- `category:` - Is this category relevant? (e.g., skip deployment-issues if plan is UI-only)
- `module:` - Does the plan touch this module?
- `symptom:` / `root_cause:` - Could this problem occur with the plan?
**SKIP learnings that are clearly not applicable:**
- Plan is frontend-only → skip `database-migrations/` learnings
- Plan is Python → skip `rails-specific/` learnings
- Plan has no auth → skip `authentication-issues/` learnings
**SPAWN sub-agents for learnings that MIGHT apply:**
- Any tag overlap with plan technologies
- Same category as plan domain
- Similar patterns or concerns
**Step 4: Spawn sub-agents for filtered learnings**
For each learning that passes the filter:
```
Task general-purpose: "
LEARNING FILE: [full path to .md file]
1. Read this learning file completely
2. This learning documents a previously solved problem
Check if this learning applies to this plan:
---
[full plan content]
---
If relevant:
- Explain specifically how it applies
- Quote the key insight or solution
- Suggest where/how to incorporate it
If NOT relevant after deeper analysis:
- Say 'Not applicable: [reason]'
"
```
**Example filtering:**
```
# Found 15 learning files, plan is about "Rails API caching"
# SPAWN (likely relevant):
docs/solutions/performance-issues/n-plus-one-queries.md # tags: [activerecord] ✓
docs/solutions/performance-issues/redis-cache-stampede.md # tags: [caching, redis] ✓
docs/solutions/configuration-fixes/redis-connection-pool.md # tags: [redis] ✓
# SKIP (clearly not applicable):
docs/solutions/deployment-issues/heroku-memory-quota.md # not about caching
docs/solutions/frontend-issues/stimulus-race-condition.md # plan is API, not frontend
docs/solutions/authentication-issues/jwt-expiry.md # plan has no auth
```
**Spawn sub-agents in PARALLEL for all filtered learnings.**
**These learnings are institutional knowledge - applying them prevents repeating past mistakes.**
### 4. Launch Per-Section Research Agents
<thinking>
For each major section in the plan, spawn dedicated sub-agents to research improvements. Use the Explore agent type for open-ended research.
</thinking>
**For each identified section, launch parallel research:**
```
Task Explore: "Research best practices, patterns, and real-world examples for: [section topic].
Find:
- Industry standards and conventions
- Performance considerations
- Common pitfalls and how to avoid them
- Documentation and tutorials
Return concrete, actionable recommendations."
```
**Also use Context7 MCP for framework documentation:**
For any technologies/frameworks mentioned in the plan, query Context7:
```
mcp__plugin_compound-engineering_context7__resolve-library-id: Find library ID for [framework]
mcp__plugin_compound-engineering_context7__query-docs: Query documentation for specific patterns
```
**Use WebSearch for current best practices:**
Search for recent (2024-2026) articles, blog posts, and documentation on topics in the plan.
### 5. Discover and Run ALL Review Agents
<thinking>
Dynamically discover every available agent and run them ALL against the plan. Don't filter, don't skip, don't assume relevance. 40+ parallel agents is fine. Use everything available.
</thinking>
**Step 1: Discover ALL available agents from ALL sources**
```bash
# 1. Project-local agents (highest priority - project-specific)
find .claude/agents -name "*.md" 2>/dev/null
# 2. User's global agents (~/.claude/)
find ~/.claude/agents -name "*.md" 2>/dev/null
# 3. compound-engineering plugin agents (all subdirectories)
find ~/.claude/plugins/cache/*/compound-engineering/*/agents -name "*.md" 2>/dev/null
# 4. ALL other installed plugins - check every plugin for agents
find ~/.claude/plugins/cache -path "*/agents/*.md" 2>/dev/null
# 5. Check installed_plugins.json to find all plugin locations
cat ~/.claude/plugins/installed_plugins.json
# 6. For local plugins (isLocal: true), check their source directories
# Parse installed_plugins.json and find local plugin paths
```
**Important:** Check EVERY source. Include agents from:
- Project `.claude/agents/`
- User's `~/.claude/agents/`
- compound-engineering plugin (but SKIP workflow/ agents - only use review/, research/, design/, docs/)
- ALL other installed plugins (agent-sdk-dev, frontend-design, etc.)
- Any local plugins
**For compound-engineering plugin specifically:**
- USE: `agents/review/*` (all reviewers)
- USE: `agents/research/*` (all researchers)
- USE: `agents/design/*` (design agents)
- USE: `agents/docs/*` (documentation agents)
- SKIP: `agents/workflow/*` (these are workflow orchestrators, not reviewers)
**Step 2: For each discovered agent, read its description**
Read the first few lines of each agent file to understand what it reviews/analyzes.
**Step 3: Launch ALL agents in parallel**
For EVERY agent discovered, launch a Task in parallel:
```
Task [agent-name]: "Review this plan using your expertise. Apply all your checks and patterns. Plan content: [full plan content]"
```
**CRITICAL RULES:**
- Do NOT filter agents by "relevance" - run them ALL
- Do NOT skip agents because they "might not apply" - let them decide
- Launch ALL agents in a SINGLE message with multiple Task tool calls
- 20, 30, 40 parallel agents is fine - use everything
- Each agent may catch something others miss
- The goal is MAXIMUM coverage, not efficiency
**Step 4: Also discover and run research agents**
Research agents (like `best-practices-researcher`, `framework-docs-researcher`, `git-history-analyzer`, `repo-research-analyst`) should also be run for relevant plan sections.
### 6. Wait for ALL Agents and Synthesize Everything
<thinking>
Wait for ALL parallel agents to complete - skills, research agents, review agents, everything. Then synthesize all findings into a comprehensive enhancement.
</thinking>
**Collect outputs from ALL sources:**
1. **Skill-based sub-agents** - Each skill's full output (code examples, patterns, recommendations)
2. **Learnings/Solutions sub-agents** - Relevant documented learnings from /ce:compound
3. **Research agents** - Best practices, documentation, real-world examples
4. **Review agents** - All feedback from every reviewer (architecture, security, performance, simplicity, etc.)
5. **Context7 queries** - Framework documentation and patterns
6. **Web searches** - Current best practices and articles
**For each agent's findings, extract:**
- [ ] Concrete recommendations (actionable items)
- [ ] Code patterns and examples (copy-paste ready)
- [ ] Anti-patterns to avoid (warnings)
- [ ] Performance considerations (metrics, benchmarks)
- [ ] Security considerations (vulnerabilities, mitigations)
- [ ] Edge cases discovered (handling strategies)
- [ ] Documentation links (references)
- [ ] Skill-specific patterns (from matched skills)
- [ ] Relevant learnings (past solutions that apply - prevent repeating mistakes)
**Deduplicate and prioritize:**
- Merge similar recommendations from multiple agents
- Prioritize by impact (high-value improvements first)
- Flag conflicting advice for human review
- Group by plan section
### 7. Enhance Plan Sections
<thinking>
Merge research findings back into the plan, adding depth without changing the original structure.
</thinking>
**Enhancement format for each section:**
```markdown
## [Original Section Title]
[Original content preserved]
### Research Insights
**Best Practices:**
- [Concrete recommendation 1]
- [Concrete recommendation 2]
**Performance Considerations:**
- [Optimization opportunity]
- [Benchmark or metric to target]
**Implementation Details:**
```[language]
// Concrete code example from research
```
**Edge Cases:**
- [Edge case 1 and how to handle]
- [Edge case 2 and how to handle]
**References:**
- [Documentation URL 1]
- [Documentation URL 2]
```
### 8. Add Enhancement Summary
At the top of the plan, add a summary section:
```markdown
## Enhancement Summary
**Deepened on:** [Date]
**Sections enhanced:** [Count]
**Research agents used:** [List]
### Key Improvements
1. [Major improvement 1]
2. [Major improvement 2]
3. [Major improvement 3]
### New Considerations Discovered
- [Important finding 1]
- [Important finding 2]
```
### 9. Update Plan File
**Write the enhanced plan:**
- Preserve original filename
- Add `-deepened` suffix if user prefers a new file
- Update any timestamps or metadata
## Output Format
Update the plan file in place (or if user requests a separate file, append `-deepened` after `-plan`, e.g., `2026-01-15-feat-auth-plan-deepened.md`).
## Quality Checks
Before finalizing:
- [ ] All original content preserved
- [ ] Research insights clearly marked and attributed
- [ ] Code examples are syntactically correct
- [ ] Links are valid and relevant
- [ ] No contradictions between sections
- [ ] Enhancement summary accurately reflects changes
## Post-Enhancement Options
After writing the enhanced plan, use the **AskUserQuestion tool** to present these options:
**Question:** "Plan deepened at `[plan_path]`. What would you like to do next?"
**Options:**
1. **View diff** - Show what was added/changed
2. **Start `/ce:work`** - Begin implementing this enhanced plan
3. **Deepen further** - Run another round of research on specific sections
4. **Revert** - Restore original plan (if backup exists)
Based on selection:
- **View diff** → Run `git diff [plan_path]` or show before/after
- **`/ce:work`** → Call the /ce:work command with the plan file path
- **Deepen further** → Ask which sections need more research, then re-run those agents
- **Revert** → Restore from git or backup
## Example Enhancement
**Before (from /workflows:plan):**
```markdown
## Technical Approach
Use React Query for data fetching with optimistic updates.
```
**After (from /workflows:deepen-plan):**
```markdown
## Technical Approach
Use React Query for data fetching with optimistic updates.
### Research Insights
**Best Practices:**
- Configure `staleTime` and `cacheTime` based on data freshness requirements
- Use `queryKey` factories for consistent cache invalidation
- Implement error boundaries around query-dependent components
**Performance Considerations:**
- Enable `refetchOnWindowFocus: false` for stable data to reduce unnecessary requests
- Use `select` option to transform and memoize data at query level
- Consider `placeholderData` for instant perceived loading
**Implementation Details:**
```typescript
// Recommended query configuration
const queryClient = new QueryClient({
defaultOptions: {
queries: {
staleTime: 5 * 60 * 1000, // 5 minutes
retry: 2,
refetchOnWindowFocus: false,
},
},
});
```
**Edge Cases:**
- Handle race conditions with `cancelQueries` on component unmount
- Implement retry logic for transient network failures
- Consider offline support with `persistQueryClient`
**References:**
- https://tanstack.com/query/latest/docs/react/guides/optimistic-updates
- https://tkdodo.eu/blog/practical-react-query
```
NEVER CODE! Just research and enhance the plan.

View File

@@ -0,0 +1,112 @@
---
name: deploy-docs
description: Validate and prepare documentation for GitHub Pages deployment
disable-model-invocation: true
---
# Deploy Documentation Command
Validate the documentation site and prepare it for GitHub Pages deployment.
## Step 1: Validate Documentation
Run these checks:
```bash
# Count components
echo "Agents: $(ls plugins/compound-engineering/agents/*.md | wc -l)"
echo "Skills: $(ls -d plugins/compound-engineering/skills/*/ 2>/dev/null | wc -l)"
# Validate JSON
cat .claude-plugin/marketplace.json | jq . > /dev/null && echo "✓ marketplace.json valid"
cat plugins/compound-engineering/.claude-plugin/plugin.json | jq . > /dev/null && echo "✓ plugin.json valid"
# Check all HTML files exist
for page in index agents commands skills mcp-servers changelog getting-started; do
if [ -f "plugins/compound-engineering/docs/pages/${page}.html" ] || [ -f "plugins/compound-engineering/docs/${page}.html" ]; then
echo "${page}.html exists"
else
echo "${page}.html MISSING"
fi
done
```
## Step 2: Check for Uncommitted Changes
```bash
git status --porcelain plugins/compound-engineering/docs/
```
If there are uncommitted changes, warn the user to commit first.
## Step 3: Deployment Instructions
Since GitHub Pages deployment requires a workflow file with special permissions, provide these instructions:
### First-time Setup
1. Create `.github/workflows/deploy-docs.yml` with the GitHub Pages workflow
2. Go to repository Settings > Pages
3. Set Source to "GitHub Actions"
### Deploying
After merging to `main`, the docs will auto-deploy. Or:
1. Go to Actions tab
2. Select "Deploy Documentation to GitHub Pages"
3. Click "Run workflow"
### Workflow File Content
```yaml
name: Deploy Documentation to GitHub Pages
on:
push:
branches: [main]
paths:
- 'plugins/compound-engineering/docs/**'
workflow_dispatch:
permissions:
contents: read
pages: write
id-token: write
concurrency:
group: "pages"
cancel-in-progress: false
jobs:
deploy:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/configure-pages@v4
- uses: actions/upload-pages-artifact@v3
with:
path: 'plugins/compound-engineering/docs'
- uses: actions/deploy-pages@v4
```
## Step 4: Report Status
Provide a summary:
```
## Deployment Readiness
✓ All HTML pages present
✓ JSON files valid
✓ Component counts match
### Next Steps
- [ ] Commit any pending changes
- [ ] Push to main branch
- [ ] Verify GitHub Pages workflow exists
- [ ] Check deployment at https://everyinc.github.io/compound-engineering-plugin/
```

View File

@@ -36,7 +36,7 @@ Score the document against these criteria:
| **Specificity** | Concrete enough for next step (brainstorm → can plan, plan → can implement) |
| **YAGNI** | No hypothetical features, simplest approach chosen |
If invoked within a workflow (after `/workflows:brainstorm` or `/workflows:plan`), also check:
If invoked within a workflow (after `/ce:brainstorm` or `/ce:plan`), also check:
- **User intent fidelity** — Document reflects what was discussed, assumptions validated
## Step 4: Identify the Critical Improvement

View File

@@ -0,0 +1,351 @@
---
name: feature-video
description: Record a video walkthrough of a feature and add it to the PR description
argument-hint: "[PR number or 'current'] [optional: base URL, default localhost:3000]"
---
# Feature Video Walkthrough
<command_purpose>Record a video walkthrough demonstrating a feature, upload it, and add it to the PR description.</command_purpose>
## Introduction
<role>Developer Relations Engineer creating feature demo videos</role>
This command creates professional video walkthroughs of features for PR documentation:
- Records browser interactions using agent-browser CLI
- Demonstrates the complete user flow
- Uploads the video for easy sharing
- Updates the PR description with an embedded video
## Prerequisites
<requirements>
- Local development server running (e.g., `bin/dev`, `rails server`)
- agent-browser CLI installed
- Git repository with a PR to document
- `ffmpeg` installed (for video conversion)
- `rclone` configured (optional, for cloud upload - see rclone skill)
- Public R2 base URL known (for example, `https://<public-domain>.r2.dev`)
</requirements>
## Setup
**Check installation:**
```bash
command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED"
```
**Install if needed:**
```bash
npm install -g agent-browser && agent-browser install
```
See the `agent-browser` skill for detailed usage.
## Main Tasks
### 1. Parse Arguments
<parse_args>
**Arguments:** $ARGUMENTS
Parse the input:
- First argument: PR number or "current" (defaults to current branch's PR)
- Second argument: Base URL (defaults to `http://localhost:3000`)
```bash
# Get PR number for current branch if needed
gh pr view --json number -q '.number'
```
</parse_args>
### 2. Gather Feature Context
<gather_context>
**Get PR details:**
```bash
gh pr view [number] --json title,body,files,headRefName -q '.'
```
**Get changed files:**
```bash
gh pr view [number] --json files -q '.files[].path'
```
**Map files to testable routes** (same as playwright-test):
| File Pattern | Route(s) |
|-------------|----------|
| `app/views/users/*` | `/users`, `/users/:id`, `/users/new` |
| `app/controllers/settings_controller.rb` | `/settings` |
| `app/javascript/controllers/*_controller.js` | Pages using that Stimulus controller |
| `app/components/*_component.rb` | Pages rendering that component |
</gather_context>
### 3. Plan the Video Flow
<plan_flow>
Before recording, create a shot list:
1. **Opening shot**: Homepage or starting point (2-3 seconds)
2. **Navigation**: How user gets to the feature
3. **Feature demonstration**: Core functionality (main focus)
4. **Edge cases**: Error states, validation, etc. (if applicable)
5. **Success state**: Completed action/result
Ask user to confirm or adjust the flow:
```markdown
**Proposed Video Flow**
Based on PR #[number]: [title]
1. Start at: /[starting-route]
2. Navigate to: /[feature-route]
3. Demonstrate:
- [Action 1]
- [Action 2]
- [Action 3]
4. Show result: [success state]
Estimated duration: ~[X] seconds
Does this look right?
1. Yes, start recording
2. Modify the flow (describe changes)
3. Add specific interactions to demonstrate
```
</plan_flow>
### 4. Setup Video Recording
<setup_recording>
**Create videos directory:**
```bash
mkdir -p tmp/videos
```
**Recording approach: Use browser screenshots as frames**
agent-browser captures screenshots at key moments, then combine into video using ffmpeg:
```bash
ffmpeg -framerate 2 -pattern_type glob -i 'tmp/screenshots/*.png' -vf "scale=1280:-1" tmp/videos/feature-demo.gif
```
</setup_recording>
### 5. Record the Walkthrough
<record_walkthrough>
Execute the planned flow, capturing each step:
**Step 1: Navigate to starting point**
```bash
agent-browser open "[base-url]/[start-route]"
agent-browser wait 2000
agent-browser screenshot tmp/screenshots/01-start.png
```
**Step 2: Perform navigation/interactions**
```bash
agent-browser snapshot -i # Get refs
agent-browser click @e1 # Click navigation element
agent-browser wait 1000
agent-browser screenshot tmp/screenshots/02-navigate.png
```
**Step 3: Demonstrate feature**
```bash
agent-browser snapshot -i # Get refs for feature elements
agent-browser click @e2 # Click feature element
agent-browser wait 1000
agent-browser screenshot tmp/screenshots/03-feature.png
```
**Step 4: Capture result**
```bash
agent-browser wait 2000
agent-browser screenshot tmp/screenshots/04-result.png
```
**Create video/GIF from screenshots:**
```bash
# Create directories
mkdir -p tmp/videos tmp/screenshots
# Create MP4 video (RECOMMENDED - better quality, smaller size)
# -framerate 0.5 = 2 seconds per frame (slower playback)
# -framerate 1 = 1 second per frame
ffmpeg -y -framerate 0.5 -pattern_type glob -i 'tmp/screenshots/*.png' \
-c:v libx264 -pix_fmt yuv420p -vf "scale=1280:-2" \
tmp/videos/feature-demo.mp4
# Create low-quality GIF for preview (small file, for GitHub embed)
ffmpeg -y -framerate 0.5 -pattern_type glob -i 'tmp/screenshots/*.png' \
-vf "scale=640:-1:flags=lanczos,split[s0][s1];[s0]palettegen=max_colors=128[p];[s1][p]paletteuse" \
-loop 0 tmp/videos/feature-demo-preview.gif
```
**Note:**
- The `-2` in MP4 scale ensures height is divisible by 2 (required for H.264)
- Preview GIF uses 640px width and 128 colors to keep file size small (~100-200KB)
</record_walkthrough>
### 6. Upload the Video
<upload_video>
**Upload with rclone:**
```bash
# Check rclone is configured
rclone listremotes
# Set your public base URL (NO trailing slash)
PUBLIC_BASE_URL="https://<your-public-r2-domain>.r2.dev"
# Upload video, preview GIF, and screenshots to cloud storage
# Use --s3-no-check-bucket to avoid permission errors
rclone copy tmp/videos/ r2:kieran-claude/pr-videos/pr-[number]/ --s3-no-check-bucket --progress
rclone copy tmp/screenshots/ r2:kieran-claude/pr-videos/pr-[number]/screenshots/ --s3-no-check-bucket --progress
# List uploaded files
rclone ls r2:kieran-claude/pr-videos/pr-[number]/
# Build and validate public URLs BEFORE updating PR
VIDEO_URL="$PUBLIC_BASE_URL/pr-videos/pr-[number]/feature-demo.mp4"
PREVIEW_URL="$PUBLIC_BASE_URL/pr-videos/pr-[number]/feature-demo-preview.gif"
curl -I "$VIDEO_URL"
curl -I "$PREVIEW_URL"
# Require HTTP 200 for both URLs; stop if either fails
curl -I "$VIDEO_URL" | head -n 1 | grep -q ' 200 ' || exit 1
curl -I "$PREVIEW_URL" | head -n 1 | grep -q ' 200 ' || exit 1
```
</upload_video>
### 7. Update PR Description
<update_pr>
**Get current PR body:**
```bash
gh pr view [number] --json body -q '.body'
```
**Add video section to PR description:**
If the PR already has a video section, replace it. Otherwise, append:
**IMPORTANT:** GitHub cannot embed external MP4s directly. Use a clickable GIF that links to the video:
```markdown
## Demo
[![Feature Demo]([preview-gif-url])]([video-mp4-url])
*Click to view full video*
```
Example:
```markdown
[![Feature Demo](https://<your-public-r2-domain>.r2.dev/pr-videos/pr-137/feature-demo-preview.gif)](https://<your-public-r2-domain>.r2.dev/pr-videos/pr-137/feature-demo.mp4)
```
**Update the PR:**
```bash
gh pr edit [number] --body "[updated body with video section]"
```
**Or add as a comment if preferred:**
```bash
gh pr comment [number] --body "## Feature Demo
![Demo]([video-url])
_Automated walkthrough of the changes in this PR_"
```
</update_pr>
### 8. Cleanup
<cleanup>
```bash
# Optional: Clean up screenshots
rm -rf tmp/screenshots
# Keep videos for reference
echo "Video retained at: tmp/videos/feature-demo.gif"
```
</cleanup>
### 9. Summary
<summary>
Present completion summary:
```markdown
## Feature Video Complete
**PR:** #[number] - [title]
**Video:** [url or local path]
**Duration:** ~[X] seconds
**Format:** [GIF/MP4]
### Shots Captured
1. [Starting point] - [description]
2. [Navigation] - [description]
3. [Feature demo] - [description]
4. [Result] - [description]
### PR Updated
- [x] Video section added to PR description
- [ ] Ready for review
**Next steps:**
- Review the video to ensure it accurately demonstrates the feature
- Share with reviewers for context
```
</summary>
## Quick Usage Examples
```bash
# Record video for current branch's PR
/feature-video
# Record video for specific PR
/feature-video 847
# Record with custom base URL
/feature-video 847 http://localhost:5000
# Record for staging environment
/feature-video current https://staging.example.com
```
## Tips
- **Keep it short**: 10-30 seconds is ideal for PR demos
- **Focus on the change**: Don't include unrelated UI
- **Show before/after**: If fixing a bug, show the broken state first (if possible)
- **Annotate if needed**: Add text overlays for complex features

View File

@@ -192,7 +192,7 @@ Work logs serve as:
| Trigger | Flow | Tool |
|---------|------|------|
| Code review | `/workflows:review` → Findings → `/triage` → Todos | Review agent + skill |
| Code review | `/ce:review` → Findings → `/triage` → Todos | Review agent + skill |
| PR comments | `/resolve_pr_parallel` → Individual fixes → Todos | gh CLI + skill |
| Code TODOs | `/resolve_todo_parallel` → Fixes + Complex todos | Agent + skill |
| Planning | Brainstorm → Create todo → Work → Complete | Skill |

View File

@@ -0,0 +1,163 @@
---
name: generate_command
description: Create a new custom slash command following conventions and best practices
argument-hint: "[command purpose and requirements]"
disable-model-invocation: true
---
# Create a Custom Claude Code Command
Create a new skill in `.claude/skills/` for the requested task.
## Goal
#$ARGUMENTS
## Key Capabilities to Leverage
**File Operations:**
- Read, Edit, Write - modify files precisely
- Glob, Grep - search codebase
- MultiEdit - atomic multi-part changes
**Development:**
- Bash - run commands (git, tests, linters)
- Task - launch specialized agents for complex tasks
- TodoWrite - track progress with todo lists
**Web & APIs:**
- WebFetch, WebSearch - research documentation
- GitHub (gh cli) - PRs, issues, reviews
- Playwright - browser automation, screenshots
**Integrations:**
- AppSignal - logs and monitoring
- Context7 - framework docs
- Stripe, Todoist, Featurebase (if relevant)
## Best Practices
1. **Be specific and clear** - detailed instructions yield better results
2. **Break down complex tasks** - use step-by-step plans
3. **Use examples** - reference existing code patterns
4. **Include success criteria** - tests pass, linting clean, etc.
5. **Think first** - use "think hard" or "plan" keywords for complex problems
6. **Iterate** - guide the process step by step
## Required: YAML Frontmatter
**EVERY command MUST start with YAML frontmatter:**
```yaml
---
name: command-name
description: Brief description of what this command does (max 100 chars)
argument-hint: "[what arguments the command accepts]"
---
```
**Fields:**
- `name`: Lowercase command identifier (used internally)
- `description`: Clear, concise summary of command purpose
- `argument-hint`: Shows user what arguments are expected (e.g., `[file path]`, `[PR number]`, `[optional: format]`)
## Structure Your Command
```markdown
# [Command Name]
[Brief description of what this command does]
## Steps
1. [First step with specific details]
- Include file paths, patterns, or constraints
- Reference existing code if applicable
2. [Second step]
- Use parallel tool calls when possible
- Check/verify results
3. [Final steps]
- Run tests
- Lint code
- Commit changes (if appropriate)
## Success Criteria
- [ ] Tests pass
- [ ] Code follows style guide
- [ ] Documentation updated (if needed)
```
## Tips for Effective Commands
- **Use $ARGUMENTS** placeholder for dynamic inputs
- **Reference CLAUDE.md** patterns and conventions
- **Include verification steps** - tests, linting, visual checks
- **Be explicit about constraints** - don't modify X, use pattern Y
- **Use XML tags** for structured prompts: `<task>`, `<requirements>`, `<constraints>`
## Example Pattern
```markdown
Implement #$ARGUMENTS following these steps:
1. Research existing patterns
- Search for similar code using Grep
- Read relevant files to understand approach
2. Plan the implementation
- Think through edge cases and requirements
- Consider test cases needed
3. Implement
- Follow existing code patterns (reference specific files)
- Write tests first if doing TDD
- Ensure code follows CLAUDE.md conventions
4. Verify
- Run tests: `bin/rails test`
- Run linter: `bundle exec standardrb`
- Check changes with git diff
5. Commit (optional)
- Stage changes
- Write clear commit message
```
## Creating the Command File
1. **Create the directory** at `.claude/skills/[name]/SKILL.md`
2. **Start with YAML frontmatter** (see section above)
3. **Structure the skill** using the template above
4. **Test the skill** by using it with appropriate arguments
## Command File Template
```markdown
---
name: command-name
description: What this command does
argument-hint: "[expected arguments]"
---
# Command Title
Brief introduction of what the command does and when to use it.
## Workflow
### Step 1: [First Major Step]
Details about what to do.
### Step 2: [Second Major Step]
Details about what to do.
## Success Criteria
- [ ] Expected outcome 1
- [ ] Expected outcome 2
```

View File

@@ -38,8 +38,8 @@ git worktree add .worktrees/feature-name -b feature-name main
Use this skill in these scenarios:
1. **Code Review (`/workflows:review`)**: If NOT already on the target branch (PR branch or requested branch), offer worktree for isolated review
2. **Feature Work (`/workflows:work`)**: Always ask if user wants parallel worktree or live branch work
1. **Code Review (`/ce:review`)**: If NOT already on the target branch (PR branch or requested branch), offer worktree for isolated review
2. **Feature Work (`/ce:work`)**: Always ask if user wants parallel worktree or live branch work
3. **Parallel Development**: When working on multiple features simultaneously
4. **Cleanup**: After completing work in a worktree
@@ -47,7 +47,7 @@ Use this skill in these scenarios:
### In Claude Code Workflows
The skill is automatically called from `/workflows:review` and `/workflows:work` commands:
The skill is automatically called from `/ce:review` and `/ce:work` commands:
```
# For review: offers worktree if not on PR branch
@@ -204,7 +204,7 @@ bash ${CLAUDE_PLUGIN_ROOT}/skills/git-worktree/scripts/worktree-manager.sh clean
## Integration with Workflows
### `/workflows:review`
### `/ce:review`
Instead of always creating a worktree:
@@ -217,7 +217,7 @@ Instead of always creating a worktree:
- no → proceed with PR diff on current branch
```
### `/workflows:work`
### `/ce:work`
Always offer choice:

View File

@@ -0,0 +1,143 @@
---
name: heal-skill
description: Fix incorrect SKILL.md files when a skill has wrong instructions or outdated API references
argument-hint: "[optional: specific issue to fix]"
allowed-tools: [Read, Edit, Bash(ls:*), Bash(git:*)]
disable-model-invocation: true
---
<objective>
Update a skill's SKILL.md and related files based on corrections discovered during execution.
Analyze the conversation to detect which skill is running, reflect on what went wrong, propose specific fixes, get user approval, then apply changes with optional commit.
</objective>
<context>
Skill detection: !`ls -1 ./skills/*/SKILL.md | head -5`
</context>
<quick_start>
<workflow>
1. **Detect skill** from conversation context (invocation messages, recent SKILL.md references)
2. **Reflect** on what went wrong and how you discovered the fix
3. **Present** proposed changes with before/after diffs
4. **Get approval** before making any edits
5. **Apply** changes and optionally commit
</workflow>
</quick_start>
<process>
<step_1 name="detect_skill">
Identify the skill from conversation context:
- Look for skill invocation messages
- Check which SKILL.md was recently referenced
- Examine current task context
Set: `SKILL_NAME=[skill-name]` and `SKILL_DIR=./skills/$SKILL_NAME`
If unclear, ask the user.
</step_1>
<step_2 name="reflection_and_analysis">
Focus on $ARGUMENTS if provided, otherwise analyze broader context.
Determine:
- **What was wrong**: Quote specific sections from SKILL.md that are incorrect
- **Discovery method**: Context7, error messages, trial and error, documentation lookup
- **Root cause**: Outdated API, incorrect parameters, wrong endpoint, missing context
- **Scope of impact**: Single section or multiple? Related files affected?
- **Proposed fix**: Which files, which sections, before/after for each
</step_2>
<step_3 name="scan_affected_files">
```bash
ls -la $SKILL_DIR/
ls -la $SKILL_DIR/references/ 2>/dev/null
ls -la $SKILL_DIR/scripts/ 2>/dev/null
```
</step_3>
<step_4 name="present_proposed_changes">
Present changes in this format:
```
**Skill being healed:** [skill-name]
**Issue discovered:** [1-2 sentence summary]
**Root cause:** [brief explanation]
**Files to be modified:**
- [ ] SKILL.md
- [ ] references/[file].md
- [ ] scripts/[file].py
**Proposed changes:**
### Change 1: SKILL.md - [Section name]
**Location:** Line [X] in SKILL.md
**Current (incorrect):**
```
[exact text from current file]
```
**Corrected:**
```
[new text]
```
**Reason:** [why this fixes the issue]
[repeat for each change across all files]
**Impact assessment:**
- Affects: [authentication/API endpoints/parameters/examples/etc.]
**Verification:**
These changes will prevent: [specific error that prompted this]
```
</step_4>
<step_5 name="request_approval">
```
Should I apply these changes?
1. Yes, apply and commit all changes
2. Apply but don't commit (let me review first)
3. Revise the changes (I'll provide feedback)
4. Cancel (don't make changes)
Choose (1-4):
```
**Wait for user response. Do not proceed without approval.**
</step_5>
<step_6 name="apply_changes">
Only after approval (option 1 or 2):
1. Use Edit tool for each correction across all files
2. Read back modified sections to verify
3. If option 1, commit with structured message showing what was healed
4. Confirm completion with file list
</step_6>
</process>
<success_criteria>
- Skill correctly detected from conversation context
- All incorrect sections identified with before/after
- User approved changes before application
- All edits applied across SKILL.md and related files
- Changes verified by reading back
- Commit created if user chose option 1
- Completion confirmed with file list
</success_criteria>
<verification>
Before completing:
- Read back each modified section to confirm changes applied
- Ensure cross-file consistency (SKILL.md examples match references/)
- Verify git commit created if option 1 was selected
- Check no unintended files were modified
</verification>

View File

@@ -0,0 +1,34 @@
---
name: lfg
description: Full autonomous engineering workflow
argument-hint: "[feature description]"
disable-model-invocation: true
---
CRITICAL: You MUST execute every step below IN ORDER. Do NOT skip any step. Do NOT jump ahead to coding or implementation. The plan phase (steps 2-3) MUST be completed and verified BEFORE any work begins. Violating this order produces bad output.
1. **Optional:** If the `ralph-wiggum` skill is available, run `/ralph-wiggum:ralph-loop "finish all slash commands" --completion-promise "DONE"`. If not available or it fails, skip and continue to step 2 immediately.
2. `/ce:plan $ARGUMENTS`
GATE: STOP. Verify that `/ce:plan` produced a plan file in `docs/plans/`. If no plan file was created, run `/ce:plan $ARGUMENTS` again. Do NOT proceed to step 3 until a written plan exists.
3. `/compound-engineering:deepen-plan`
GATE: STOP. Confirm the plan has been deepened and updated. The plan file in `docs/plans/` should now contain additional detail. Do NOT proceed to step 4 without a deepened plan.
4. `/ce:work`
GATE: STOP. Verify that implementation work was performed - files were created or modified beyond the plan. Do NOT proceed to step 5 if no code changes were made.
5. `/ce:review`
6. `/compound-engineering:resolve_todo_parallel`
7. `/compound-engineering:test-browser`
8. `/compound-engineering:feature-video`
9. Output `<promise>DONE</promise>` when video is in PR
Start with step 2 now (or step 1 if ralph-wiggum is available). Remember: plan FIRST, then work. Never skip the plan.

View File

@@ -0,0 +1,185 @@
---
name: proof
description: Create, edit, comment on, and share markdown documents via Proof's web API and local bridge. Use when asked to "proof", "share a doc", "create a proof doc", "comment on a document", "suggest edits", "review in proof", or when given a proofeditor.ai URL.
allowed-tools:
- Bash
- Read
- Write
- WebFetch
---
# Proof - Collaborative Markdown Editor
Proof is a collaborative document editor for humans and agents. It supports two modes:
1. **Web API** - Create and edit shared documents via HTTP (no install needed)
2. **Local Bridge** - Drive the macOS Proof app via localhost:9847
## Web API (Primary for Sharing)
### Create a Shared Document
No authentication required. Returns a shareable URL with access token.
```bash
curl -X POST https://www.proofeditor.ai/share/markdown \
-H "Content-Type: application/json" \
-d '{"title":"My Doc","markdown":"# Hello\n\nContent here."}'
```
**Response format:**
```json
{
"slug": "abc123",
"tokenUrl": "https://www.proofeditor.ai/d/abc123?token=xxx",
"accessToken": "xxx",
"ownerSecret": "yyy",
"_links": {
"state": "https://www.proofeditor.ai/api/agent/abc123/state",
"ops": "https://www.proofeditor.ai/api/agent/abc123/ops"
}
}
```
Use the `tokenUrl` as the shareable link. The `_links` give you the exact API paths.
### Read a Shared Document
```bash
curl -s "https://www.proofeditor.ai/api/agent/{slug}/state" \
-H "x-share-token: <token>"
```
### Edit a Shared Document
All operations go to `POST https://www.proofeditor.ai/api/agent/{slug}/ops`
**Note:** Use the `/api/agent/{slug}/ops` path (from `_links` in create response), NOT `/api/documents/{slug}/ops`.
**Authentication for protected docs:**
- Header: `x-share-token: <token>` or `Authorization: Bearer <token>`
- Token comes from the URL parameter: `?token=xxx` or the `accessToken` from create response
**Comment on text:**
```json
{"op": "comment.add", "quote": "text to comment on", "by": "ai:<agent-name>", "text": "Your comment here"}
```
**Reply to a comment:**
```json
{"op": "comment.reply", "markId": "<id>", "by": "ai:<agent-name>", "text": "Reply text"}
```
**Resolve a comment:**
```json
{"op": "comment.resolve", "markId": "<id>", "by": "ai:<agent-name>"}
```
**Suggest a replacement:**
```json
{"op": "suggestion.add", "kind": "replace", "quote": "original text", "by": "ai:<agent-name>", "content": "replacement text"}
```
**Suggest a deletion:**
```json
{"op": "suggestion.add", "kind": "delete", "quote": "text to delete", "by": "ai:<agent-name>"}
```
**Bulk rewrite:**
```json
{"op": "rewrite.apply", "content": "full new markdown", "by": "ai:<agent-name>"}
```
### Known Limitations (Web API)
- `suggestion.add` with `kind: "insert"` returns Bad Request on the web ops endpoint. Use `kind: "replace"` with a broader quote instead, or use `rewrite.apply` for insertions.
- Bridge-style endpoints (`/d/{slug}/bridge/*`) require client version headers (`x-proof-client-version`, `x-proof-client-build`, `x-proof-client-protocol`) and return 426 CLIENT_UPGRADE_REQUIRED without them. Use the `/api/agent/{slug}/ops` endpoint instead.
## Local Bridge (macOS App)
Requires Proof.app running. Bridge at `http://localhost:9847`.
**Required headers:**
- `X-Agent-Id: claude` (identity for presence)
- `Content-Type: application/json`
- `X-Window-Id: <uuid>` (when multiple docs open)
### Key Endpoints
| Method | Endpoint | Purpose |
|--------|----------|---------|
| GET | `/windows` | List open documents |
| GET | `/state` | Read markdown, cursor, word count |
| GET | `/marks` | List all suggestions and comments |
| POST | `/marks/suggest-replace` | `{"quote":"old","by":"ai:<agent-name>","content":"new"}` |
| POST | `/marks/suggest-insert` | `{"quote":"after this","by":"ai:<agent-name>","content":"insert"}` |
| POST | `/marks/suggest-delete` | `{"quote":"delete this","by":"ai:<agent-name>"}` |
| POST | `/marks/comment` | `{"quote":"text","by":"ai:<agent-name>","text":"comment"}` |
| POST | `/marks/reply` | `{"markId":"<id>","by":"ai:<agent-name>","text":"reply"}` |
| POST | `/marks/resolve` | `{"markId":"<id>","by":"ai:<agent-name>"}` |
| POST | `/marks/accept` | `{"markId":"<id>"}` |
| POST | `/marks/reject` | `{"markId":"<id>"}` |
| POST | `/rewrite` | `{"content":"full markdown","by":"ai:<agent-name>"}` |
| POST | `/presence` | `{"status":"reading","summary":"..."}` |
| GET | `/events/pending` | Poll for user actions |
### Presence Statuses
`thinking`, `reading`, `idle`, `acting`, `waiting`, `completed`
## Workflow: Review a Shared Document
When given a Proof URL like `https://www.proofeditor.ai/d/abc123?token=xxx`:
1. Extract the slug (`abc123`) and token from the URL
2. Read the document state via the API
3. Add comments or suggest edits using the ops endpoint
4. The author sees changes in real-time
```bash
# Read
curl -s "https://www.proofeditor.ai/api/agent/abc123/state" \
-H "x-share-token: xxx"
# Comment
curl -X POST "https://www.proofeditor.ai/api/agent/abc123/ops" \
-H "Content-Type: application/json" \
-H "x-share-token: xxx" \
-d '{"op":"comment.add","quote":"text","by":"ai:compound","text":"comment"}'
# Suggest edit
curl -X POST "https://www.proofeditor.ai/api/agent/abc123/ops" \
-H "Content-Type: application/json" \
-H "x-share-token: xxx" \
-d '{"op":"suggestion.add","kind":"replace","quote":"old","by":"ai:compound","content":"new"}'
```
## Workflow: Create and Share a New Document
```bash
# 1. Create
RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
-H "Content-Type: application/json" \
-d '{"title":"My Doc","markdown":"# Title\n\nContent here."}')
# 2. Extract URL and token
URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
SLUG=$(echo "$RESPONSE" | jq -r '.slug')
TOKEN=$(echo "$RESPONSE" | jq -r '.accessToken')
# 3. Share the URL
echo "$URL"
# 4. Make edits using the ops endpoint
curl -X POST "https://www.proofeditor.ai/api/agent/$SLUG/ops" \
-H "Content-Type: application/json" \
-H "x-share-token: $TOKEN" \
-d '{"op":"comment.add","quote":"Content here","by":"ai:compound","text":"Added a note"}'
```
## Safety
- Use `/state` content as source of truth before editing
- Prefer suggest-replace over full rewrite for small changes
- Don't span table cells in a single replace
- Always include `by` field for attribution tracking

View File

@@ -0,0 +1,151 @@
---
name: report-bug
description: Report a bug in the compound-engineering plugin
argument-hint: "[optional: brief description of the bug]"
disable-model-invocation: true
---
# Report a Compounding Engineering Plugin Bug
Report bugs encountered while using the compound-engineering plugin. This command gathers structured information and creates a GitHub issue for the maintainer.
## Step 1: Gather Bug Information
Use the AskUserQuestion tool to collect the following information:
**Question 1: Bug Category**
- What type of issue are you experiencing?
- Options: Agent not working, Command not working, Skill not working, MCP server issue, Installation problem, Other
**Question 2: Specific Component**
- Which specific component is affected?
- Ask for the name of the agent, command, skill, or MCP server
**Question 3: What Happened (Actual Behavior)**
- Ask: "What happened when you used this component?"
- Get a clear description of the actual behavior
**Question 4: What Should Have Happened (Expected Behavior)**
- Ask: "What did you expect to happen instead?"
- Get a clear description of expected behavior
**Question 5: Steps to Reproduce**
- Ask: "What steps did you take before the bug occurred?"
- Get reproduction steps
**Question 6: Error Messages**
- Ask: "Did you see any error messages? If so, please share them."
- Capture any error output
## Step 2: Collect Environment Information
Automatically gather:
```bash
# Get plugin version
cat ~/.claude/plugins/installed_plugins.json 2>/dev/null | grep -A5 "compound-engineering" | head -10 || echo "Plugin info not found"
# Get Claude Code version
claude --version 2>/dev/null || echo "Claude CLI version unknown"
# Get OS info
uname -a
```
## Step 3: Format the Bug Report
Create a well-structured bug report with:
```markdown
## Bug Description
**Component:** [Type] - [Name]
**Summary:** [Brief description from argument or collected info]
## Environment
- **Plugin Version:** [from installed_plugins.json]
- **Claude Code Version:** [from claude --version]
- **OS:** [from uname]
## What Happened
[Actual behavior description]
## Expected Behavior
[Expected behavior description]
## Steps to Reproduce
1. [Step 1]
2. [Step 2]
3. [Step 3]
## Error Messages
```
[Any error output]
```
## Additional Context
[Any other relevant information]
---
*Reported via `/report-bug` command*
```
## Step 4: Create GitHub Issue
Use the GitHub CLI to create the issue:
```bash
gh issue create \
--repo EveryInc/compound-engineering-plugin \
--title "[compound-engineering] Bug: [Brief description]" \
--body "[Formatted bug report from Step 3]" \
--label "bug,compound-engineering"
```
**Note:** If labels don't exist, create without labels:
```bash
gh issue create \
--repo EveryInc/compound-engineering-plugin \
--title "[compound-engineering] Bug: [Brief description]" \
--body "[Formatted bug report]"
```
## Step 5: Confirm Submission
After the issue is created:
1. Display the issue URL to the user
2. Thank them for reporting the bug
3. Let them know the maintainer (Kieran Klaassen) will be notified
## Output Format
```
✅ Bug report submitted successfully!
Issue: https://github.com/EveryInc/compound-engineering-plugin/issues/[NUMBER]
Title: [compound-engineering] Bug: [description]
Thank you for helping improve the compound-engineering plugin!
The maintainer will review your report and respond as soon as possible.
```
## Error Handling
- If `gh` CLI is not authenticated: Prompt user to run `gh auth login` first
- If issue creation fails: Display the formatted report so user can manually create the issue
- If required information is missing: Re-prompt for that specific field
## Privacy Notice
This command does NOT collect:
- Personal information
- API keys or credentials
- Private code from your projects
- File paths beyond basic OS info
Only technical information about the bug is included in the report.

View File

@@ -0,0 +1,100 @@
---
name: reproduce-bug
description: Reproduce and investigate a bug using logs, console inspection, and browser screenshots
argument-hint: "[GitHub issue number]"
disable-model-invocation: true
---
# Reproduce Bug Command
Look at github issue #$ARGUMENTS and read the issue description and comments.
## Phase 1: Log Investigation
Run the following agents in parallel to investigate the bug:
1. Task rails-console-explorer(issue_description)
2. Task appsignal-log-investigator(issue_description)
Think about the places it could go wrong looking at the codebase. Look for logging output we can look for.
Run the agents again to find any logs that could help us reproduce the bug.
Keep running these agents until you have a good idea of what is going on.
## Phase 2: Visual Reproduction with Playwright
If the bug is UI-related or involves user flows, use Playwright to visually reproduce it:
### Step 1: Verify Server is Running
```
mcp__plugin_compound-engineering_pw__browser_navigate({ url: "http://localhost:3000" })
mcp__plugin_compound-engineering_pw__browser_snapshot({})
```
If server not running, inform user to start `bin/dev`.
### Step 2: Navigate to Affected Area
Based on the issue description, navigate to the relevant page:
```
mcp__plugin_compound-engineering_pw__browser_navigate({ url: "http://localhost:3000/[affected_route]" })
mcp__plugin_compound-engineering_pw__browser_snapshot({})
```
### Step 3: Capture Screenshots
Take screenshots at each step of reproducing the bug:
```
mcp__plugin_compound-engineering_pw__browser_take_screenshot({ filename: "bug-[issue]-step-1.png" })
```
### Step 4: Follow User Flow
Reproduce the exact steps from the issue:
1. **Read the issue's reproduction steps**
2. **Execute each step using Playwright:**
- `browser_click` for clicking elements
- `browser_type` for filling forms
- `browser_snapshot` to see the current state
- `browser_take_screenshot` to capture evidence
3. **Check for console errors:**
```
mcp__plugin_compound-engineering_pw__browser_console_messages({ level: "error" })
```
### Step 5: Capture Bug State
When you reproduce the bug:
1. Take a screenshot of the bug state
2. Capture console errors
3. Document the exact steps that triggered it
```
mcp__plugin_compound-engineering_pw__browser_take_screenshot({ filename: "bug-[issue]-reproduced.png" })
```
## Phase 3: Document Findings
**Reference Collection:**
- [ ] Document all research findings with specific file paths (e.g., `app/services/example_service.rb:42`)
- [ ] Include screenshots showing the bug reproduction
- [ ] List console errors if any
- [ ] Document the exact reproduction steps
## Phase 4: Report Back
Add a comment to the issue with:
1. **Findings** - What you discovered about the cause
2. **Reproduction Steps** - Exact steps to reproduce (verified)
3. **Screenshots** - Visual evidence of the bug (upload captured screenshots)
4. **Relevant Code** - File paths and line numbers
5. **Suggested Fix** - If you have one

View File

@@ -1,5 +1,5 @@
---
name: resolve_pr_parallel
name: resolve-pr-parallel
description: Resolve all PR comments using parallel processing. Use when addressing PR review feedback, resolving review threads, or batch-fixing PR comments.
argument-hint: "[optional: PR number or current PR]"
disable-model-invocation: true

View File

@@ -0,0 +1,35 @@
---
name: resolve_parallel
description: Resolve all TODO comments using parallel processing
argument-hint: "[optional: specific TODO pattern or file]"
disable-model-invocation: true
---
Resolve all TODO comments using parallel processing.
## Workflow
### 1. Analyze
Gather the things todo from above.
### 2. Plan
Create a TodoWrite list of all unresolved items grouped by type.Make sure to look at dependencies that might occur and prioritize the ones needed by others. For example, if you need to change a name, you must wait to do the others. Output a mermaid flow diagram showing how we can do this. Can we do everything in parallel? Do we need to do one first that leads to others in parallel? I'll put the to-dos in the mermaid diagram flowwise so the agent knows how to proceed in order.
### 3. Implement (PARALLEL)
Spawn a pr-comment-resolver agent for each unresolved item in parallel.
So if there are 3 comments, it will spawn 3 pr-comment-resolver agents in parallel. liek this
1. Task pr-comment-resolver(comment1)
2. Task pr-comment-resolver(comment2)
3. Task pr-comment-resolver(comment3)
Always run all in parallel subagents/Tasks for each Todo item.
### 4. Commit & Resolve
- Commit changes
- Push to remote

View File

@@ -0,0 +1,37 @@
---
name: resolve_todo_parallel
description: Resolve all pending CLI todos using parallel processing
argument-hint: "[optional: specific todo ID or pattern]"
---
Resolve all TODO comments using parallel processing.
## Workflow
### 1. Analyze
Get all unresolved TODOs from the /todos/\*.md directory
If any todo recommends deleting, removing, or gitignoring files in `docs/plans/` or `docs/solutions/`, skip it and mark it as `wont_fix`. These are compound-engineering pipeline artifacts that are intentional and permanent.
### 2. Plan
Create a TodoWrite list of all unresolved items grouped by type.Make sure to look at dependencies that might occur and prioritize the ones needed by others. For example, if you need to change a name, you must wait to do the others. Output a mermaid flow diagram showing how we can do this. Can we do everything in parallel? Do we need to do one first that leads to others in parallel? I'll put the to-dos in the mermaid diagram flowwise so the agent knows how to proceed in order.
### 3. Implement (PARALLEL)
Spawn a pr-comment-resolver agent for each unresolved item in parallel.
So if there are 3 comments, it will spawn 3 pr-comment-resolver agents in parallel. liek this
1. Task pr-comment-resolver(comment1)
2. Task pr-comment-resolver(comment2)
3. Task pr-comment-resolver(comment3)
Always run all in parallel subagents/Tasks for each Todo item.
### 4. Commit & Resolve
- Commit changes
- Remove the TODO from the file, and mark it as resolved.
- Push to remote

View File

@@ -6,7 +6,13 @@ disable-model-invocation: true
# Compound Engineering Setup
Interactive setup for `compound-engineering.local.md` — configures which agents run during `/workflows:review` and `/workflows:work`.
## Interaction Method
If `AskUserQuestion` is available, use it for all prompts below.
If not, present each question as a numbered list and wait for a reply before proceeding to the next step. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure.
Interactive setup for `compound-engineering.local.md` — configures which agents run during `/ce:review` and `/ce:work`.
## Step 1: Check Existing Config
@@ -145,7 +151,7 @@ plan_review_agents: [{computed plan agent list}]
# Review Context
Add project-specific review instructions here.
These notes are passed to all review agents during /workflows:review and /workflows:work.
These notes are passed to all review agents during /ce:review and /ce:work.
Examples:
- "We use Turbo Frames heavily — check for frame-busting issues"

View File

@@ -1,210 +0,0 @@
---
name: skill-creator
description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.
license: Complete terms in LICENSE.txt
disable-model-invocation: true
---
# Skill Creator
This skill provides guidance for creating effective skills.
## About Skills
Skills are modular, self-contained packages that extend Claude's capabilities by providing
specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
domains or tasks—they transform Claude from a general-purpose agent into a specialized agent
equipped with procedural knowledge that no model can fully possess.
### What Skills Provide
1. Specialized workflows - Multi-step procedures for specific domains
2. Tool integrations - Instructions for working with specific file formats or APIs
3. Domain expertise - Company-specific knowledge, schemas, business logic
4. Bundled resources - Scripts, references, and assets for complex and repetitive tasks
### Anatomy of a Skill
Every skill consists of a required SKILL.md file and optional bundled resources:
```
skill-name/
├── SKILL.md (required)
│ ├── YAML frontmatter metadata (required)
│ │ ├── name: (required)
│ │ └── description: (required)
│ └── Markdown instructions (required)
└── Bundled Resources (optional)
├── scripts/ - Executable code (Python/Bash/etc.)
├── references/ - Documentation intended to be loaded into context as needed
└── assets/ - Files used in output (templates, icons, fonts, etc.)
```
#### SKILL.md (required)
**Metadata Quality:** The `name` and `description` in YAML frontmatter determine when Claude will use the skill. Be specific about what the skill does and when to use it. Use the third-person (e.g. "This skill should be used when..." instead of "Use this skill when...").
#### Bundled Resources (optional)
##### Scripts (`scripts/`)
Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten.
- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed
- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks
- **Benefits**: Token efficient, deterministic, may be executed without loading into context
- **Note**: Scripts may still need to be read by Claude for patching or environment-specific adjustments
##### References (`references/`)
Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking.
- **When to include**: For documentation that Claude should reference while working
- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications
- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
- **Benefits**: Keeps SKILL.md lean, loaded only when Claude determines it's needed
- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md
- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
##### Assets (`assets/`)
Files not intended to be loaded into context, but rather used within the output Claude produces.
- **When to include**: When the skill needs files that will be used in the final output
- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography
- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
- **Benefits**: Separates output resources from documentation, enables Claude to use files without loading them into context
### Progressive Disclosure Design Principle
Skills use a three-level loading system to manage context efficiently:
1. **Metadata (name + description)** - Always in context (~100 words)
2. **SKILL.md body** - When skill triggers (<5k words)
3. **Bundled resources** - As needed by Claude (Unlimited*)
*Unlimited because scripts can be executed without reading into context window.
## Skill Creation Process
To create a skill, follow the "Skill Creation Process" in order, skipping steps only if there is a clear reason why they are not applicable.
### Step 1: Understanding the Skill with Concrete Examples
Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill.
To create an effective skill, clearly understand concrete examples of how the skill will be used. This understanding can come from either direct user examples or generated examples that are validated with user feedback.
For example, when building an image-editor skill, relevant questions include:
- "What functionality should the image-editor skill support? Editing, rotating, anything else?"
- "Can you give some examples of how this skill would be used?"
- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?"
- "What would a user say that should trigger this skill?"
To avoid overwhelming users, avoid asking too many questions in a single message. Start with the most important questions and follow up as needed for better effectiveness.
Conclude this step when there is a clear sense of the functionality the skill should support.
### Step 2: Planning the Reusable Skill Contents
To turn concrete examples into an effective skill, analyze each example by:
1. Considering how to execute on the example from scratch
2. Identifying what scripts, references, and assets would be helpful when executing these workflows repeatedly
Example: When building a `pdf-editor` skill to handle queries like "Help me rotate this PDF," the analysis shows:
1. Rotating a PDF requires re-writing the same code each time
2. A `scripts/rotate_pdf.py` script would be helpful to store in the skill
Example: When designing a `frontend-webapp-builder` skill for queries like "Build me a todo app" or "Build me a dashboard to track my steps," the analysis shows:
1. Writing a frontend webapp requires the same boilerplate HTML/React each time
2. An `assets/hello-world/` template containing the boilerplate HTML/React project files would be helpful to store in the skill
Example: When building a `big-query` skill to handle queries like "How many users have logged in today?" the analysis shows:
1. Querying BigQuery requires re-discovering the table schemas and relationships each time
2. A `references/schema.md` file documenting the table schemas would be helpful to store in the skill
To establish the skill's contents, analyze each concrete example to create a list of the reusable resources to include: scripts, references, and assets.
### Step 3: Initializing the Skill
At this point, it is time to actually create the skill.
Skip this step only if the skill being developed already exists, and iteration or packaging is needed. In this case, continue to the next step.
When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable.
Usage:
```bash
scripts/init_skill.py <skill-name> --path <output-directory>
```
The script:
- Creates the skill directory at the specified path
- Generates a SKILL.md template with proper frontmatter and TODO placeholders
- Creates example resource directories: `scripts/`, `references/`, and `assets/`
- Adds example files in each directory that can be customized or deleted
After initialization, customize or remove the generated SKILL.md and example files as needed.
### Step 4: Edit the Skill
When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Claude to use. Focus on including information that would be beneficial and non-obvious to Claude. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Claude instance execute these tasks more effectively.
#### Start with Reusable Skill Contents
To begin implementation, start with the reusable resources identified above: `scripts/`, `references/`, and `assets/` files. Note that this step may require user input. For example, when implementing a `brand-guidelines` skill, the user may need to provide brand assets or templates to store in `assets/`, or documentation to store in `references/`.
Also, delete any example files and directories not needed for the skill. The initialization script creates example files in `scripts/`, `references/`, and `assets/` to demonstrate structure, but most skills won't need all of them.
#### Update SKILL.md
**Writing Style:** Write the entire skill using **imperative/infinitive form** (verb-first instructions), not second person. Use objective, instructional language (e.g., "To accomplish X, do Y" rather than "You should do X" or "If you need to do X"). This maintains consistency and clarity for AI consumption.
To complete SKILL.md, answer the following questions:
1. What is the purpose of the skill, in a few sentences?
2. When should the skill be used?
3. In practice, how should Claude use the skill? All reusable skill contents developed above should be referenced so that Claude knows how to use them.
### Step 5: Packaging a Skill
Once the skill is ready, it should be packaged into a distributable zip file that gets shared with the user. The packaging process automatically validates the skill first to ensure it meets all requirements:
```bash
scripts/package_skill.py <path/to/skill-folder>
```
Optional output directory specification:
```bash
scripts/package_skill.py <path/to/skill-folder> ./dist
```
The packaging script will:
1. **Validate** the skill automatically, checking:
- YAML frontmatter format and required fields
- Skill naming conventions and directory structure
- Description completeness and quality
- File organization and resource references
2. **Package** the skill if validation passes, creating a zip file named after the skill (e.g., `my-skill.zip`) that includes all files and maintains the proper directory structure for distribution.
If validation fails, the script will report the errors and exit without creating a package. Fix any validation errors and run the packaging command again.
### Step 6: Iterate
After testing the skill, users may request improvements. Often this happens right after using the skill, with fresh context of how the skill performed.
**Iteration workflow:**
1. Use the skill on real tasks
2. Notice struggles or inefficiencies
3. Identify how SKILL.md or bundled resources should be updated
4. Implement changes and test again

View File

@@ -1,303 +0,0 @@
#!/usr/bin/env python3
"""
Skill Initializer - Creates a new skill from template
Usage:
init_skill.py <skill-name> --path <path>
Examples:
init_skill.py my-new-skill --path skills/public
init_skill.py my-api-helper --path skills/private
init_skill.py custom-skill --path /custom/location
"""
import sys
from pathlib import Path
SKILL_TEMPLATE = """---
name: {skill_name}
description: [TODO: Complete and informative explanation of what the skill does and when to use it. Include WHEN to use this skill - specific scenarios, file types, or tasks that trigger it.]
---
# {skill_title}
## Overview
[TODO: 1-2 sentences explaining what this skill enables]
## Structuring This Skill
[TODO: Choose the structure that best fits this skill's purpose. Common patterns:
**1. Workflow-Based** (best for sequential processes)
- Works well when there are clear step-by-step procedures
- Example: DOCX skill with "Workflow Decision Tree""Reading""Creating""Editing"
- Structure: ## Overview → ## Workflow Decision Tree → ## Step 1 → ## Step 2...
**2. Task-Based** (best for tool collections)
- Works well when the skill offers different operations/capabilities
- Example: PDF skill with "Quick Start""Merge PDFs""Split PDFs""Extract Text"
- Structure: ## Overview → ## Quick Start → ## Task Category 1 → ## Task Category 2...
**3. Reference/Guidelines** (best for standards or specifications)
- Works well for brand guidelines, coding standards, or requirements
- Example: Brand styling with "Brand Guidelines""Colors""Typography""Features"
- Structure: ## Overview → ## Guidelines → ## Specifications → ## Usage...
**4. Capabilities-Based** (best for integrated systems)
- Works well when the skill provides multiple interrelated features
- Example: Product Management with "Core Capabilities" → numbered capability list
- Structure: ## Overview → ## Core Capabilities → ### 1. Feature → ### 2. Feature...
Patterns can be mixed and matched as needed. Most skills combine patterns (e.g., start with task-based, add workflow for complex operations).
Delete this entire "Structuring This Skill" section when done - it's just guidance.]
## [TODO: Replace with the first main section based on chosen structure]
[TODO: Add content here. See examples in existing skills:
- Code samples for technical skills
- Decision trees for complex workflows
- Concrete examples with realistic user requests
- References to scripts/templates/references as needed]
## Resources
This skill includes example resource directories that demonstrate how to organize different types of bundled resources:
### scripts/
Executable code (Python/Bash/etc.) that can be run directly to perform specific operations.
**Examples from other skills:**
- PDF skill: `fill_fillable_fields.py`, `extract_form_field_info.py` - utilities for PDF manipulation
- DOCX skill: `document.py`, `utilities.py` - Python modules for document processing
**Appropriate for:** Python scripts, shell scripts, or any executable code that performs automation, data processing, or specific operations.
**Note:** Scripts may be executed without loading into context, but can still be read by Claude for patching or environment adjustments.
### references/
Documentation and reference material intended to be loaded into context to inform Claude's process and thinking.
**Examples from other skills:**
- Product management: `communication.md`, `context_building.md` - detailed workflow guides
- BigQuery: API reference documentation and query examples
- Finance: Schema documentation, company policies
**Appropriate for:** In-depth documentation, API references, database schemas, comprehensive guides, or any detailed information that Claude should reference while working.
### assets/
Files not intended to be loaded into context, but rather used within the output Claude produces.
**Examples from other skills:**
- Brand styling: PowerPoint template files (.pptx), logo files
- Frontend builder: HTML/React boilerplate project directories
- Typography: Font files (.ttf, .woff2)
**Appropriate for:** Templates, boilerplate code, document templates, images, icons, fonts, or any files meant to be copied or used in the final output.
---
**Any unneeded directories can be deleted.** Not every skill requires all three types of resources.
"""
EXAMPLE_SCRIPT = '''#!/usr/bin/env python3
"""
Example helper script for {skill_name}
This is a placeholder script that can be executed directly.
Replace with actual implementation or delete if not needed.
Example real scripts from other skills:
- pdf/scripts/fill_fillable_fields.py - Fills PDF form fields
- pdf/scripts/convert_pdf_to_images.py - Converts PDF pages to images
"""
def main():
print("This is an example script for {skill_name}")
# TODO: Add actual script logic here
# This could be data processing, file conversion, API calls, etc.
if __name__ == "__main__":
main()
'''
EXAMPLE_REFERENCE = """# Reference Documentation for {skill_title}
This is a placeholder for detailed reference documentation.
Replace with actual reference content or delete if not needed.
Example real reference docs from other skills:
- product-management/references/communication.md - Comprehensive guide for status updates
- product-management/references/context_building.md - Deep-dive on gathering context
- bigquery/references/ - API references and query examples
## When Reference Docs Are Useful
Reference docs are ideal for:
- Comprehensive API documentation
- Detailed workflow guides
- Complex multi-step processes
- Information too lengthy for main SKILL.md
- Content that's only needed for specific use cases
## Structure Suggestions
### API Reference Example
- Overview
- Authentication
- Endpoints with examples
- Error codes
- Rate limits
### Workflow Guide Example
- Prerequisites
- Step-by-step instructions
- Common patterns
- Troubleshooting
- Best practices
"""
EXAMPLE_ASSET = """# Example Asset File
This placeholder represents where asset files would be stored.
Replace with actual asset files (templates, images, fonts, etc.) or delete if not needed.
Asset files are NOT intended to be loaded into context, but rather used within
the output Claude produces.
Example asset files from other skills:
- Brand guidelines: logo.png, slides_template.pptx
- Frontend builder: hello-world/ directory with HTML/React boilerplate
- Typography: custom-font.ttf, font-family.woff2
- Data: sample_data.csv, test_dataset.json
## Common Asset Types
- Templates: .pptx, .docx, boilerplate directories
- Images: .png, .jpg, .svg, .gif
- Fonts: .ttf, .otf, .woff, .woff2
- Boilerplate code: Project directories, starter files
- Icons: .ico, .svg
- Data files: .csv, .json, .xml, .yaml
Note: This is a text placeholder. Actual assets can be any file type.
"""
def title_case_skill_name(skill_name):
"""Convert hyphenated skill name to Title Case for display."""
return ' '.join(word.capitalize() for word in skill_name.split('-'))
def init_skill(skill_name, path):
"""
Initialize a new skill directory with template SKILL.md.
Args:
skill_name: Name of the skill
path: Path where the skill directory should be created
Returns:
Path to created skill directory, or None if error
"""
# Determine skill directory path
skill_dir = Path(path).resolve() / skill_name
# Check if directory already exists
if skill_dir.exists():
print(f"❌ Error: Skill directory already exists: {skill_dir}")
return None
# Create skill directory
try:
skill_dir.mkdir(parents=True, exist_ok=False)
print(f"✅ Created skill directory: {skill_dir}")
except Exception as e:
print(f"❌ Error creating directory: {e}")
return None
# Create SKILL.md from template
skill_title = title_case_skill_name(skill_name)
skill_content = SKILL_TEMPLATE.format(
skill_name=skill_name,
skill_title=skill_title
)
skill_md_path = skill_dir / 'SKILL.md'
try:
skill_md_path.write_text(skill_content)
print("✅ Created SKILL.md")
except Exception as e:
print(f"❌ Error creating SKILL.md: {e}")
return None
# Create resource directories with example files
try:
# Create scripts/ directory with example script
scripts_dir = skill_dir / 'scripts'
scripts_dir.mkdir(exist_ok=True)
example_script = scripts_dir / 'example.py'
example_script.write_text(EXAMPLE_SCRIPT.format(skill_name=skill_name))
example_script.chmod(0o755)
print("✅ Created scripts/example.py")
# Create references/ directory with example reference doc
references_dir = skill_dir / 'references'
references_dir.mkdir(exist_ok=True)
example_reference = references_dir / 'api_reference.md'
example_reference.write_text(EXAMPLE_REFERENCE.format(skill_title=skill_title))
print("✅ Created references/api_reference.md")
# Create assets/ directory with example asset placeholder
assets_dir = skill_dir / 'assets'
assets_dir.mkdir(exist_ok=True)
example_asset = assets_dir / 'example_asset.txt'
example_asset.write_text(EXAMPLE_ASSET)
print("✅ Created assets/example_asset.txt")
except Exception as e:
print(f"❌ Error creating resource directories: {e}")
return None
# Print next steps
print(f"\n✅ Skill '{skill_name}' initialized successfully at {skill_dir}")
print("\nNext steps:")
print("1. Edit SKILL.md to complete the TODO items and update the description")
print("2. Customize or delete the example files in scripts/, references/, and assets/")
print("3. Run the validator when ready to check the skill structure")
return skill_dir
def main():
if len(sys.argv) < 4 or sys.argv[2] != '--path':
print("Usage: init_skill.py <skill-name> --path <path>")
print("\nSkill name requirements:")
print(" - Hyphen-case identifier (e.g., 'data-analyzer')")
print(" - Lowercase letters, digits, and hyphens only")
print(" - Max 40 characters")
print(" - Must match directory name exactly")
print("\nExamples:")
print(" init_skill.py my-new-skill --path skills/public")
print(" init_skill.py my-api-helper --path skills/private")
print(" init_skill.py custom-skill --path /custom/location")
sys.exit(1)
skill_name = sys.argv[1]
path = sys.argv[3]
print(f"🚀 Initializing skill: {skill_name}")
print(f" Location: {path}")
print()
result = init_skill(skill_name, path)
if result:
sys.exit(0)
else:
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -1,110 +0,0 @@
#!/usr/bin/env python3
"""
Skill Packager - Creates a distributable zip file of a skill folder
Usage:
python utils/package_skill.py <path/to/skill-folder> [output-directory]
Example:
python utils/package_skill.py skills/public/my-skill
python utils/package_skill.py skills/public/my-skill ./dist
"""
import sys
import zipfile
from pathlib import Path
from quick_validate import validate_skill
def package_skill(skill_path, output_dir=None):
"""
Package a skill folder into a zip file.
Args:
skill_path: Path to the skill folder
output_dir: Optional output directory for the zip file (defaults to current directory)
Returns:
Path to the created zip file, or None if error
"""
skill_path = Path(skill_path).resolve()
# Validate skill folder exists
if not skill_path.exists():
print(f"❌ Error: Skill folder not found: {skill_path}")
return None
if not skill_path.is_dir():
print(f"❌ Error: Path is not a directory: {skill_path}")
return None
# Validate SKILL.md exists
skill_md = skill_path / "SKILL.md"
if not skill_md.exists():
print(f"❌ Error: SKILL.md not found in {skill_path}")
return None
# Run validation before packaging
print("🔍 Validating skill...")
valid, message = validate_skill(skill_path)
if not valid:
print(f"❌ Validation failed: {message}")
print(" Please fix the validation errors before packaging.")
return None
print(f"{message}\n")
# Determine output location
skill_name = skill_path.name
if output_dir:
output_path = Path(output_dir).resolve()
output_path.mkdir(parents=True, exist_ok=True)
else:
output_path = Path.cwd()
zip_filename = output_path / f"{skill_name}.zip"
# Create the zip file
try:
with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
# Walk through the skill directory
for file_path in skill_path.rglob('*'):
if file_path.is_file():
# Calculate the relative path within the zip
arcname = file_path.relative_to(skill_path.parent)
zipf.write(file_path, arcname)
print(f" Added: {arcname}")
print(f"\n✅ Successfully packaged skill to: {zip_filename}")
return zip_filename
except Exception as e:
print(f"❌ Error creating zip file: {e}")
return None
def main():
if len(sys.argv) < 2:
print("Usage: python utils/package_skill.py <path/to/skill-folder> [output-directory]")
print("\nExample:")
print(" python utils/package_skill.py skills/public/my-skill")
print(" python utils/package_skill.py skills/public/my-skill ./dist")
sys.exit(1)
skill_path = sys.argv[1]
output_dir = sys.argv[2] if len(sys.argv) > 2 else None
print(f"📦 Packaging skill: {skill_path}")
if output_dir:
print(f" Output directory: {output_dir}")
print()
result = package_skill(skill_path, output_dir)
if result:
sys.exit(0)
else:
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -1,65 +0,0 @@
#!/usr/bin/env python3
"""
Quick validation script for skills - minimal version
"""
import sys
import os
import re
from pathlib import Path
def validate_skill(skill_path):
"""Basic validation of a skill"""
skill_path = Path(skill_path)
# Check SKILL.md exists
skill_md = skill_path / 'SKILL.md'
if not skill_md.exists():
return False, "SKILL.md not found"
# Read and validate frontmatter
content = skill_md.read_text()
if not content.startswith('---'):
return False, "No YAML frontmatter found"
# Extract frontmatter
match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
if not match:
return False, "Invalid frontmatter format"
frontmatter = match.group(1)
# Check required fields
if 'name:' not in frontmatter:
return False, "Missing 'name' in frontmatter"
if 'description:' not in frontmatter:
return False, "Missing 'description' in frontmatter"
# Extract name for validation
name_match = re.search(r'name:\s*(.+)', frontmatter)
if name_match:
name = name_match.group(1).strip()
# Check naming convention (hyphen-case: lowercase with hyphens)
if not re.match(r'^[a-z0-9-]+$', name):
return False, f"Name '{name}' should be hyphen-case (lowercase letters, digits, and hyphens only)"
if name.startswith('-') or name.endswith('-') or '--' in name:
return False, f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens"
# Extract and validate description
desc_match = re.search(r'description:\s*(.+)', frontmatter)
if desc_match:
description = desc_match.group(1).strip()
# Check for angle brackets
if '<' in description or '>' in description:
return False, "Description cannot contain angle brackets (< or >)"
return True, "Skill is valid!"
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: python quick_validate.py <skill_directory>")
sys.exit(1)
valid, message = validate_skill(sys.argv[1])
print(message)
sys.exit(0 if valid else 1)

View File

@@ -0,0 +1,32 @@
---
name: slfg
description: Full autonomous engineering workflow using swarm mode for parallel execution
argument-hint: "[feature description]"
disable-model-invocation: true
---
Swarm-enabled LFG. Run these steps in order, parallelizing where indicated. Do not stop between steps — complete every step through to the end.
## Sequential Phase
1. **Optional:** If the `ralph-wiggum` skill is available, run `/ralph-wiggum:ralph-loop "finish all slash commands" --completion-promise "DONE"`. If not available or it fails, skip and continue to step 2 immediately.
2. `/ce:plan $ARGUMENTS`
3. `/compound-engineering:deepen-plan`
4. `/ce:work`**Use swarm mode**: Make a Task list and launch an army of agent swarm subagents to build the plan
## Parallel Phase
After work completes, launch steps 5 and 6 as **parallel swarm agents** (both only need code to be written):
5. `/ce:review` — spawn as background Task agent
6. `/compound-engineering:test-browser` — spawn as background Task agent
Wait for both to complete before continuing.
## Finalize Phase
7. `/compound-engineering:resolve_todo_parallel` — resolve any findings from the review
8. `/compound-engineering:feature-video` — record the final walkthrough and add to PR
9. Output `<promise>DONE</promise>` when video is in PR
Start with step 1 now.

View File

@@ -0,0 +1,393 @@
---
name: test-browser
description: Run browser tests on pages affected by current PR or branch
argument-hint: "[PR number, branch name, 'current', or --port PORT]"
---
# Browser Test Command
<command_purpose>Run end-to-end browser tests on pages affected by a PR or branch changes using agent-browser CLI.</command_purpose>
## CRITICAL: Use agent-browser CLI Only
**DO NOT use Chrome MCP tools (mcp__claude-in-chrome__*).**
This command uses the `agent-browser` CLI exclusively. The agent-browser CLI is a Bash-based tool from Vercel that runs headless Chromium. It is NOT the same as Chrome browser automation via MCP.
If you find yourself calling `mcp__claude-in-chrome__*` tools, STOP. Use `agent-browser` Bash commands instead.
## Introduction
<role>QA Engineer specializing in browser-based end-to-end testing</role>
This command tests affected pages in a real browser, catching issues that unit tests miss:
- JavaScript integration bugs
- CSS/layout regressions
- User workflow breakages
- Console errors
## Prerequisites
<requirements>
- Local development server running (e.g., `bin/dev`, `rails server`, `npm run dev`)
- agent-browser CLI installed (see Setup below)
- Git repository with changes to test
</requirements>
## Setup
**Check installation:**
```bash
command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED"
```
**Install if needed:**
```bash
npm install -g agent-browser
agent-browser install # Downloads Chromium (~160MB)
```
See the `agent-browser` skill for detailed usage.
## Main Tasks
### 0. Verify agent-browser Installation
Before starting ANY browser testing, verify agent-browser is installed:
```bash
command -v agent-browser >/dev/null 2>&1 && echo "Ready" || (echo "Installing..." && npm install -g agent-browser && agent-browser install)
```
If installation fails, inform the user and stop.
### 1. Ask Browser Mode
<ask_browser_mode>
Before starting tests, ask user if they want to watch the browser:
Use AskUserQuestion with:
- Question: "Do you want to watch the browser tests run?"
- Options:
1. **Headed (watch)** - Opens visible browser window so you can see tests run
2. **Headless (faster)** - Runs in background, faster but invisible
Store the choice and use `--headed` flag when user selects "Headed".
</ask_browser_mode>
### 2. Determine Test Scope
<test_target> $ARGUMENTS </test_target>
<determine_scope>
**If PR number provided:**
```bash
gh pr view [number] --json files -q '.files[].path'
```
**If 'current' or empty:**
```bash
git diff --name-only main...HEAD
```
**If branch name provided:**
```bash
git diff --name-only main...[branch]
```
</determine_scope>
### 3. Map Files to Routes
<file_to_route_mapping>
Map changed files to testable routes:
| File Pattern | Route(s) |
|-------------|----------|
| `app/views/users/*` | `/users`, `/users/:id`, `/users/new` |
| `app/controllers/settings_controller.rb` | `/settings` |
| `app/javascript/controllers/*_controller.js` | Pages using that Stimulus controller |
| `app/components/*_component.rb` | Pages rendering that component |
| `app/views/layouts/*` | All pages (test homepage at minimum) |
| `app/assets/stylesheets/*` | Visual regression on key pages |
| `app/helpers/*_helper.rb` | Pages using that helper |
| `src/app/*` (Next.js) | Corresponding routes |
| `src/components/*` | Pages using those components |
Build a list of URLs to test based on the mapping.
</file_to_route_mapping>
### 4. Detect Dev Server Port
<detect_port>
Determine the dev server port using this priority order:
**Priority 1: Explicit argument**
If the user passed a port number (e.g., `/test-browser 5000` or `/test-browser --port 5000`), use that port directly.
**Priority 2: CLAUDE.md / project instructions**
```bash
# Check CLAUDE.md for port references
grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' CLAUDE.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1
```
**Priority 3: package.json scripts**
```bash
# Check dev/start scripts for --port flags
grep -Eo '\-\-port[= ]+[0-9]{4,5}' package.json 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1
```
**Priority 4: Environment files**
```bash
# Check .env, .env.local, .env.development for PORT=
grep -h '^PORT=' .env .env.local .env.development 2>/dev/null | tail -1 | cut -d= -f2
```
**Priority 5: Default fallback**
If none of the above yields a port, default to `3000`.
Store the result in a `PORT` variable for use in all subsequent steps.
```bash
# Combined detection (run this)
PORT="${EXPLICIT_PORT:-}"
if [ -z "$PORT" ]; then
PORT=$(grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' CLAUDE.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1)
fi
if [ -z "$PORT" ]; then
PORT=$(grep -Eo '\-\-port[= ]+[0-9]{4,5}' package.json 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1)
fi
if [ -z "$PORT" ]; then
PORT=$(grep -h '^PORT=' .env .env.local .env.development 2>/dev/null | tail -1 | cut -d= -f2)
fi
PORT="${PORT:-3000}"
echo "Using dev server port: $PORT"
```
</detect_port>
### 5. Verify Server is Running
<check_server>
Before testing, verify the local server is accessible using the detected port:
```bash
agent-browser open http://localhost:${PORT}
agent-browser snapshot -i
```
If server is not running, inform user:
```markdown
**Server not running on port ${PORT}**
Please start your development server:
- Rails: `bin/dev` or `rails server`
- Node/Next.js: `npm run dev`
- Custom port: `/test-browser --port <your-port>`
Then run `/test-browser` again.
```
</check_server>
### 6. Test Each Affected Page
<test_pages>
For each affected route, use agent-browser CLI commands (NOT Chrome MCP):
**Step 1: Navigate and capture snapshot**
```bash
agent-browser open "http://localhost:${PORT}/[route]"
agent-browser snapshot -i
```
**Step 2: For headed mode (visual debugging)**
```bash
agent-browser --headed open "http://localhost:${PORT}/[route]"
agent-browser --headed snapshot -i
```
**Step 3: Verify key elements**
- Use `agent-browser snapshot -i` to get interactive elements with refs
- Page title/heading present
- Primary content rendered
- No error messages visible
- Forms have expected fields
**Step 4: Test critical interactions**
```bash
agent-browser click @e1 # Use ref from snapshot
agent-browser snapshot -i
```
**Step 5: Take screenshots**
```bash
agent-browser screenshot page-name.png
agent-browser screenshot --full page-name-full.png # Full page
```
</test_pages>
### 7. Human Verification (When Required)
<human_verification>
Pause for human input when testing touches:
| Flow Type | What to Ask |
|-----------|-------------|
| OAuth | "Please sign in with [provider] and confirm it works" |
| Email | "Check your inbox for the test email and confirm receipt" |
| Payments | "Complete a test purchase in sandbox mode" |
| SMS | "Verify you received the SMS code" |
| External APIs | "Confirm the [service] integration is working" |
Use AskUserQuestion:
```markdown
**Human Verification Needed**
This test touches the [flow type]. Please:
1. [Action to take]
2. [What to verify]
Did it work correctly?
1. Yes - continue testing
2. No - describe the issue
```
</human_verification>
### 8. Handle Failures
<failure_handling>
When a test fails:
1. **Document the failure:**
- Screenshot the error state: `agent-browser screenshot error.png`
- Note the exact reproduction steps
2. **Ask user how to proceed:**
```markdown
**Test Failed: [route]**
Issue: [description]
Console errors: [if any]
How to proceed?
1. Fix now - I'll help debug and fix
2. Create todo - Add to todos/ for later
3. Skip - Continue testing other pages
```
3. **If "Fix now":**
- Investigate the issue
- Propose a fix
- Apply fix
- Re-run the failing test
4. **If "Create todo":**
- Create `{id}-pending-p1-browser-test-{description}.md`
- Continue testing
5. **If "Skip":**
- Log as skipped
- Continue testing
</failure_handling>
### 9. Test Summary
<test_summary>
After all tests complete, present summary:
```markdown
## Browser Test Results
**Test Scope:** PR #[number] / [branch name]
**Server:** http://localhost:${PORT}
### Pages Tested: [count]
| Route | Status | Notes |
|-------|--------|-------|
| `/users` | Pass | |
| `/settings` | Pass | |
| `/dashboard` | Fail | Console error: [msg] |
| `/checkout` | Skip | Requires payment credentials |
### Console Errors: [count]
- [List any errors found]
### Human Verifications: [count]
- OAuth flow: Confirmed
- Email delivery: Confirmed
### Failures: [count]
- `/dashboard` - [issue description]
### Created Todos: [count]
- `005-pending-p1-browser-test-dashboard-error.md`
### Result: [PASS / FAIL / PARTIAL]
```
</test_summary>
## Quick Usage Examples
```bash
# Test current branch changes (auto-detects port)
/test-browser
# Test specific PR
/test-browser 847
# Test specific branch
/test-browser feature/new-dashboard
# Test on a specific port
/test-browser --port 5000
```
## agent-browser CLI Reference
**ALWAYS use these Bash commands. NEVER use mcp__claude-in-chrome__* tools.**
```bash
# Navigation
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser close # Close browser
# Snapshots (get element refs)
agent-browser snapshot -i # Interactive elements with refs (@e1, @e2, etc.)
agent-browser snapshot -i --json # JSON output
# Interactions (use refs from snapshot)
agent-browser click @e1 # Click element
agent-browser fill @e1 "text" # Fill input
agent-browser type @e1 "text" # Type without clearing
agent-browser press Enter # Press key
# Screenshots
agent-browser screenshot out.png # Viewport screenshot
agent-browser screenshot --full out.png # Full page screenshot
# Headed mode (visible browser)
agent-browser --headed open <url> # Open with visible browser
agent-browser --headed click @e1 # Click in visible browser
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
```

View File

@@ -0,0 +1,332 @@
---
name: test-xcode
description: Build and test iOS apps on simulator using XcodeBuildMCP
argument-hint: "[scheme name or 'current' to use default]"
disable-model-invocation: true
---
# Xcode Test Command
<command_purpose>Build, install, and test iOS apps on the simulator using XcodeBuildMCP. Captures screenshots, logs, and verifies app behavior.</command_purpose>
## Introduction
<role>iOS QA Engineer specializing in simulator-based testing</role>
This command tests iOS/macOS apps by:
- Building for simulator
- Installing and launching the app
- Taking screenshots of key screens
- Capturing console logs for errors
- Supporting human verification for external flows
## Prerequisites
<requirements>
- Xcode installed with command-line tools
- XcodeBuildMCP server connected
- Valid Xcode project or workspace
- At least one iOS Simulator available
</requirements>
## Main Tasks
### 0. Verify XcodeBuildMCP is Installed
<check_mcp_installed>
**First, check if XcodeBuildMCP tools are available.**
Try calling:
```
mcp__xcodebuildmcp__list_simulators({})
```
**If the tool is not found or errors:**
Tell the user:
```markdown
**XcodeBuildMCP not installed**
Please install the XcodeBuildMCP server first:
\`\`\`bash
claude mcp add XcodeBuildMCP -- npx xcodebuildmcp@latest
\`\`\`
Then restart Claude Code and run `/xcode-test` again.
```
**Do NOT proceed** until XcodeBuildMCP is confirmed working.
</check_mcp_installed>
### 1. Discover Project and Scheme
<discover_project>
**Find available projects:**
```
mcp__xcodebuildmcp__discover_projs({})
```
**List schemes for the project:**
```
mcp__xcodebuildmcp__list_schemes({ project_path: "/path/to/Project.xcodeproj" })
```
**If argument provided:**
- Use the specified scheme name
- Or "current" to use the default/last-used scheme
</discover_project>
### 2. Boot Simulator
<boot_simulator>
**List available simulators:**
```
mcp__xcodebuildmcp__list_simulators({})
```
**Boot preferred simulator (iPhone 15 Pro recommended):**
```
mcp__xcodebuildmcp__boot_simulator({ simulator_id: "[uuid]" })
```
**Wait for simulator to be ready:**
Check simulator state before proceeding with installation.
</boot_simulator>
### 3. Build the App
<build_app>
**Build for iOS Simulator:**
```
mcp__xcodebuildmcp__build_ios_sim_app({
project_path: "/path/to/Project.xcodeproj",
scheme: "[scheme_name]"
})
```
**Handle build failures:**
- Capture build errors
- Create P1 todo for each build error
- Report to user with specific error details
**On success:**
- Note the built app path for installation
- Proceed to installation step
</build_app>
### 4. Install and Launch
<install_launch>
**Install app on simulator:**
```
mcp__xcodebuildmcp__install_app_on_simulator({
app_path: "/path/to/built/App.app",
simulator_id: "[uuid]"
})
```
**Launch the app:**
```
mcp__xcodebuildmcp__launch_app_on_simulator({
bundle_id: "[app.bundle.id]",
simulator_id: "[uuid]"
})
```
**Start capturing logs:**
```
mcp__xcodebuildmcp__capture_sim_logs({
simulator_id: "[uuid]",
bundle_id: "[app.bundle.id]"
})
```
</install_launch>
### 5. Test Key Screens
<test_screens>
For each key screen in the app:
**Take screenshot:**
```
mcp__xcodebuildmcp__take_screenshot({
simulator_id: "[uuid]",
filename: "screen-[name].png"
})
```
**Review screenshot for:**
- UI elements rendered correctly
- No error messages visible
- Expected content displayed
- Layout looks correct
**Check logs for errors:**
```
mcp__xcodebuildmcp__get_sim_logs({ simulator_id: "[uuid]" })
```
Look for:
- Crashes
- Exceptions
- Error-level log messages
- Failed network requests
</test_screens>
### 6. Human Verification (When Required)
<human_verification>
Pause for human input when testing touches:
| Flow Type | What to Ask |
|-----------|-------------|
| Sign in with Apple | "Please complete Sign in with Apple on the simulator" |
| Push notifications | "Send a test push and confirm it appears" |
| In-app purchases | "Complete a sandbox purchase" |
| Camera/Photos | "Grant permissions and verify camera works" |
| Location | "Allow location access and verify map updates" |
Use AskUserQuestion:
```markdown
**Human Verification Needed**
This test requires [flow type]. Please:
1. [Action to take on simulator]
2. [What to verify]
Did it work correctly?
1. Yes - continue testing
2. No - describe the issue
```
</human_verification>
### 7. Handle Failures
<failure_handling>
When a test fails:
1. **Document the failure:**
- Take screenshot of error state
- Capture console logs
- Note reproduction steps
2. **Ask user how to proceed:**
```markdown
**Test Failed: [screen/feature]**
Issue: [description]
Logs: [relevant error messages]
How to proceed?
1. Fix now - I'll help debug and fix
2. Create todo - Add to todos/ for later
3. Skip - Continue testing other screens
```
3. **If "Fix now":**
- Investigate the issue in code
- Propose a fix
- Rebuild and retest
4. **If "Create todo":**
- Create `{id}-pending-p1-xcode-{description}.md`
- Continue testing
</failure_handling>
### 8. Test Summary
<test_summary>
After all tests complete, present summary:
```markdown
## 📱 Xcode Test Results
**Project:** [project name]
**Scheme:** [scheme name]
**Simulator:** [simulator name]
### Build: ✅ Success / ❌ Failed
### Screens Tested: [count]
| Screen | Status | Notes |
|--------|--------|-------|
| Launch | ✅ Pass | |
| Home | ✅ Pass | |
| Settings | ❌ Fail | Crash on tap |
| Profile | ⏭️ Skip | Requires login |
### Console Errors: [count]
- [List any errors found]
### Human Verifications: [count]
- Sign in with Apple: ✅ Confirmed
- Push notifications: ✅ Confirmed
### Failures: [count]
- Settings screen - crash on navigation
### Created Todos: [count]
- `006-pending-p1-xcode-settings-crash.md`
### Result: [PASS / FAIL / PARTIAL]
```
</test_summary>
### 9. Cleanup
<cleanup>
After testing:
**Stop log capture:**
```
mcp__xcodebuildmcp__stop_log_capture({ simulator_id: "[uuid]" })
```
**Optionally shut down simulator:**
```
mcp__xcodebuildmcp__shutdown_simulator({ simulator_id: "[uuid]" })
```
</cleanup>
## Quick Usage Examples
```bash
# Test with default scheme
/xcode-test
# Test specific scheme
/xcode-test MyApp-Debug
# Test after making changes
/xcode-test current
```
## Integration with /ce:review
When reviewing PRs that touch iOS code, the `/ce:review` command can spawn this as a subagent:
```
Task general-purpose("Run /xcode-test for scheme [name]. Build, install on simulator, test key screens, check for crashes.")
```

View File

@@ -0,0 +1,311 @@
---
name: triage
description: Triage and categorize findings for the CLI todo system
argument-hint: "[findings list or source type]"
disable-model-invocation: true
---
- First set the /model to Haiku
- Then read all pending todos in the todos/ directory
Present all findings, decisions, or issues here one by one for triage. The goal is to go through each item and decide whether to add it to the CLI todo system.
**IMPORTANT: DO NOT CODE ANYTHING DURING TRIAGE!**
This command is for:
- Triaging code review findings
- Processing security audit results
- Reviewing performance analysis
- Handling any other categorized findings that need tracking
## Workflow
### Step 1: Present Each Finding
For each finding, present in this format:
```
---
Issue #X: [Brief Title]
Severity: 🔴 P1 (CRITICAL) / 🟡 P2 (IMPORTANT) / 🔵 P3 (NICE-TO-HAVE)
Category: [Security/Performance/Architecture/Bug/Feature/etc.]
Description:
[Detailed explanation of the issue or improvement]
Location: [file_path:line_number]
Problem Scenario:
[Step by step what's wrong or could happen]
Proposed Solution:
[How to fix it]
Estimated Effort: [Small (< 2 hours) / Medium (2-8 hours) / Large (> 8 hours)]
---
Do you want to add this to the todo list?
1. yes - create todo file
2. next - skip this item
3. custom - modify before creating
```
### Step 2: Handle User Decision
**When user says "yes":**
1. **Update existing todo file** (if it exists) or **Create new filename:**
If todo already exists (from code review):
- Rename file from `{id}-pending-{priority}-{desc}.md``{id}-ready-{priority}-{desc}.md`
- Update YAML frontmatter: `status: pending``status: ready`
- Keep issue_id, priority, and description unchanged
If creating new todo:
```
{next_id}-ready-{priority}-{brief-description}.md
```
Priority mapping:
- 🔴 P1 (CRITICAL) → `p1`
- 🟡 P2 (IMPORTANT) → `p2`
- 🔵 P3 (NICE-TO-HAVE) → `p3`
Example: `042-ready-p1-transaction-boundaries.md`
2. **Update YAML frontmatter:**
```yaml
---
status: ready # IMPORTANT: Change from "pending" to "ready"
priority: p1 # or p2, p3 based on severity
issue_id: "042"
tags: [category, relevant-tags]
dependencies: []
---
```
3. **Populate or update the file:**
```yaml
# [Issue Title]
## Problem Statement
[Description from finding]
## Findings
- [Key discoveries]
- Location: [file_path:line_number]
- [Scenario details]
## Proposed Solutions
### Option 1: [Primary solution]
- **Pros**: [Benefits]
- **Cons**: [Drawbacks if any]
- **Effort**: [Small/Medium/Large]
- **Risk**: [Low/Medium/High]
## Recommended Action
[Filled during triage - specific action plan]
## Technical Details
- **Affected Files**: [List files]
- **Related Components**: [Components affected]
- **Database Changes**: [Yes/No - describe if yes]
## Resources
- Original finding: [Source of this issue]
- Related issues: [If any]
## Acceptance Criteria
- [ ] [Specific success criteria]
- [ ] Tests pass
- [ ] Code reviewed
## Work Log
### {date} - Approved for Work
**By:** Claude Triage System
**Actions:**
- Issue approved during triage session
- Status changed from pending → ready
- Ready to be picked up and worked on
**Learnings:**
- [Context and insights]
## Notes
Source: Triage session on {date}
```
4. **Confirm approval:** "✅ Approved: `{new_filename}` (Issue #{issue_id}) - Status: **ready** → Ready to work on"
**When user says "next":**
- **Delete the todo file** - Remove it from todos/ directory since it's not relevant
- Skip to the next item
- Track skipped items for summary
**When user says "custom":**
- Ask what to modify (priority, description, details)
- Update the information
- Present revised version
- Ask again: yes/next/custom
### Step 3: Continue Until All Processed
- Process all items one by one
- Track using TodoWrite for visibility
- Don't wait for approval between items - keep moving
### Step 4: Final Summary
After all items processed:
````markdown
## Triage Complete
**Total Items:** [X] **Todos Approved (ready):** [Y] **Skipped:** [Z]
### Approved Todos (Ready for Work):
- `042-ready-p1-transaction-boundaries.md` - Transaction boundary issue
- `043-ready-p2-cache-optimization.md` - Cache performance improvement ...
### Skipped Items (Deleted):
- Item #5: [reason] - Removed from todos/
- Item #12: [reason] - Removed from todos/
### Summary of Changes Made:
During triage, the following status updates occurred:
- **Pending → Ready:** Filenames and frontmatter updated to reflect approved status
- **Deleted:** Todo files for skipped findings removed from todos/ directory
- Each approved file now has `status: ready` in YAML frontmatter
### Next Steps:
1. View approved todos ready for work:
```bash
ls todos/*-ready-*.md
```
````
2. Start work on approved items:
```bash
/resolve_todo_parallel # Work on multiple approved items efficiently
```
3. Or pick individual items to work on
4. As you work, update todo status:
- Ready → In Progress (in your local context as you work)
- In Progress → Complete (rename file: ready → complete, update frontmatter)
```
## Example Response Format
```
---
Issue #5: Missing Transaction Boundaries for Multi-Step Operations
Severity: 🔴 P1 (CRITICAL)
Category: Data Integrity / Security
Description: The google_oauth2_connected callback in GoogleOauthCallbacks concern performs multiple database operations without transaction protection. If any step fails midway, the database is left in an inconsistent state.
Location: app/controllers/concerns/google_oauth_callbacks.rb:13-50
Problem Scenario:
1. User.update succeeds (email changed)
2. Account.save! fails (validation error)
3. Result: User has changed email but no associated Account
4. Next login attempt fails completely
Operations Without Transaction:
- User confirmation (line 13)
- Waitlist removal (line 14)
- User profile update (line 21-23)
- Account creation (line 28-37)
- Avatar attachment (line 39-45)
- Journey creation (line 47)
Proposed Solution: Wrap all operations in ApplicationRecord.transaction do ... end block
Estimated Effort: Small (30 minutes)
---
Do you want to add this to the todo list?
1. yes - create todo file
2. next - skip this item
3. custom - modify before creating
```
## Important Implementation Details
### Status Transitions During Triage
**When "yes" is selected:**
1. Rename file: `{id}-pending-{priority}-{desc}.md` → `{id}-ready-{priority}-{desc}.md`
2. Update YAML frontmatter: `status: pending` → `status: ready`
3. Update Work Log with triage approval entry
4. Confirm: "✅ Approved: `{filename}` (Issue #{issue_id}) - Status: **ready**"
**When "next" is selected:**
1. Delete the todo file from todos/ directory
2. Skip to next item
3. No file remains in the system
### Progress Tracking
Every time you present a todo as a header, include:
- **Progress:** X/Y completed (e.g., "3/10 completed")
- **Estimated time remaining:** Based on how quickly you're progressing
- **Pacing:** Monitor time per finding and adjust estimate accordingly
Example:
```
Progress: 3/10 completed | Estimated time: ~2 minutes remaining
```
### Do Not Code During Triage
- ✅ Present findings
- ✅ Make yes/next/custom decisions
- ✅ Update todo files (rename, frontmatter, work log)
- ❌ Do NOT implement fixes or write code
- ❌ Do NOT add detailed implementation details
- ❌ That's for /resolve_todo_parallel phase
```
When done give these options
```markdown
What would you like to do next?
1. run /resolve_todo_parallel to resolve the todos
2. commit the todos
3. nothing, go chill
```

View File

@@ -0,0 +1,10 @@
---
name: workflows:brainstorm
description: "[DEPRECATED] Use /ce:brainstorm instead — renamed for clarity."
argument-hint: "[feature idea or problem to explore]"
disable-model-invocation: true
---
NOTE: /workflows:brainstorm is deprecated. Please use /ce:brainstorm instead. This alias will be removed in a future version.
/ce:brainstorm $ARGUMENTS

View File

@@ -0,0 +1,10 @@
---
name: workflows:compound
description: "[DEPRECATED] Use /ce:compound instead — renamed for clarity."
argument-hint: "[optional: brief context about the fix]"
disable-model-invocation: true
---
NOTE: /workflows:compound is deprecated. Please use /ce:compound instead. This alias will be removed in a future version.
/ce:compound $ARGUMENTS

View File

@@ -0,0 +1,10 @@
---
name: workflows:plan
description: "[DEPRECATED] Use /ce:plan instead — renamed for clarity."
argument-hint: "[feature description, bug report, or improvement idea]"
disable-model-invocation: true
---
NOTE: /workflows:plan is deprecated. Please use /ce:plan instead. This alias will be removed in a future version.
/ce:plan $ARGUMENTS

View File

@@ -0,0 +1,10 @@
---
name: workflows:review
description: "[DEPRECATED] Use /ce:review instead — renamed for clarity."
argument-hint: "[PR number, GitHub URL, branch name, or latest]"
disable-model-invocation: true
---
NOTE: /workflows:review is deprecated. Please use /ce:review instead. This alias will be removed in a future version.
/ce:review $ARGUMENTS

View File

@@ -0,0 +1,10 @@
---
name: workflows:work
description: "[DEPRECATED] Use /ce:work instead — renamed for clarity."
argument-hint: "[plan file, specification, or todo file path]"
disable-model-invocation: true
---
NOTE: /workflows:work is deprecated. Please use /ce:work instead. This alias will be removed in a future version.
/ce:work $ARGUMENTS