feat(ce-demo-reel): add demo reel skill with Python capture pipeline (#541)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Trevin Chow
Date: 2026-04-09 21:29:51 -07:00
Committed by: GitHub
parent f3cc7545e5
commit b979143ad0
18 changed files with 1796 additions and 751 deletions

View File

@@ -25,7 +25,10 @@ bun run release:validate # check plugin/marketplace consistency
- **Release versioning:** Releases are prepared by release automation, not normal feature PRs. The repo now has multiple release components (`cli`, `compound-engineering`, `coding-tutor`, `marketplace`). GitHub release PRs and GitHub Releases are the canonical release-notes surface for new releases; root `CHANGELOG.md` is only a pointer to that history. Use conventional titles such as `feat:` and `fix:` so release automation can classify change intent, but do not hand-bump release-owned versions or hand-author release notes in routine PRs.
- **Linked versions (cli + compound-engineering):** The `linked-versions` release-please plugin keeps `cli` and `compound-engineering` at the same version. This is intentional -- it simplifies version tracking across the CLI and the plugin it ships. A consequence is that a release with only plugin changes will still bump the CLI version (and vice versa). The CLI changelog may also include commits that `exclude-paths` would normally filter, because `linked-versions` overrides exclusion logic when forcing a synced bump. This is a known upstream release-please limitation, not a misconfiguration. Do not flag linked-version bumps as unnecessary.
- **Output Paths:** Keep OpenCode output at `opencode.json` and `.opencode/{agents,skills,plugins}`. For OpenCode, commands go to `~/.config/opencode/commands/<name>.md`; `opencode.json` is deep-merged (never overwritten wholesale).
- **Scratch Space:** Two options depending on what the files are for:
- **Workflow state** (`.context/`): Files that other skills or agents in the same session may need to read — plans in progress, gate files, inter-skill handoff artifacts. Namespace under `.context/compound-engineering/<workflow-or-skill-name>/`, add a per-run subdirectory when concurrent runs are plausible, and clean up after successful completion unless the user asked to inspect them or another agent still needs them.
- **Throwaway artifacts** (`mktemp -d`): Files consumed once and discarded — captured screenshots, stitched GIFs, intermediate build outputs, recordings. Use OS temp (`mktemp -d -t <prefix>-XXXXXX`) so they live outside the repo tree entirely. No `.gitignore` needed, no risk of accidental commits, OS handles cleanup.
- **Rule of thumb:** If another skill might read it, `.context/`. If it gets uploaded/consumed and thrown away, OS temp. Durable outputs like plans, specs, learnings, and docs belong in neither — they go in `docs/`.
- **Character encoding:**
- **Identifiers** (file names, agent names, command names): ASCII only -- converters and regex patterns depend on it.
- **Markdown tables:** Use pipe-delimited (`| col | col |`), never box-drawing characters.

View File

@@ -0,0 +1,123 @@
---
title: "Prefer Python over bash for multi-step pipeline scripts"
date: 2026-04-09
category: best-practices
module: "skill scripting / ce-demo-reel"
problem_type: best_practice
component: tooling
severity: medium
applies_when:
- Script orchestrates 2+ external CLI tools (ffmpeg, curl, silicon, vhs)
- Script needs retry logic or graceful degradation on tool failure
- Script will run on macOS where bash 3.2 is the default
- Script needs to be tested from a non-shell test runner (Bun, Jest, pytest)
- Script has conditional failure paths where some errors should be caught and others should abort
tags:
- bash-vs-python
- pipeline-scripts
- skill-scripting
- set-e-footguns
- error-handling
- ce-demo-reel
---
# Prefer Python over bash for multi-step pipeline scripts
## Context
When building the `ce-demo-reel` skill, the initial implementation used a bash script (`capture-evidence.sh`) to orchestrate ffmpeg stitching, frame normalization, and catbox.moe upload. Over 4 review rounds, the script hit 4 distinct bug classes that are inherent to bash's execution model rather than simple coding mistakes.
## Guidance
Use Python for agent pipeline scripts that chain multiple CLI tools with error handling. Bash `set -euo pipefail` works for simple sequential scripts but becomes a footgun when you need controlled failure paths.
**Python subprocess model (explicit error handling):**
```python
result = subprocess.run(
["curl", "-s", "-F", f"fileToUpload=@{file_path}", url],
capture_output=True, text=True, timeout=30, check=False
)
if result.returncode != 0:
# Retry logic runs normally
attempts += 1
continue
```
**Python timeout handling (explicit catch):**
```python
try:
result = subprocess.run(cmd, timeout=60)
except subprocess.TimeoutExpired:
# Controlled failure, not a crash
return subprocess.CompletedProcess(cmd, returncode=1, stdout="", stderr="Timed out")
```
**Bash equivalent (the footgun):**
```bash
set -euo pipefail
# Exits the entire script before retry logic runs
url=$(curl -s -F "fileToUpload=@${file}" "$endpoint")
# Never reaches here on curl failure
# Workaround: || true on every line that might fail
url=$(curl -s -F "fileToUpload=@${file}" "$endpoint") || true
# Works but fragile and easy to forget
```
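Putting the subprocess pattern together, a minimal retry wrapper might look like this (a sketch with a hypothetical `run_with_retry` helper, not code from the skill):

```python
import subprocess

def run_with_retry(cmd, max_attempts=3, timeout=30):
    """Run cmd, retrying on non-zero exit or timeout.

    Returns the CompletedProcess on success, None after exhausting retries.
    """
    for _ in range(max_attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True,
                timeout=timeout, check=False,
            )
        except subprocess.TimeoutExpired:
            continue  # controlled failure: move on to the next attempt
        if result.returncode == 0:
            return result
        # non-zero exit: fall through to the next attempt
    return None
```

Every failure path is an ordinary value or exception the script can branch on; nothing aborts the interpreter the way `set -e` aborts the shell.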
## Why This Matters
Agent pipeline scripts run in environments the skill author does not control: different macOS versions (bash 3.2 vs 5.x), CI containers, worktrees. Each bash portability issue requires a non-obvious workaround that reviewers must catch. Python's subprocess model makes error handling explicit and testable rather than implicit and version-dependent.
The 4 bugs found were not unusual. They are the predictable consequence of using bash for scripts that exceed its sweet spot.
## When to Apply
Use Python when:
- The script orchestrates 2+ external CLI tools
- The script needs retry logic or graceful degradation on tool failure
- The script will run on macOS where bash 3.2 is the default
- The script needs to be tested from a non-shell test runner
- The script has more than ~3 subcommands
Bash is still the right choice for:
- Simple sequential scripts with no error recovery (`set -e` is fine)
- One-liner wrappers around a single tool
- Scripts using only POSIX features with no array manipulation
- Git hooks and CI steps where the only failure mode is "abort the pipeline"
## Examples
**Before (bash, 4 bugs across 4 review rounds):**
| Bug | Cause | Workaround needed |
|---|---|---|
| `url=$(curl ...)` exits on network failure | `set -e` + command substitution | `\|\| true` on every line |
| `${array[-1]}` fails | Bash 3.2 lacks negative indexing | `${array[${#array[@]}-1]}` |
| Frame reduction keeps all frames for n=3,4 | Integer math: `step=(n-1)/2` with min 1 | Minimum step of 2 |
| `command -v ffmpeg` in Bun tests | `command` is a shell builtin, not spawnable | Use `which` instead |
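For reference, the bash 3.2 last-element workaround from the table looks like this (a minimal sketch):

```bash
# Bash 3.2 has no ${frames[-1]}; compute the last index explicitly
frames=(frame-01.png frame-02.png frame-03.png)
last="${frames[${#frames[@]}-1]}"
echo "$last"
```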
**After (Python, all 4 bug classes eliminated):**
```python
# Negative indexing just works
last = frames[-1]
# Timeout handling is explicit
try:
result = subprocess.run(cmd, timeout=30)
except subprocess.TimeoutExpired:
return None
# Tool detection is a regular function
if not shutil.which("ffmpeg"):
sys.exit("ffmpeg not found")
# Math is straightforward
step = max(2, (len(frames) - 1) // 2)
```
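The frame-reduction row deserves a closer look; a simplified illustration of the integer-math bug (not the actual script logic):

```python
frames = ["f1.png", "f2.png", "f3.png"]

# Buggy: minimum step of 1 -- frames[::1] keeps every frame, so n=3,4 never shrink
buggy_step = max(1, (len(frames) - 1) // 2)
# Fixed: minimum step of 2 guarantees reduction for any frame count above 2
fixed_step = max(2, (len(frames) - 1) // 2)

assert frames[::buggy_step] == frames           # no reduction happened
assert len(frames[::fixed_step]) < len(frames)  # reduction happened
```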
## Related
- `docs/solutions/skill-design/script-first-skill-architecture.md`: covers when to use scripts vs agent logic (complementary: that doc answers "should a script do this?", this doc answers "which language?")
- `docs/solutions/agent-friendly-cli-principles.md`: CLI design from the consumer side (overlaps on exit code and stderr patterns)

View File

@@ -1,147 +0,0 @@
---
title: "Persistent GitHub authentication for agent-browser using named sessions"
category: integrations
date: 2026-03-22
tags:
- agent-browser
- github
- authentication
- chrome
- session-persistence
- lightpanda
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
- plugins/compound-engineering/skills/agent-browser/references/session-management.md
---
# agent-browser Chrome Authentication for GitHub
## Problem
agent-browser needs authenticated access to GitHub for workflows like the native video
upload in the feature-video skill. Multiple authentication approaches were evaluated
before finding one that works reliably with 2FA, SSO, and OAuth.
## Investigation
| Approach | Result |
|---|---|
| `--profile` flag | Lightpanda (default engine on some installs) throws "Profiles are not supported with Lightpanda". Must use `--engine chrome`. |
| Fresh Chrome profile | No GitHub cookies. Shows "Sign up for free" instead of comment form. |
| `--auto-connect` | Requires Chrome pre-launched with `--remote-debugging-port`. Error: "No running Chrome instance found" in normal use. Impractical. |
| Auth vault (`auth save`/`auth login`) | Cannot handle 2FA, SSO, or OAuth redirects. Only works for simple username/password forms. |
| `--session-name` with Chrome engine | Cookies auto-save/restore. One-time headed login handles any auth method. **This works.** |
## Working Solution
### One-time setup (headed, user logs in manually)
```bash
# Close any running daemon (ignores engine/option changes when reused)
agent-browser close
# Open GitHub login in headed Chrome with a named session
agent-browser --engine chrome --headed --session-name github open https://github.com/login
# User logs in manually -- handles 2FA, SSO, OAuth, any method
# Verify auth
agent-browser open https://github.com/settings/profile
# If profile page loads, auth is confirmed
```
### Session validity check (before each workflow)
```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
# Title contains username or "Profile" -> session valid, proceed
# Title contains "Sign in" or URL is github.com/login -> session expired, re-auth
```
### All subsequent runs (headless, cookies persist)
```bash
agent-browser --engine chrome --session-name github open https://github.com/...
```
## Key Findings
### Engine requirement
MUST use `--engine chrome`. Lightpanda does not support profiles, session persistence,
or state files. Any workflow that uses `--session-name`, `--profile`, `--state`, or
`state save/load` requires the Chrome engine.
Include `--engine chrome` explicitly in every command that uses an authenticated session.
Do not rely on environment defaults -- `AGENT_BROWSER_ENGINE` may be set to `lightpanda`
in some environments.
### Daemon restart
Must run `agent-browser close` before switching engine or session options. A running
daemon ignores new flags like `--engine`, `--headed`, or `--session-name`.
### Session lifetime
Cookies expire when GitHub invalidates them (typically weeks). Periodic re-authentication
is required. The feature-video skill handles this by checking session validity before
the upload step and prompting for re-auth only when needed.
### Auth vault limitations
The auth vault (`agent-browser auth save`/`auth login`) can only handle login forms with
visible username and password fields. It cannot handle:
- 2FA (TOTP, SMS, push notification)
- SSO with identity provider redirect
- OAuth consent flows
- CAPTCHA
- Device verification prompts
For GitHub and most modern services, use the one-time headed login approach instead.
### `--auto-connect` viability
Impractical for automated workflows. Requires Chrome to be pre-launched with
`--remote-debugging-port=9222`, which is not how users normally run Chrome.
## Prevention
### Skills requiring auth must declare engine
State the engine requirement in the Prerequisites section of any skill that needs
browser auth. Include `--engine chrome` in every `agent-browser` command that touches
an authenticated session.
### Session check timing
Perform the session check immediately before the step that needs auth, not at skill
start. A session valid at start may expire during a long workflow (video encoding can
take minutes).
### Recovery without restart
When expiry is detected at upload time, the video file is already encoded. Recovery:
re-authenticate, then retry only the upload step. Do not restart from the beginning.
### Concurrent sessions
Use `--session-name` with a semantically descriptive name (e.g., `github`) when multiple
skills or agents may run concurrently. Two concurrent runs sharing the default session
will interfere with each other.
### State file security
Session state files in `~/.agent-browser/sessions/` contain cookies in plaintext.
Do not commit to repositories. Add to `.gitignore` if the session directory is inside
a repo tree.
## Integration Points
This pattern is used by:
- `feature-video` skill (GitHub native video upload)
- Any future skill requiring authenticated GitHub browser access
- Potential use for other OAuth-protected services (same pattern, different session name)

View File

@@ -1,141 +0,0 @@
---
title: "GitHub inline video embedding via programmatic browser upload"
category: integrations
date: 2026-03-22
tags:
- github
- video-embedding
- agent-browser
- playwright
- feature-video
- pr-description
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
---
# GitHub Native Video Upload for PRs
## Problem
Embedding video demos in GitHub PR descriptions required external storage (R2/rclone)
or GitHub Release assets. Release asset URLs render as plain download links, not inline
video players. Only `user-attachments/assets/` URLs render with GitHub's native inline
video player -- the same result as pasting a video into the PR editor manually.
The distinction is absolute:
| URL namespace | Rendering |
|---|---|
| `github.com/releases/download/...` | Plain download link (bad UX, triggers download on mobile) |
| `github.com/user-attachments/assets/...` | Native inline `<video>` player with controls |
## Investigation
1. **Public upload API** -- No public API exists. The `/upload/policies/assets` endpoint
requires browser session cookies and is not exposed via REST or GraphQL. GitHub CLI
(`gh`) has no support; issues cli/cli#1895, #4228, and #4465 are all closed as
"not planned". GitHub keeps this private to limit abuse surface (malware hosting,
spam CDN, DMCA liability).
2. **Release asset approach (Strategy B)** -- URLs render as download links, not video
players. Clickable GIF previews trigger downloads on mobile. Unacceptable UX.
3. **Claude-in-Chrome JavaScript injection with base64** -- Blocked by CSP/mixed-content
policy. HTTPS github.com cannot fetch from HTTP localhost. Base64 chunking is possible
but does not scale for larger videos.
4. **`tonkotsuboy/github-upload-image-to-pr`** -- Open-source reference confirming
browser automation is the only working approach for producing native URLs.
5. **agent-browser `upload` command** -- Works. Playwright sets files directly on hidden
file inputs without base64 encoding or fetch requests. CSP is not a factor because
Playwright's `setInputFiles` operates at the browser engine level, not via JavaScript.
## Working Solution
### Upload flow
```bash
# Navigate to PR page (authenticated Chrome session)
agent-browser --engine chrome --session-name github \
open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000
# Upload video via the hidden file input
agent-browser upload '#fc-new_comment_field' tmp/videos/feature-demo.mp4
# Wait for GitHub to process the upload (typically 3-5 seconds)
agent-browser wait 5000
# Extract the URL GitHub injected into the textarea
agent-browser eval "document.getElementById('new_comment_field').value"
# Returns: https://github.com/user-attachments/assets/[uuid]
# Clear the textarea without submitting (upload already persisted server-side)
agent-browser eval "const ta = document.getElementById('new_comment_field'); \
ta.value = ''; ta.dispatchEvent(new Event('input', { bubbles: true }))"
# Embed in PR description (URL on its own line renders as inline video player)
gh pr edit [number] --body "[body with video URL on its own line]"
```
### Key selectors (validated March 2026)
| Selector | Element | Purpose |
|---|---|---|
| `#fc-new_comment_field` | Hidden `<input type="file">` | Target for `agent-browser upload`. Accepts `.mp4`, `.mov`, `.webm` and many other types. |
| `#new_comment_field` | `<textarea>` | GitHub injects the `user-attachments/assets/` URL here after processing the upload. |
GitHub's comment form contains the hidden file input. After Playwright sets the file,
GitHub uploads it server-side and injects a markdown URL into the textarea. The upload
is persisted even if the form is never submitted.
## What Was Removed
The following approaches were removed from the feature-video skill:
- R2/rclone setup and configuration
- Release asset upload flow (`gh release upload`)
- GIF preview generation (unnecessary with native inline video player)
- Strategy B fallback logic
Total: approximately 100 lines of SKILL.md content removed. The skill is now simpler
and has zero external storage dependencies.
## Prevention
### URL validation
After any upload step, confirm the extracted URL contains `user-attachments/assets/`
before writing it into the PR description. If the URL does not match, the upload failed
or used the wrong method.
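A minimal guard, sketched with a placeholder URL (the real value comes from the textarea extraction step):

```bash
url="https://github.com/user-attachments/assets/placeholder-uuid"  # placeholder
case "$url" in
  *user-attachments/assets/*)
    echo "native inline URL confirmed" ;;
  *)
    echo "upload failed or produced the wrong URL namespace" >&2
    exit 1 ;;
esac
```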
### Upload failure handling
If the textarea is empty after the wait, check:
1. Session validity (did GitHub redirect to login?)
2. Wait time (processing can be slow under load -- retry after 3-5 more seconds)
3. File size (10MB free, 100MB paid accounts)
Do not silently substitute a release asset URL. Report the failure and offer to retry.
### DOM selector fragility
`#fc-new_comment_field` and `#new_comment_field` are GitHub's internal element IDs and
may change in future UI updates. If the upload stops working, snapshot the PR page and
inspect the current comment form structure for updated selectors.
### Size limits
- Free accounts: 10MB per file
- Paid (Pro, Team, Enterprise): 100MB per file
Check file size before attempting upload. Re-encode at lower quality if needed.
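The size check can be sketched as a small helper (hypothetical, not part of the skill):

```bash
# Returns 0 if the file fits GitHub's inline limit for free accounts, 1 otherwise
fits_inline_limit() {
  limit=$((10 * 1024 * 1024))   # 10 MB; use 100 MB for paid accounts
  size=$(wc -c < "$1")          # wc -c is portable; stat flags differ on macOS vs Linux
  [ "$size" -le "$limit" ]
}
```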
## References
- GitHub CLI issues: cli/cli#1895, #4228, #4465 (all closed "not planned")
- `tonkotsuboy/github-upload-image-to-pr` -- reference implementation
- GitHub Community Discussions: #29993, #46951, #28219

View File

@@ -45,7 +45,7 @@ The primary entry points for engineering work, invoked as slash commands:
| Skill | Description |
|-------|-------------|
| `/changelog` | Create engaging changelogs for recent merges |
| `/ce-demo-reel` | Capture a visual demo reel (GIF demos, terminal recordings, screenshots) for PRs with project-type-aware tier selection |
| `/reproduce-bug` | Reproduce bugs using logs and console |
| `/report-bug-ce` | Report a bug in the compound-engineering plugin |
| `/resolve-pr-feedback` | Resolve PR review feedback in parallel |

View File

@@ -0,0 +1,168 @@
---
name: ce-demo-reel
description: "Capture a visual demo reel (GIF, terminal recording, screenshots) for PR descriptions. Use when shipping UI changes, CLI features, or any work with observable behavior that benefits from visual proof. Also use when asked to add a demo, record a GIF, screenshot a feature, show what changed visually, create a demo reel, capture evidence, add proof to a PR, or create a before/after comparison."
argument-hint: "[what to capture, e.g. 'the new settings page' or 'CLI output of the migrate command']"
---
# Demo Reel
Detect project type, recommend a capture tier, record visual evidence, upload to a public URL, and return an evidence URL and description for PR inclusion.
**Evidence means USING THE PRODUCT, not running tests.** "I ran npm test" is test evidence. Evidence capture is running the actual CLI command, opening the web app, making the API call, or triggering the feature. The distinction is absolute -- test output is never labeled "Demo" or "Screenshots."
If real product usage is impractical (requires API keys, cloud deploy, paid services, bot tokens), say so explicitly: "Real evidence would require [X]. Recommending [fallback approach] instead." Do not silently skip to "no evidence needed" or substitute test output.
Never generate fake or placeholder image/GIF URLs. If upload fails, report the failure.
## Arguments
Parse `$ARGUMENTS`:
- **What to capture**: A description of the feature or behavior to demonstrate. If provided, use it to guide which pages to visit, commands to run, or states to capture.
- If blank, infer what to capture from recoverable branch or PR context. If the target remains ambiguous after that, ask the user what they want to demonstrate before proceeding.
## Step 0: Discover Capture Target
Treat target discovery as stateless and branch-aware. The agent may be invoked in a fresh session after the work was already done, so do not rely on conversation history or assume the caller knows the right artifact.
If invoked by another skill, treat the caller-provided target as a hint, not proof. Rerun target discovery and validation before capturing anything.
Use the lightest available context to identify the best evidence target:
- Current branch name
- Open PR title and description, if one exists
- Changed files and diff against the base branch
- Recent commits
- A plan file only when it is obviously referenced by the branch, PR, arguments, or caller context
Form a capture hypothesis: "The best evidence appears to be [behavior]."
Proceed without asking only when there is exactly one high-confidence observable behavior and a plausible way to exercise it from the workspace. Ask the user what to demonstrate when multiple behaviors are plausible, the diff does not reveal how to exercise the behavior, or the requested target cannot be mapped to a product surface.
Skip evidence with a clear reason when the diff is docs-only, markdown-only, config-only, CI-only, test-only, or a pure internal refactor with no observable output change.
## Step 1: Exercise the Feature
Before capturing anything, verify the feature works by actually using it:
- **CLI tool**: Run the new/changed command and confirm the output is correct
- **Web app**: Navigate to the new/changed page and confirm it renders correctly
- **Library**: Run example code using the new/changed API
- **Bug fix**: Reproduce the original bug scenario and confirm it's fixed
Use the workspace where the feature was built. Do not reinstall from scratch. If setup requires credentials or services, use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini) to ask the user.
## Step 2: Detect Project Type
Use the capture target from Step 0 to decide which directory to classify. If the diff touches a specific subdirectory with its own package manifest (e.g., `packages/cli/`, `apps/web/`), pass that as the root. Otherwise use the repo root.
```bash
python3 scripts/capture-demo.py detect --repo-root [TARGET_DIR]
```
This outputs JSON with `type` and `reason`. The result is a signal, not a gate. If the agent's understanding from Step 0 contradicts the script's classification (e.g., the diff clearly changes CLI behavior but the repo root classifies as `web-app` because of a sibling Next.js app), the agent's judgment wins.
## Step 3: Assess Change Type
Step 0 already handled the "no observable behavior" early exit. This step classifies changes that DO have observable behavior into `motion` or `states` to guide tier selection.
If arguments describe what to capture, classify based on the description. Otherwise, use the diff context from Step 0.
**Change classification:**
1. **Involves motion or interaction?** (animations, typing flows, drag-and-drop, real-time updates, continuous CLI output) -> classify as `motion`.
2. **Involves discrete states?** (before/after UI, new page, command with output, API response) -> classify as `states`.
| Change characteristic | Classification |
|---|---|
| Animations, typing, drag-and-drop, streaming output | `motion` |
| New UI, before/after, command output, API responses | `states` |
**Feature vs bug fix -- what to demonstrate:**
- **New feature (`feat`)**: Demonstrate the feature working. Show the hero moment -- the feature doing its thing.
- **Bug fix (`fix`)**: Show before AND after. Reproduce the original broken state (if possible) then show the fix. If the broken state can't be reproduced (already fixed in the workspace), capture the fixed state and describe what was broken.
Infer feat vs fix from commit messages, branch name, or plan file frontmatter (`type: feat` or `type: fix`). If unclear, ask.
## Step 4: Tool Preflight
Run the preflight check:
```bash
python3 scripts/capture-demo.py preflight
```
This outputs JSON with boolean availability for each tool: `agent_browser`, `vhs`, `silicon`, `ffmpeg`, `ffprobe`. Print a human-readable summary for the user based on the result, noting install commands for missing tools (e.g., `brew install charmbracelet/tap/vhs` for vhs, `brew install silicon` for silicon, `brew install ffmpeg` for ffmpeg).
## Step 5: Create Run Directory
Create a per-run scratch directory in the OS temp location:
```bash
mktemp -d -t demo-reel-XXXXXX
```
Use the output as `RUN_DIR`. Pass this concrete run directory to every tier reference. Evidence artifacts are ephemeral — they get uploaded to a public URL and then discarded. The OS temp directory is the right place for them, not the repo tree.
## Step 6: Recommend Tier and Ask User
Run the recommendation script with the project type from Step 2, change classification from Step 3, and preflight JSON from Step 4:
```bash
python3 scripts/capture-demo.py recommend --project-type [TYPE] --change-type [motion|states] --tools '[PREFLIGHT_JSON]'
```
This outputs JSON with `recommended` (the best tier), `available` (list of tiers whose tools are present), and `reasoning`.
Present the available tiers to the user via the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Mark the recommended tier. Always include "No evidence needed" as a final option.
**Question:** "How should evidence be captured for this change?"
**Options** (show only tiers from the `available` list, order by recommendation):
1. **Browser reel** -- Agent-browser screenshots stitched into animated GIF. Best for web apps.
2. **Terminal recording** -- VHS terminal recording to GIF. Best for CLI tools with interaction/motion.
3. **Screenshot reel** -- Styled terminal frames stitched into animated GIF. Best for discrete CLI steps.
4. **Static screenshots** -- Individual PNGs. Fallback when other tools are unavailable.
5. **No evidence needed** -- The diff speaks for itself. Best for text-only or config changes.
If the question tool is unavailable (background agent, batch mode), present the numbered options and wait for the user's reply before proceeding.
## Step 7: Execute Selected Tier
Carry the capture hypothesis from Step 0 and the feature exercise results from Step 1 into tier execution — these determine which specific pages to visit, commands to run, or states to screenshot. Substitute `[RUN_DIR]` in the tier reference with the concrete path from Step 5.
Load the appropriate reference file for the selected tier:
- **Browser reel** -> Read `references/tier-browser-reel.md`
- **Terminal recording** -> Read `references/tier-terminal-recording.md`
- **Screenshot reel** -> Read `references/tier-screenshot-reel.md`
- **Static screenshots** -> Read `references/tier-static-screenshots.md`
- **No evidence needed** -> Skip to output. Set `evidence_url` to null, `evidence_label` to null.
**Runtime failure fallback:** If the selected tier fails during execution (tool crashes, server not accessible, recording produces empty output), fall back to the next available tier rather than failing entirely. The fallback order is: browser reel -> static screenshots, terminal recording -> screenshot reel -> static screenshots, screenshot reel -> static screenshots. Static screenshots is the terminal fallback -- if even that fails, report the failure and let the user decide.
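One way to encode the fallback chain above (the skill states it in prose; this mapping is illustrative, not part of the scripts):

```python
# Fallback order when a tier fails at runtime; static screenshots is terminal
FALLBACKS = {
    "browser-reel": ["static-screenshots"],
    "terminal-recording": ["screenshot-reel", "static-screenshots"],
    "screenshot-reel": ["static-screenshots"],
    "static-screenshots": [],  # nothing left: report the failure to the user
}
```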
## Step 8: Upload and Approval
After the selected tier produces an artifact, read `references/upload-and-approval.md` for upload to a public host, user approval gate, and markdown embed generation.
## Output
Return these values to the caller (e.g., git-commit-push-pr):
```
=== Evidence Capture Complete ===
Tier: [browser-reel / terminal-recording / screenshot-reel / static / skipped]
Description: [1 sentence describing what the evidence shows]
URL: [public URL or "none" (multiple URLs comma-separated for static screenshots)]
=== End Evidence ===
```
The `Description` is a 1-line summary derived from the capture hypothesis in Step 0 (e.g., "CLI detect command classifying 3 project types and recommending capture tiers"). The caller decides how to format the URL(s) into the PR description.
- `Tier: skipped` or `URL: "none"` means no evidence was captured.
**Label convention:**
- Browser reel, terminal recording, screenshot reel: label as "Demo"
- Static screenshots: label as "Screenshots"
- The caller applies the label when formatting. ce-demo-reel does not generate markdown.
- Test output is never labeled "Demo" or "Screenshots"

View File

@@ -0,0 +1,105 @@
# Tier: Browser Reel
Capture 3-5 browser screenshots at key UI states and stitch into an animated GIF.
**Best for:** Web apps, desktop apps accessible via localhost or CDP.
**Output:** GIF (PNG screenshots stitched via ffmpeg two-pass palette)
**Label:** "Demo"
**Required tools:** agent-browser, ffmpeg
## Step 1: Connect to the Application
**For web apps** -- verify the dev server is accessible:
- Read `package.json` `scripts` for `dev`, `start`, `serve` commands
- Check `Procfile`, `Procfile.dev`, or `bin/dev` if they exist
- Check `Gemfile` for Rails (`bin/rails server`) or Sinatra
- Check for running processes on common ports (3000, 5000, 8080)
If the server is not running, tell the user what start command was detected and ask them to start it. Do not start it automatically (it may require environment variables, database setup, etc.).
If the server cannot be reached after the user confirms it should be running, fall back to static screenshots tier.
Once accessible, note the base URL (e.g., `http://localhost:3000`).
**For Electron/desktop apps** -- connect via Chrome DevTools Protocol (CDP):
1. Check if the app is already running with CDP enabled by probing common ports:
```bash
curl -s http://localhost:9222/json/version
```
If that returns a JSON response, the app is ready -- connect agent-browser to it:
```bash
agent-browser connect 9222
```
2. If not running, the app needs to be launched with `--remote-debugging-port`. Detect the entry point from `package.json` (look for the `main` field or `electron` in scripts), then ask the user to launch it with:
```
your-electron-app --remote-debugging-port=9222
```
If port 9222 is busy, try 9223-9230.
3. Poll until CDP is ready (timeout after 30 seconds):
```bash
curl -s http://localhost:9222/json/version
```
4. Connect agent-browser:
```bash
agent-browser connect 9222
```
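Steps 3-4 can be sketched as a small polling helper. This is an illustrative sketch, not part of the skill's pipeline: the 30-second timeout and `/json/version` endpoint come from the steps above, while the `wait_for_cdp` helper name is hypothetical.

```python
import json
import time
import urllib.request

def wait_for_cdp(port=9222, timeout=30):
    """Poll the CDP version endpoint until the app responds or the timeout expires."""
    url = f"http://localhost:{port}/json/version"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return json.load(resp)  # CDP is ready; returns browser version metadata
        except OSError:
            time.sleep(1)  # not listening yet; retry
    return None  # caller falls back to static screenshots
```

If this returns metadata, run `agent-browser connect 9222`; if it returns `None`, fall back to the static screenshots tier.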
**CDP advantages:** Screenshots come from the renderer's frame buffer, not macOS screen capture -- no Accessibility or Screen Recording permissions needed.
**If CDP connection fails:** Fall back to static screenshots tier. Tell the user: "Could not connect to the app via CDP. Falling back to static screenshots."
## Step 2: Capture Screenshots
Navigate to the relevant pages and capture 3-5 screenshots at key UI states:
1. **Initial/empty state** -- Before the feature is used
2. **Navigation** -- How the user reaches the feature (if not the landing page)
3. **Feature in action** -- The hero shot showing the feature working
4. **Result state** -- After interaction (data present, items created, success message)
5. **Detail view** (optional) -- Expanded item, settings panel, modal
For each screenshot, write to the actual `RUN_DIR` path created by the parent skill (substitute it for the placeholder below):
```bash
agent-browser open [URL]
```
```bash
agent-browser wait 2000
```
```bash
agent-browser screenshot [RUN_DIR]/frame-01-initial.png
```
**Capture tips:**
- Use URL navigation (`agent-browser open URL`) rather than clicking SPA elements (clicks often fail on React/Vue/Svelte SPAs)
- Wait 2-3 seconds after navigation for the page to settle
- Capture the full viewport (the sidebar and header give reviewers context)
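The capture loop above can be sketched as a thin wrapper over the agent-browser CLI. The `plan_capture` helper and the `pages` mapping are hypothetical names assumed for illustration; the CLI subcommands are the ones shown above.

```python
import subprocess

def plan_capture(base_url, run_dir, pages):
    """Build the agent-browser command sequence for a set of frames.

    `pages` maps a frame slug to a URL path, e.g. {"01-initial": "/"}.
    """
    cmds = []
    for slug, path in pages.items():
        cmds.append(["agent-browser", "open", base_url + path])
        cmds.append(["agent-browser", "wait", "2000"])  # let the page settle
        cmds.append(["agent-browser", "screenshot", f"{run_dir}/frame-{slug}.png"])
    return cmds

def capture_frames(base_url, run_dir, pages):
    """Execute the planned commands in order, failing fast on any error."""
    for cmd in plan_capture(base_url, run_dir, pages):
        subprocess.run(cmd, check=True)
```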
## Step 3: Stitch into GIF
Use the capture pipeline script to normalize frame dimensions, stitch with two-pass palette, and auto-reduce if over 10 MB:
```bash
python3 scripts/capture-demo.py stitch [RUN_DIR]/demo.gif [RUN_DIR]/frame-*.png
```
The script handles dimension normalization (via ffprobe + ffmpeg padding), concat demuxer stitching, palette generation, and automatic frame reduction if the GIF exceeds GitHub's 10 MB inline limit. Default is 3 seconds per frame. To adjust:
```bash
python3 scripts/capture-demo.py stitch --duration 2.0 [RUN_DIR]/demo.gif [RUN_DIR]/frame-*.png
```
**If stitching fails:** Fall back to static screenshots tier using the individual PNGs already captured. If no PNGs were captured, report the failure.
## Step 4: Cleanup
After successful GIF creation, remove individual PNG frames. Keep only the final GIF for upload.
Proceed to `references/upload-and-approval.md`.


@@ -0,0 +1,61 @@
# Tier: Screenshot Reel
Render styled terminal frames from text and stitch into an animated GIF. Each frame shows one step of a CLI demo (command + output).
**Best for:** CLI tools shown as discrete steps (command -> output -> next command -> output). Also useful when VHS breaks on quoting or special characters.
**Output:** GIF (silicon PNGs stitched via ffmpeg)
**Label:** "Demo"
**Required tools:** silicon, ffmpeg
## Step 1: Write Demo Content
Create a text file with `---` delimiters between frames. Each frame shows the terminal state for one step:
Write to `[RUN_DIR]/demo-steps.txt`:
```
$ your-cli-command --flag value
Output line 1
Output line 2
Success: feature works correctly
---
$ your-cli-command --another-flag
Different output showing another aspect
Result: 42 items processed
---
$ your-cli-command --verify
All checks passed
```
**Tips:**
- Include the `$` prompt to show what the user types
- Keep each frame under ~80 characters wide for readability
- 3-5 frames is ideal -- enough to tell the story, not so many the GIF is huge
- Strip unicode characters that silicon's default font can't render (checkmarks, fancy arrows)
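One way to strip glyphs that silicon's default font may not render is a small sanitizer. This is a sketch with an assumed replacement table, not an exhaustive mapping; the `sanitize_frame_text` helper name is hypothetical.

```python
def sanitize_frame_text(text):
    """Replace common non-ASCII glyphs with ASCII stand-ins, then drop the rest."""
    replacements = {
        "\u2713": "ok",   # check mark
        "\u2714": "ok",   # heavy check mark
        "\u2192": "->",   # rightwards arrow
        "\u2026": "...",  # ellipsis
    }
    for ch, ascii_sub in replacements.items():
        text = text.replace(ch, ascii_sub)
    return "".join(c for c in text if ord(c) < 128)
```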
## Step 2: Split into Frame Files
Split the demo content on `---` lines into separate text files, one per frame:
- `[RUN_DIR]/frame-001.txt`
- `[RUN_DIR]/frame-002.txt`
- `[RUN_DIR]/frame-003.txt`
- etc.
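The split can be done with a few lines of Python. The `split_frames` helper is illustrative; the frame numbering follows the file names above.

```python
from pathlib import Path

def split_frames(steps_file, run_dir):
    """Split demo-steps.txt on lines that are exactly '---' into frame-NNN.txt files."""
    frames, current = [], []
    for line in Path(steps_file).read_text(encoding="utf-8").splitlines():
        if line.strip() == "---":
            frames.append("\n".join(current))
            current = []
        else:
            current.append(line)
    frames.append("\n".join(current))
    paths = []
    for i, frame in enumerate((f.strip() for f in frames if f.strip()), start=1):
        out = Path(run_dir) / f"frame-{i:03d}.txt"
        out.write_text(frame + "\n", encoding="utf-8")
        paths.append(out)
    return paths
```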
## Step 3: Render and Stitch
Use the capture pipeline script to render each text frame through silicon and stitch into an animated GIF in a single call:
```bash
python3 scripts/capture-demo.py screenshot-reel --output [RUN_DIR]/demo.gif --duration 2.5 --text [RUN_DIR]/frame-001.txt [RUN_DIR]/frame-002.txt [RUN_DIR]/frame-003.txt
```
The script handles silicon rendering, dimension normalization, two-pass palette generation, and automatic frame reduction if the GIF exceeds limits. Default duration is 2.5 seconds per frame (faster than browser reels since terminal frames are quicker to read).
**If the script fails** (silicon rendering error, stitching error, empty output): fall back to static screenshots tier. Include the raw terminal output as a code block in the PR description instead. Label as "Terminal output", not "Screenshots".
## Step 4: Cleanup
Remove individual PNGs and text files. Keep only the final GIF for upload.
Proceed to `references/upload-and-approval.md`.


@@ -0,0 +1,55 @@
# Tier: Static Screenshots
Capture individual PNG screenshots. No animation, no stitching.
**Best for:** Fallback when other tools are unavailable, library demos, or features where animation doesn't add value.
**Output:** PNG files
**Label:** "Screenshots"
**Required tools:** Varies (agent-browser for web, silicon for CLI, or native screenshot)
## Capture by Project Type
### Web app or desktop app (agent-browser available)
```bash
agent-browser open [URL]
```
```bash
agent-browser wait 2000
```
```bash
agent-browser screenshot [RUN_DIR]/screenshot-01.png
```
Capture 1-3 screenshots: before state, feature in action, result state.
### CLI tool (silicon available)
Run the command, capture its output to a text file, then render with silicon:
```bash
silicon [RUN_DIR]/output.txt -o [RUN_DIR]/screenshot-01.png --theme Dracula -l bash --pad-horiz 20 --pad-vert 20
```
### CLI tool (no silicon)
Run the command and capture the raw terminal output. Include the output as a code block in the PR description instead of an image. Label it as "Terminal output", never "Screenshot".
### Library
Run example code that exercises the new API. Capture the output as above (silicon if available, code block if not).
## Upload
Each PNG is uploaded individually. Follow `references/upload-and-approval.md` for each file, or upload them all and present the results together for a single approval.
For multiple screenshots, the markdown embed uses multiple image lines:
```markdown
## Screenshots
![Before](url-1)
![After](url-2)
```
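As a sketch of what the caller might do with the uploaded URLs (ce-demo-reel itself does not generate markdown; the `screenshots_markdown` helper is hypothetical):

```python
def screenshots_markdown(urls_with_labels):
    """Format uploaded screenshot URLs as a PR-description markdown section."""
    lines = ["## Screenshots"]
    for label, url in urls_with_labels:
        lines.append(f"![{label}]({url})")
    return "\n".join(lines)
```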


@@ -0,0 +1,88 @@
# Tier: Terminal Recording
Record a terminal session using VHS (charmbracelet/vhs) to produce a GIF demo.
**Best for:** CLI tools, scripts, command-line features with interaction or motion (typing, streaming output, progressive rendering).
**Output:** GIF (direct from VHS)
**Label:** "Demo"
**Required tools:** vhs
## Step 1: Plan the Recording
Before generating a .tape file, determine:
- **What command(s) to run** -- The actual product command, not test commands. "I ran npm test" is test evidence, not a demo.
- **Expected output** -- What the terminal should show when the command succeeds.
- **Terminal dimensions** -- Wide enough for the longest output line, tall enough to avoid scrolling.
- **Timing** -- Target 5-10 seconds total. Enough sleep after each command for output to render.
## Step 2: Generate .tape File
Write a VHS tape file to `[RUN_DIR]/demo.tape`:
```tape
Output [RUN_DIR]/demo.gif
Set FontSize 16
Set Width 800
Set Height 500
Set Theme "Catppuccin Mocha"
Set TypingSpeed 40ms
# Hide boring setup
Hide
Type "cd /path/to/project"
Enter
Sleep 500ms
Show
# The demo
Type "your-cli-command --flag value"
Sleep 500ms
Enter
Sleep 3s
# Let viewer read the output
Sleep 2s
```
**Key .tape directives:**
- `Output [path]` -- Where to write the GIF (must be first line)
- `Set FontSize [14-18]` -- Larger for readability
- `Set Width/Height [pixels]` -- Match content needs
- `Set Theme [name]` -- "Catppuccin Mocha" or "Dracula" are readable defaults
- `Set TypingSpeed [ms]` -- 30-50ms feels natural
- `Hide`/`Show` -- Skip boring setup (cd, source, npm install)
- `Type [text]` -- Types characters (does not execute)
- `Enter` -- Presses enter (executes the typed command)
- `Sleep [duration]` -- Wait for output to render
**Avoid:**
- Non-deterministic output (random IDs, timestamps that change between runs)
- Commands that require interactive input (prompts, password entry)
- Very long output that scrolls off screen
## Step 3: Run VHS
Use the capture pipeline script to execute the tape file and validate output:
```bash
python3 scripts/capture-demo.py terminal-recording --output [RUN_DIR]/demo.gif --tape [RUN_DIR]/demo.tape
```
The script runs VHS, validates the output exists, and reports the file size. If the GIF exceeds 10 MB, reduce by adjusting the .tape: smaller terminal dimensions (`Set Width/Height`), shorter recording (fewer sleeps), or lower font size. Re-run.
## Step 4: Quality Check
Read the generated GIF to verify:
1. Commands are visible and readable
2. Output renders completely (not cut off)
3. The feature being demonstrated is clearly shown
4. No secrets, credentials, or sensitive paths are visible
If quality is poor, revise the .tape file and re-record.
**If VHS fails** (crashes, produces empty GIF, or the command being demonstrated fails): fall back to the screenshot reel tier. Write the same commands and expected output as text frames and stitch via silicon + ffmpeg. If silicon is also unavailable, fall back to static screenshots.
Proceed to `references/upload-and-approval.md`.


@@ -0,0 +1,40 @@
# Upload and Approval
Get user approval for the local artifact, upload evidence to a public URL, and generate markdown for PR inclusion.
## Step 1: Local Approval Gate
Before uploading anywhere public, present the local artifact path to the user for approval. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini).
**Question:** "Evidence captured at [RUN_DIR]/[artifact]. Review it locally and decide:"
**Options:**
1. **Looks good, upload for PR** -- proceed to upload
2. **Not good enough, try again** -- return to the tier execution step and re-capture
3. **Skip evidence for this PR** -- set evidence to null and proceed
If the question tool is unavailable (headless/background mode), present the numbered options and wait for the user's reply before proceeding.
## Step 2: Upload to catbox.moe
After the user approves the local artifact, upload the evidence file (GIF or PNG) using the capture pipeline script. Set `ARTIFACT_PATH` to the approved GIF or PNG path:
```bash
python3 scripts/capture-demo.py upload [ARTIFACT_PATH]
```
The script uploads to catbox.moe, validates the response starts with `https://`, and retries once on failure. The last line of output is the public URL (e.g., `https://files.catbox.moe/abc123.gif`).
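A caller can recover the URL by taking the last non-empty line of stdout. A minimal parsing sketch (the `parse_upload_url` helper name is hypothetical):

```python
def parse_upload_url(stdout):
    """Extract the public URL printed as the last line of the upload subcommand."""
    lines = [ln for ln in stdout.strip().splitlines() if ln.strip()]
    if not lines or not lines[-1].startswith("https://"):
        raise ValueError(f"no URL in upload output: {stdout!r}")
    return lines[-1]
```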
**If upload fails** after retry, report the failure and the local artifact path. Do not commit evidence files to the repo -- they are ephemeral artifacts, not source material. Tell the user: "Upload failed. Local artifact preserved at [ARTIFACT_PATH]. You can upload it manually or retry later."
For multiple files (static screenshots tier), upload each file separately.
## Step 3: Return Output
Return the structured output defined in the SKILL.md Output section: `Tier`, `Description`, and `URL`. The caller formats the evidence into the PR description. ce-demo-reel does not generate markdown.
## Step 4: Cleanup
Remove the `[RUN_DIR]` scratch directory and all temporary files. Preserve nothing -- the evidence lives at the public URL now.
If the upload failed and the user has not yet manually uploaded, preserve `[RUN_DIR]` so the artifact is still accessible.


@@ -0,0 +1,653 @@
#!/usr/bin/env python3
"""
Evidence capture pipeline — deterministic helpers for the demo-reel skill.
Subcommands:
preflight Check tool availability (JSON output)
detect [--repo-root PATH] Detect project type from manifests (JSON output)
recommend --project-type T --change-type T --tools JSON Recommend capture tier (JSON output)
stitch [--duration N] OUTPUT FRAME [FRAME ...] Stitch frames into animated GIF
screenshot-reel --output OUT [--duration N] [--lang L] [--theme T] --text F [F ...] Render text frames via silicon + stitch
terminal-recording --output OUT --tape TAPE Run VHS tape file
upload FILE Upload to catbox.moe (retries once)
"""
import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import time
from pathlib import Path
# --- Config ---
MAX_GIF_SIZE = 10 * 1024 * 1024 # 10 MB — GitHub inline render limit
TARGET_GIF_SIZE = 5 * 1024 * 1024 # 5 MB — preferred target
CATBOX_API = "https://catbox.moe/user/api.php"
# --- Helpers ---
def die(msg):
print(f"ERROR: {msg}", file=sys.stderr)
sys.exit(1)
def check_tool(name):
return shutil.which(name) is not None
def run_cmd(cmd, timeout=120):
try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout, check=False)
except subprocess.TimeoutExpired:
print(f"ERROR: Command timed out after {timeout}s: {' '.join(cmd)}", file=sys.stderr)
return subprocess.CompletedProcess(cmd, returncode=1, stdout="", stderr=f"Timed out after {timeout}s")
if result.returncode != 0:
print(f"ERROR: Command failed (exit {result.returncode}): {' '.join(cmd)}", file=sys.stderr)
if result.stderr:
print(result.stderr.strip(), file=sys.stderr)
return result
def file_size_mb(path):
return Path(path).stat().st_size / (1024 * 1024)
# --- Preflight ---
def cmd_preflight(_args):
tools = {
"agent_browser": check_tool("agent-browser"),
"vhs": check_tool("vhs"),
"silicon": check_tool("silicon"),
"ffmpeg": check_tool("ffmpeg"),
"ffprobe": check_tool("ffprobe"),
}
print(json.dumps(tools))
# --- Detect ---
ELECTRON_DEPS = {"electron", "electron-builder", "electron-forge", "electron-vite", "electron-packager"}
WEB_NODE_DEPS = {
"react", "vue", "svelte", "astro", "next", "nuxt", "@angular/core", "solid-js",
"@remix-run/react", "gatsby", "express", "fastify", "koa", "hono", "@hono/node-server",
}
WEB_RUBY_DEPS = {"rails", "sinatra", "hanami", "roda"}
WEB_GO_DEPS = {
"github.com/gin-gonic/gin", "github.com/labstack/echo", "github.com/gofiber/fiber",
"github.com/go-chi/chi", "github.com/gorilla/mux",
}
# Note: net/http is stdlib and won't appear in go.mod. The agent detects stdlib web
# servers from source imports in the diff and overrides the classification (Step 2).
WEB_PYTHON_DEPS = {"flask", "django", "fastapi", "starlette", "tornado", "sanic", "litestar"}
WEB_RUST_DEPS = {"actix-web", "axum", "rocket", "warp", "poem", "tide"}
CLI_RUBY_DEPS = {"thor", "gli", "dry-cli"}
CLI_PYTHON_DEPS = {"click", "typer", "argparse"}
def _read_file(path):
try:
return Path(path).read_text(encoding="utf-8", errors="replace")
except (OSError, IOError):
return None
def _has_any_dep(pkg_json, dep_names):
deps = set(pkg_json.get("dependencies", {}).keys())
dev_deps = set(pkg_json.get("devDependencies", {}).keys())
all_deps = deps | dev_deps
return bool(all_deps & dep_names)
def _detect_project_type(repo_root):
root = Path(repo_root)
# Try package.json first (used by multiple checks)
pkg_json = None
pkg_text = _read_file(root / "package.json")
if pkg_text:
try:
pkg_json = json.loads(pkg_text)
except json.JSONDecodeError:
pass
# 1. Desktop app (Electron)
if pkg_json and _has_any_dep(pkg_json, ELECTRON_DEPS):
return {"type": "desktop-app", "reason": "package.json contains Electron dependency"}
# 2. Web app
if pkg_json and _has_any_dep(pkg_json, WEB_NODE_DEPS):
return {"type": "web-app", "reason": "package.json contains web framework dependency"}
# Note: vite alone is not a web signal (it could bundle anything). A vite project
# is classified as web-app only via the framework-dependency check above.
gemfile = _read_file(root / "Gemfile")
if gemfile:
for dep in WEB_RUBY_DEPS:
if dep in gemfile:
return {"type": "web-app", "reason": f"Gemfile contains {dep}"}
go_mod = _read_file(root / "go.mod")
if go_mod:
for dep in WEB_GO_DEPS:
if dep in go_mod:
return {"type": "web-app", "reason": f"go.mod contains {dep}"}
for pyfile in ["pyproject.toml", "requirements.txt"]:
content = _read_file(root / pyfile)
if content:
for dep in WEB_PYTHON_DEPS:
if dep in content:
return {"type": "web-app", "reason": f"{pyfile} contains {dep}"}
cargo = _read_file(root / "Cargo.toml")
if cargo:
for dep in WEB_RUST_DEPS:
if dep in cargo:
return {"type": "web-app", "reason": f"Cargo.toml contains {dep}"}
# 3. CLI tool
if pkg_json:
if "bin" in pkg_json:
return {"type": "cli-tool", "reason": "package.json has bin field"}
if (root / "bin").is_dir():
return {"type": "cli-tool", "reason": "bin/ directory exists"}
if go_mod and (root / "cmd").is_dir():
return {"type": "cli-tool", "reason": "go.mod with cmd/ directory"}
if cargo and "[[bin]]" in cargo:
return {"type": "cli-tool", "reason": "Cargo.toml has [[bin]] section"}
pyproject = _read_file(root / "pyproject.toml")
if pyproject:
if "[project.scripts]" in pyproject or "[tool.poetry.scripts]" in pyproject:
return {"type": "cli-tool", "reason": "pyproject.toml has script entry points"}
for dep in CLI_PYTHON_DEPS:
if dep in pyproject:
return {"type": "cli-tool", "reason": f"pyproject.toml contains {dep}"}
if gemfile:
for dep in CLI_RUBY_DEPS:
if dep in gemfile:
return {"type": "cli-tool", "reason": f"Gemfile contains {dep}"}
if (root / "bin").is_dir() or (root / "exe").is_dir():
return {"type": "cli-tool", "reason": "Ruby project with bin/ or exe/ directory"}
if go_mod and (root / "main.go").exists():
return {"type": "cli-tool", "reason": "main.go exists without web framework"}
# 4. Library
manifests = ["package.json", "Gemfile", "go.mod", "Cargo.toml", "pyproject.toml", "setup.py"]
has_manifest = any((root / m).exists() for m in manifests)
if not has_manifest:
# Check for gemspec
has_manifest = bool(list(root.glob("*.gemspec")))
if has_manifest:
return {"type": "library", "reason": "package manifest exists but no web/CLI signals"}
# 5. Text-only
return {"type": "text-only", "reason": "no recognized package manifest"}
def cmd_detect(args):
repo_root = args.repo_root or os.getcwd()
result = _detect_project_type(repo_root)
print(json.dumps(result))
# --- Recommend ---
def _recommend_tier(project_type, change_type, tools):
has_browser = tools.get("agent_browser", False)
has_vhs = tools.get("vhs", False)
has_silicon = tools.get("silicon", False)
has_ffmpeg = tools.get("ffmpeg", False)
has_ffprobe = tools.get("ffprobe", False)
has_stitch = has_ffmpeg and has_ffprobe # stitching requires both
recommended = None
reasoning = ""
if project_type == "web-app":
if has_browser and has_stitch:
recommended = "browser-reel"
reasoning = "Web app with agent-browser and ffmpeg available"
elif has_browser:
recommended = "static-screenshots"
reasoning = "Web app with agent-browser but no ffmpeg/ffprobe for stitching"
else:
recommended = "static-screenshots"
reasoning = "Web app without agent-browser"
elif project_type == "cli-tool":
if change_type == "motion":
if has_vhs:
recommended = "terminal-recording"
reasoning = "CLI tool with motion, VHS available"
elif has_silicon and has_stitch:
recommended = "screenshot-reel"
reasoning = "CLI tool with motion, silicon + ffmpeg available (no VHS)"
else:
recommended = "static-screenshots"
reasoning = "CLI tool with no capture tools available"
else: # states
if has_silicon and has_stitch:
recommended = "screenshot-reel"
reasoning = "CLI tool with discrete states, silicon + ffmpeg available"
elif has_vhs:
recommended = "terminal-recording"
reasoning = "CLI tool with discrete states, VHS available (no silicon)"
else:
recommended = "static-screenshots"
reasoning = "CLI tool with no capture tools available"
elif project_type == "desktop-app":
if has_browser and has_stitch:
recommended = "browser-reel"
reasoning = "Desktop app with agent-browser and ffmpeg (via localhost/CDP)"
else:
recommended = "static-screenshots"
reasoning = "Desktop app without agent-browser"
elif project_type == "library":
recommended = "static-screenshots"
reasoning = "Library projects use static screenshots"
else: # text-only or unknown
recommended = "static-screenshots"
reasoning = "Fallback to static screenshots"
# Build available tiers list
available = []
if has_browser and has_stitch:
available.append("browser-reel")
if has_vhs:
available.append("terminal-recording")
if has_silicon and has_stitch:
available.append("screenshot-reel")
available.append("static-screenshots") # always available
return {
"recommended": recommended,
"available": available,
"reasoning": reasoning,
}
def cmd_recommend(args):
try:
tools = json.loads(args.tools)
except json.JSONDecodeError:
die("--tools must be valid JSON")
result = _recommend_tier(args.project_type, args.change_type, tools)
print(json.dumps(result))
# --- Stitch ---
def _get_frame_dimensions(path):
result = run_cmd([
"ffprobe", "-v", "error", "-select_streams", "v:0",
"-show_entries", "stream=width,height", "-of", "csv=p=0", str(path),
])
if result.returncode != 0:
die(f"ffprobe failed on {path}")
parts = result.stdout.strip().split(",")
return int(parts[0]), int(parts[1])
def _stitch_frames(output, frames, duration=3.0):
if not frames:
die("No input frames provided")
for f in frames:
if not Path(f).exists():
die(f"Frame not found: {f}")
if not check_tool("ffmpeg"):
die("ffmpeg is not installed. Install with: brew install ffmpeg")
if not check_tool("ffprobe"):
die("ffprobe is not installed. Install with: brew install ffmpeg")
print(f"Stitching {len(frames)} frames into GIF ({duration}s per frame)...")
tmpdir = tempfile.mkdtemp(prefix="evidence-stitch-")
try:
# Detect max dimensions
max_w, max_h = 0, 0
for f in frames:
w, h = _get_frame_dimensions(f)
max_w = max(max_w, w)
max_h = max(max_h, h)
# Even dimensions
if max_w % 2 != 0:
max_w += 1
if max_h % 2 != 0:
max_h += 1
print(f" Target dimensions: {max_w}x{max_h}")
# Normalize frames
normalized = []
for i, f in enumerate(frames):
out = os.path.join(tmpdir, f"frame_{i:03d}.png")
result = run_cmd([
"ffmpeg", "-y", "-v", "error", "-i", f,
"-vf", f"scale={max_w}:{max_h}:force_original_aspect_ratio=decrease,"
f"pad={max_w}:{max_h}:(ow-iw)/2:0:color=#0d1117",
out,
])
if result.returncode != 0:
die(f"ffmpeg failed to normalize frame: {f}")
normalized.append(out)
print(f" Normalized {len(normalized)} frames")
# Write concat file
concat_file = os.path.join(tmpdir, "concat.txt")
with open(concat_file, "w") as fh:
for f in normalized:
fh.write(f"file '{os.path.basename(f)}'\n")
fh.write(f"duration {duration}\n")
# Last file repeated without duration (concat demuxer requirement)
fh.write(f"file '{os.path.basename(normalized[-1])}'\n")
# Two-pass palette generation
palette = os.path.join(tmpdir, "palette.png")
result = run_cmd([
"ffmpeg", "-y", "-v", "error",
"-f", "concat", "-safe", "0", "-i", concat_file,
"-vf", "palettegen=stats_mode=diff",
palette,
])
if result.returncode != 0:
die("ffmpeg palette generation failed")
# Generate GIF with palette
result = run_cmd([
"ffmpeg", "-y", "-v", "error",
"-f", "concat", "-safe", "0", "-i", concat_file,
"-i", palette,
"-lavfi", "paletteuse=dither=bayer:bayer_scale=3",
"-loop", "0",
output,
])
if result.returncode != 0:
die("ffmpeg GIF encoding failed")
if not Path(output).exists():
die("GIF creation failed: no output file")
size = Path(output).stat().st_size
size_mb = size / (1024 * 1024)
print(f" Created: {output} ({size_mb:.1f} MB, {len(frames)} frames)")
# Auto-reduce if over limit
if size > MAX_GIF_SIZE:
print(" GIF exceeds 10 MB limit. Reducing...")
if len(frames) > 2:
print(" Dropping middle frame(s) and re-stitching...")
reduced = [frames[0]]
step = max(2, (len(frames) - 1) // 2)
for j in range(step, len(frames) - 1, step):
reduced.append(frames[j])
reduced.append(frames[-1])
if len(reduced) < len(frames):
print(f" Reduced from {len(frames)} to {len(reduced)} frames")
shutil.rmtree(tmpdir, ignore_errors=True)
_stitch_frames(output, reduced, duration)
return
print(" WARNING: Could not reduce below 10 MB. GIF may not render inline on GitHub.")
elif size > TARGET_GIF_SIZE:
print(" Note: GIF is over 5 MB preferred target but under 10 MB limit. Acceptable.")
finally:
shutil.rmtree(tmpdir, ignore_errors=True)
def cmd_stitch(args):
_stitch_frames(args.output, args.frames, args.duration)
# --- Screenshot Reel ---
def cmd_screenshot_reel(args):
if not check_tool("silicon"):
die("silicon is not installed. Install with: brew install silicon")
if not check_tool("ffmpeg"):
die("ffmpeg is not installed. Install with: brew install ffmpeg")
tmpdir = tempfile.mkdtemp(prefix="evidence-reel-")
try:
frame_pngs = []
for i, text_file in enumerate(args.text):
if not Path(text_file).exists():
die(f"Text file not found: {text_file}")
out_png = os.path.join(tmpdir, f"frame_{i:03d}.png")
result = run_cmd([
"silicon", text_file,
"-o", out_png,
"--theme", args.theme,
"-l", args.lang,
"--pad-horiz", "20",
"--pad-vert", "40",
"--no-line-number",
"--no-round-corner",
"--background", args.background,
])
if result.returncode != 0 or not Path(out_png).exists():
die(f"silicon failed to render {text_file}")
frame_pngs.append(out_png)
print(f"Rendered {len(frame_pngs)} frames via silicon")
_stitch_frames(args.output, frame_pngs, args.duration)
finally:
shutil.rmtree(tmpdir, ignore_errors=True)
# --- Terminal Recording ---
def cmd_terminal_recording(args):
if not check_tool("vhs"):
die("vhs is not installed. Install with: brew install charmbracelet/tap/vhs")
tape_path = args.tape
if not Path(tape_path).exists():
die(f"Tape file not found: {tape_path}")
# Parse Output directive from tape file
output_path = args.output
tape_content = Path(tape_path).read_text()
tape_has_output = False
for line in tape_content.splitlines():
stripped = line.strip()
if stripped.startswith("Output "):
tape_has_output = True
if not output_path:
output_path = stripped.split(None, 1)[1].strip().strip('"').strip("'")
break
if not output_path:
die("No output path: use --output or set Output in the tape file")
# If an output path is set and the tape has an Output directive, rewrite the
# tape to a temp copy (a no-op rewrite when the paths already match)
actual_tape = tape_path
tmp_tape = None
if output_path and tape_has_output:
# Rewrite the Output line to use the requested path
lines = tape_content.splitlines()
rewritten = []
for line in lines:
if line.strip().startswith("Output "):
rewritten.append(f'Output "{output_path}"')
else:
rewritten.append(line)
fd, tmp_tape = tempfile.mkstemp(suffix=".tape", prefix="vhs-")
os.close(fd)
Path(tmp_tape).write_text("\n".join(rewritten) + "\n")
actual_tape = tmp_tape
elif output_path and not tape_has_output:
# No Output in tape — prepend one
fd, tmp_tape = tempfile.mkstemp(suffix=".tape", prefix="vhs-")
os.close(fd)
Path(tmp_tape).write_text(f'Output "{output_path}"\n{tape_content}')
actual_tape = tmp_tape
print(f"Running VHS tape: {tape_path}")
result = run_cmd(["vhs", actual_tape], timeout=300)
if tmp_tape and Path(tmp_tape).exists():
Path(tmp_tape).unlink()
if result.returncode != 0:
die(f"VHS failed (exit {result.returncode})")
if not Path(output_path).exists():
die(f"VHS produced no output at {output_path}")
size = Path(output_path).stat().st_size
size_mb = size / (1024 * 1024)
print(f"Recording: {output_path} ({size_mb:.1f} MB)")
print(json.dumps({"gif_path": str(output_path), "size_mb": round(size_mb, 1)}))
# --- Upload ---
def cmd_upload(args):
file_path = args.file
if not Path(file_path).exists():
die(f"File not found: {file_path}")
if not check_tool("curl"):
die("curl is not installed")
size_mb = file_size_mb(file_path)
print(f"Uploading {file_path} ({size_mb:.1f} MB) to catbox.moe...")
def _try_upload():
try:
result = subprocess.run(
["curl", "-s", "--connect-timeout", "10",
"-F", "reqtype=fileupload",
"-F", f"fileToUpload=@{file_path}", CATBOX_API],
capture_output=True, text=True, timeout=30, check=False,
)
return result.stdout.strip()
except subprocess.TimeoutExpired:
print("ERROR: Upload timed out after 30s", file=sys.stderr)
return ""
url = _try_upload()
if url.startswith("https://"):
print(f"Uploaded: {url}")
print(url)
return
print(f"ERROR: Upload failed. Response: {url[:200]}", file=sys.stderr)
print(f"Local file preserved at: {file_path}", file=sys.stderr)
# Retry once
print("Retrying in 2 seconds...", file=sys.stderr)
time.sleep(2)
url = _try_upload()
if url.startswith("https://"):
print(f"Uploaded (retry): {url}")
print(url)
else:
print("ERROR: Retry also failed. Local file preserved; upload it manually or retry later.", file=sys.stderr)
sys.exit(1)
# --- Main ---
def main():
parser = argparse.ArgumentParser(
description="Evidence capture pipeline",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Commands:
preflight Check tool availability (JSON)
detect [--repo-root PATH] Detect project type (JSON)
recommend --project-type T ... Recommend capture tier (JSON)
stitch [--duration N] OUTPUT FRAMES Stitch frames into animated GIF
screenshot-reel --output O --text F Render text via silicon + stitch
terminal-recording --output O --tape T Run VHS tape
upload FILE Upload to catbox.moe
""",
)
sub = parser.add_subparsers(dest="command")
# preflight
sub.add_parser("preflight", help="Check tool availability")
# detect
p_detect = sub.add_parser("detect", help="Detect project type")
p_detect.add_argument("--repo-root", help="Repository root (default: cwd)")
# recommend
p_rec = sub.add_parser("recommend", help="Recommend capture tier")
p_rec.add_argument("--project-type", required=True,
choices=["web-app", "cli-tool", "library", "desktop-app", "text-only"])
p_rec.add_argument("--change-type", required=True, choices=["motion", "states"])
p_rec.add_argument("--tools", required=True, help="JSON object of tool availability")
# stitch
p_stitch = sub.add_parser("stitch", help="Stitch frames into animated GIF")
p_stitch.add_argument("--duration", type=float, default=3.0, help="Seconds per frame")
p_stitch.add_argument("output", help="Output GIF path")
p_stitch.add_argument("frames", nargs="+", help="Input frame PNGs")
# screenshot-reel
p_reel = sub.add_parser("screenshot-reel", help="Render text frames via silicon + stitch")
p_reel.add_argument("--output", required=True, help="Output GIF path")
p_reel.add_argument("--duration", type=float, default=2.5, help="Seconds per frame")
p_reel.add_argument("--lang", default="bash", help="Language for syntax highlighting")
p_reel.add_argument("--theme", default="Dracula", help="Silicon theme")
p_reel.add_argument("--background", default="#0d1117", help="Background color for frame border")
p_reel.add_argument("--text", nargs="+", required=True, help="Text files (one per frame)")
# terminal-recording
p_term = sub.add_parser("terminal-recording", help="Run VHS tape file")
p_term.add_argument("--output", help="Output GIF path (overrides tape Output directive)")
p_term.add_argument("--tape", required=True, help="VHS tape file path")
# upload
p_upload = sub.add_parser("upload", help="Upload to catbox.moe")
p_upload.add_argument("file", help="File to upload")
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
dispatch = {
"preflight": cmd_preflight,
"detect": cmd_detect,
"recommend": cmd_recommend,
"stitch": cmd_stitch,
"screenshot-reel": cmd_screenshot_reel,
"terminal-recording": cmd_terminal_recording,
"upload": cmd_upload,
}
dispatch[args.command](args)
if __name__ == "__main__":
main()


@@ -50,35 +50,11 @@ This file contains the shipping workflow (Phase 3-4). Load it only when all Phas
## Phase 4: Ship It
1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
1. **Prepare Evidence Context**
For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
Do not invoke `ce-demo-reel` directly in this step. Evidence capture belongs to the PR creation or PR description update flow, where the final PR diff and description context are available.
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
Note whether the completed work has observable behavior (UI rendering, CLI output, API/library behavior with a runnable example, generated artifacts, or workflow output). The `git-commit-push-pr` skill will ask whether to capture evidence only when evidence is possible.
2. **Update Plan Status**
@@ -94,7 +70,7 @@ This file contains the shipping workflow (Phase 3-4). Load it only when all Phas
When providing context for the PR description, include:
- The plan's summary and key decisions
- Testing notes (tests added/modified, manual testing performed)
- Screenshot URLs from step 1 (if applicable)
- Evidence context from step 1, so `git-commit-push-pr` can decide whether to ask about capturing evidence
- Figma design link (if applicable)
- The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
@@ -116,11 +92,11 @@ Before creating PR, verify:
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Evidence decision handled by `git-commit-push-pr` when the change has observable behavior
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] Code review completed (inline self-review or full `ce:review`)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes summary, testing notes, and evidence when captured
- [ ] PR description includes Compound Engineered badge with accurate model and harness
## Code Review Tiers

View File

@@ -1,382 +0,0 @@
---
name: feature-video
description: Record a video walkthrough of a feature and add it to the PR description. Use when a PR needs a visual demo for reviewers, when the user asks to demo a feature, create a PR video, record a walkthrough, show what changed visually, or add a video to a pull request.
argument-hint: "[PR number or 'current' or path/to/video.mp4] [optional: base URL, default localhost:3000]"
---
# Feature Video Walkthrough
Record browser interactions demonstrating a feature, stitch screenshots into an MP4 video, upload natively to GitHub, and embed in the PR description as an inline video player.
## Prerequisites
- Local development server running (e.g., `bin/dev`, `npm run dev`, `rails server`)
- `agent-browser` CLI installed (load the `agent-browser` skill for details)
- `ffmpeg` installed (for video conversion)
- `gh` CLI authenticated with push access to the repo
- Git repository on a feature branch (PR optional -- skill can create a draft or record-only)
- One-time GitHub browser auth (see Step 6 auth check)
## Main Tasks
### 1. Parse Arguments & Resolve PR
**Arguments:** $ARGUMENTS
Parse the input:
- First argument: PR number, "current" (defaults to current branch's PR), or path to an existing `.mp4` file (upload-only resume mode)
- Second argument: Base URL (defaults to `http://localhost:3000`)
**Upload-only resume:** If the first argument ends in `.mp4` and the file exists, skip Steps 2-5 and proceed directly to Step 6 using that file. Resolve the PR number from the current branch (`gh pr view --json number -q '.number'`).
If an explicit PR number was provided, verify it exists and use it directly:
```bash
gh pr view [number] --json number -q '.number'
```
If no explicit PR number was provided (or "current" was specified), check if a PR exists for the current branch:
```bash
gh pr view --json number -q '.number'
```
If no PR exists for the current branch, ask the user how to proceed. **Use the platform's blocking question tool** (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini):
```
No PR found for the current branch.
1. Create a draft PR now and continue (recommended)
2. Record video only -- save locally and upload later when a PR exists
3. Cancel
```
If option 1: create a draft PR with a placeholder title derived from the branch name, then continue with the new PR number:
```bash
gh pr create --draft --title "[branch-name-humanized]" --body "Draft PR for video walkthrough"
```
If option 2: set `RECORD_ONLY=true`. Proceed through Steps 2-5 (record and encode), skip Steps 6-7 (upload and PR update), and report the local video path and `[RUN_ID]` at the end.
**Upload-only resume example:** `/feature-video .context/compound-engineering/feature-video/1711234567/videos/feature-demo.mp4` uploads a previously recorded video, skipping Steps 2-5 as described above.
### 1b. Verify Required Tools
Before proceeding, check that required CLI tools are installed. Fail early with a clear message rather than failing mid-workflow after screenshots have been recorded:
```bash
command -v ffmpeg
```
```bash
command -v agent-browser
```
```bash
command -v gh
```
If any tool is missing, stop and report which tools need to be installed:
- `ffmpeg`: `brew install ffmpeg` (macOS) or equivalent
- `agent-browser`: load the `agent-browser` skill for installation instructions
- `gh`: `brew install gh` (macOS) or see https://cli.github.com
Do not proceed to Step 2 until all tools are available.
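The three checks above can be combined into one preflight pass that reports every missing tool at once instead of failing on the first. A minimal sketch (the tool list mirrors the prerequisites; nothing else is assumed):

```shell
# Check all required tools in one pass and report every missing one
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "Missing required tools:$missing" >&2
    return 1
  fi
}

check_tools ffmpeg agent-browser gh || echo "Install the tools listed above before continuing."
```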
### 2. Gather Feature Context
**If a PR is available**, get PR details and changed files:
```bash
gh pr view [number] --json title,body,files,headRefName -q '.'
```
```bash
gh pr view [number] --json files -q '.files[].path'
```
**If in record-only mode (no PR)**, detect the default branch and derive context from the branch diff. Run both commands in a single block so the variable persists:
```bash
DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef -q '.defaultBranchRef.name') && git diff --name-only "$DEFAULT_BRANCH"...HEAD && git log --oneline "$DEFAULT_BRANCH"...HEAD
```
Map changed files to routes/pages that should be demonstrated. Examine the project's routing configuration (e.g., `routes.rb`, `next.config.js`, `app/` directory structure) to determine which URLs correspond to the changed files.
### 3. Plan the Video Flow
Before recording, create a shot list:
1. **Opening shot**: Homepage or starting point (2-3 seconds)
2. **Navigation**: How user gets to the feature
3. **Feature demonstration**: Core functionality (main focus)
4. **Edge cases**: Error states, validation, etc. (if applicable)
5. **Success state**: Completed action/result
Present the proposed flow to the user for confirmation before recording.
**Use the platform's blocking question tool when available** (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Otherwise, present numbered options and wait for the user's reply before proceeding:
```
Proposed Video Flow for PR #[number]: [title]
1. Start at: /[starting-route]
2. Navigate to: /[feature-route]
3. Demonstrate:
- [Action 1]
- [Action 2]
- [Action 3]
4. Show result: [success state]
Estimated duration: ~[X] seconds
1. Start recording
2. Modify the flow (describe changes)
3. Add specific interactions to demonstrate
```
### 4. Record the Walkthrough
Generate a unique run ID (e.g., timestamp) and create per-run output directories. This prevents stale screenshots from prior runs from being spliced into the new video.
**Important:** Shell variables do not persist across separate code blocks. After generating the run ID, substitute the concrete value into all subsequent commands in this workflow. For example, if the timestamp is `1711234567`, use that literal value in all paths below -- do not rely on `[RUN_ID]` expanding in later blocks.
```bash
date +%s
```
Use the output as RUN_ID. Create the directories with the concrete value:
```bash
mkdir -p .context/compound-engineering/feature-video/[RUN_ID]/screenshots
mkdir -p .context/compound-engineering/feature-video/[RUN_ID]/videos
```
Execute the planned flow, capturing each step with agent-browser. Number screenshots sequentially for correct frame ordering:
```bash
agent-browser open "[base-url]/[start-route]"
agent-browser wait 2000
agent-browser screenshot .context/compound-engineering/feature-video/[RUN_ID]/screenshots/01-start.png
```
```bash
agent-browser snapshot -i
agent-browser click @e1
agent-browser wait 1000
agent-browser screenshot .context/compound-engineering/feature-video/[RUN_ID]/screenshots/02-navigate.png
```
```bash
agent-browser snapshot -i
agent-browser click @e2
agent-browser wait 1000
agent-browser screenshot .context/compound-engineering/feature-video/[RUN_ID]/screenshots/03-feature.png
```
```bash
agent-browser wait 2000
agent-browser screenshot .context/compound-engineering/feature-video/[RUN_ID]/screenshots/04-result.png
```
### 5. Create Video
Stitch screenshots into an MP4 using the same `[RUN_ID]` from Step 4:
```bash
ffmpeg -y -framerate 0.5 -pattern_type glob -i ".context/compound-engineering/feature-video/[RUN_ID]/screenshots/*.png" \
-c:v libx264 -pix_fmt yuv420p -vf "scale=1280:-2" \
".context/compound-engineering/feature-video/[RUN_ID]/videos/feature-demo.mp4"
```
Notes:
- `-framerate 0.5` = 2 seconds per frame. Adjust for faster/slower playback.
- `-2` in scale ensures height is divisible by 2 (required for H.264).
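If a different pacing is needed, the `-framerate` value is simply the reciprocal of the desired seconds per frame. A quick way to compute it (assumes only `awk`, which is available on any POSIX system):

```shell
# 2.5 seconds per frame -> -framerate 0.4
secs_per_frame=2.5
framerate=$(awk -v s="$secs_per_frame" 'BEGIN { printf "%.4f", 1/s }')
printf '%s\n' "-framerate $framerate"
```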
### 6. Authenticate & Upload to GitHub
Upload produces a `user-attachments/assets/` URL that GitHub renders as a native inline video player -- the same result as pasting a video into the PR editor manually.
The approach: close any existing agent-browser session, start a Chrome-engine session with saved GitHub auth, navigate to the PR page, set the video file on the comment form's hidden file input, wait for GitHub to process the upload, extract the resulting URL, then clear the textarea without submitting.
#### Check for existing session
First, check if a saved GitHub session already exists:
```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
```
If the page title contains the user's GitHub username or "Profile", the session is still valid -- skip to "Upload the video" below. If it redirects to the login page, the session has expired or was never created -- proceed to "Auth setup".
#### Auth setup (one-time)
Establish an authenticated GitHub session. This only needs to happen once -- session cookies persist across runs via the `--session-name` flag.
Close the current session and open the GitHub login page in a headed Chrome window:
```bash
agent-browser close
agent-browser --engine chrome --headed --session-name github open https://github.com/login
```
The user must log in manually in the browser window (handles 2FA, SSO, OAuth -- any login method). **Use the platform's blocking question tool** (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Otherwise, present the message and wait for the user's reply before proceeding:
```
GitHub login required for video upload.
A Chrome window has opened to github.com/login. Please log in manually
(this handles 2FA/SSO/OAuth automatically). Reply when done.
```
After login, verify the session works:
```bash
agent-browser open https://github.com/settings/profile
```
If the profile page loads, auth is confirmed. The `github` session is now saved and reusable.
#### Upload the video
Navigate to the PR page and scroll to the comment form:
```bash
agent-browser open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000
```
Save any existing textarea content before uploading (the comment box may contain an unsent draft):
```bash
agent-browser eval "document.getElementById('new_comment_field').value"
```
Store this value as `SAVED_TEXTAREA`. If non-empty, it will be restored after extracting the upload URL.
Upload the video via the hidden file input. Use the caller-provided `.mp4` path if in upload-only resume mode, otherwise use the current run's encoded video:
```bash
agent-browser upload '#fc-new_comment_field' [VIDEO_FILE_PATH]
```
Where `[VIDEO_FILE_PATH]` is either:
- The `.mp4` path passed as the first argument (upload-only resume mode)
- `.context/compound-engineering/feature-video/[RUN_ID]/videos/feature-demo.mp4` (normal recording flow)
Wait for GitHub to process the upload (typically 3-5 seconds), then read the textarea value:
```bash
agent-browser wait 5000
agent-browser eval "document.getElementById('new_comment_field').value"
```
**Validate the extracted URL.** The value must contain `user-attachments/assets/` to confirm a successful native upload. If the textarea is empty, contains only placeholder text, or the URL does not match, do not proceed to Step 7. Instead:
1. Check `agent-browser get url` -- if it shows `github.com/login`, the session expired. Re-run auth setup.
2. If still on the PR page, wait an additional 5 seconds and re-read the textarea (GitHub processing can be slow).
3. If validation still fails after retry, report the failure and the local video path so the user can upload manually.
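The URL validation above boils down to one substring test. A small sketch of the predicate; the retry loop around it assumes `agent-browser` is installed and a PR page is open, so it is shown only as a comment:

```shell
# Predicate: does the textarea value contain a successful native-upload URL?
has_upload_url() {
  case "$1" in
    *user-attachments/assets/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Retry sketch (requires an authenticated agent-browser session on the PR page):
#   for i in 1 2 3; do
#     val=$(agent-browser eval "document.getElementById('new_comment_field').value")
#     has_upload_url "$val" && break
#     sleep 5
#   done
```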
Restore the original textarea content (or clear if it was empty). A JSON-encoded string is also a valid JavaScript string literal, so assign it directly without `JSON.parse`:
```bash
agent-browser eval "const ta = document.getElementById('new_comment_field'); ta.value = [SAVED_TEXTAREA_AS_JS_STRING]; ta.dispatchEvent(new Event('input', { bubbles: true }))"
```
To prepare the value: take the SAVED_TEXTAREA string and produce a JS string literal from it -- escape backslashes, double quotes, and newlines (e.g., `"text with \"quotes\" and\nnewlines"`). If SAVED_TEXTAREA was empty, use `""`. The result is embedded directly as the right-hand side of the assignment -- no `JSON.parse` call needed.
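Doing that escaping by hand is error-prone. If `jq` happens to be available (an assumption; it is not listed in the prerequisites), `jq -Rs .` produces the JSON-encoded string, which is also a valid JS string literal, in one step:

```shell
# JSON-encode the saved textarea content; the result doubles as a JS string literal
SAVED_TEXTAREA='draft with "quotes" and
a newline'
JS_LITERAL=$(printf '%s' "$SAVED_TEXTAREA" | jq -Rs .)
printf '%s\n' "$JS_LITERAL"   # "draft with \"quotes\" and\na newline"
```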
### 7. Update PR Description
Get the current PR body:
```bash
gh pr view [number] --json body -q '.body'
```
Append a Demo section (or replace an existing one). The video URL renders as an inline player when placed on its own line:
```markdown
## Demo
https://github.com/user-attachments/assets/[uuid]
*Automated video walkthrough*
```
Update the PR:
```bash
gh pr edit [number] --body "[updated body with demo section]"
```
### 8. Cleanup
Ask the user before removing temporary files. If confirmed, clean up only the current run's scratch directory (other runs may still be in progress or awaiting upload).
**If the video was successfully uploaded**, remove the entire run directory:
```bash
rm -r .context/compound-engineering/feature-video/[RUN_ID]
```
**If in record-only mode or upload failed**, remove only the screenshots but preserve the video so the user can upload later:
```bash
rm -r .context/compound-engineering/feature-video/[RUN_ID]/screenshots
```
Present a completion summary:
```
Feature Video Complete
PR: #[number] - [title]
Video: [VIDEO_URL]
Shots captured:
1. [description]
2. [description]
3. [description]
4. [description]
PR description updated with demo section.
```
## Usage Examples
```bash
# Record video for current branch's PR
/feature-video
# Record video for specific PR
/feature-video 847
# Record with custom base URL
/feature-video 847 http://localhost:5000
# Record for staging environment
/feature-video current https://staging.example.com
```
## Tips
- Keep it short: 10-30 seconds is ideal for PR demos
- Focus on the change: don't include unrelated UI
- Show before/after: if fixing a bug, show the broken state first (if possible)
- The `--session-name github` session expires when GitHub invalidates the cookies (typically weeks). If upload fails with a login redirect, re-run the auth setup.
- GitHub DOM selectors (`#fc-new_comment_field`, `#new_comment_field`) may change if GitHub updates its UI. If the upload silently fails, inspect the PR page for updated selectors.
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| `ffmpeg: command not found` | ffmpeg not installed | Install via `brew install ffmpeg` (macOS) or equivalent |
| `agent-browser: command not found` | agent-browser not installed | Load the `agent-browser` skill for installation instructions |
| Textarea empty after upload wait | Session expired, or GitHub processing slow | Check session validity (Step 6 auth check). If valid, increase wait time and retry. |
| Textarea empty, URL is `github.com/login` | Session expired | Re-run auth setup (Step 6) |
| `gh pr view` fails | No PR for current branch | Step 1 handles this -- choose to create a draft PR or record-only mode |
| Video file too large for upload | Exceeds GitHub's 10MB (free) or 100MB (paid) limit | Re-encode: lower framerate (`-framerate 0.33`), reduce resolution (`scale=960:-2`), or increase CRF (`-crf 28`) |
| Upload URL does not contain `user-attachments/assets/` | Wrong upload method or GitHub change | Verify the file input selector is still correct by inspecting the PR page |

View File

@@ -71,9 +71,16 @@ Read the current PR description:
gh pr view --json body --jq '.body'
```
Follow the "Detect the base branch and remote" and "Gather the branch scope" sections of Step 6 to get the full branch diff. Use the PR found in DU-2 as the existing PR for base branch detection. Classify commits per the "Classify commits before writing" section -- this is especially important for description updates, where the recent commits that prompted the update are often fix-up work (code review fixes, lint fixes) rather than feature work. Then write a new description following the writing principles in Step 6, driven by the feature commits and the final diff. If the user provided a focus, incorporate it into the description alongside the branch diff context.
Build the updated description in this order:
Compare the new description against the current one and summarize the substantial changes for the user (e.g., "Added coverage of the new caching layer, updated test plan, removed outdated migration notes"). If the user provided a focus, confirm it was addressed. Ask the user to confirm before applying. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the summary and wait for the user's reply.
1. **Get the full branch diff** -- follow "Detect the base branch and remote" and "Gather the branch scope" in Step 6. Use the PR found in DU-2 as the existing PR for base branch detection.
2. **Classify commits** -- follow "Classify commits before writing" in Step 6. This matters especially for description updates, where the recent commits that prompted the update are often fix-up work (code review fixes, lint fixes) rather than feature work.
3. **Decide on evidence** -- check if the current PR description already contains evidence (a `## Demo` or `## Screenshots` section with image embeds). If evidence exists, preserve it unless the user's focus specifically asks to refresh or remove it. If no evidence exists, follow "Evidence for PR descriptions" in Step 6. Description-only updates may be specifically intended to add evidence.
4. **Write the new description** -- follow the writing principles in Step 6, driven by feature commits, final diff, and evidence decision.
- If the user provided a focus, incorporate it alongside the branch diff context.
5. **Compare and confirm** -- summarize the substantial changes vs the current description (e.g., "Added coverage of the new caching layer, updated test plan, removed outdated migration notes").
- If the user provided a focus, confirm it was addressed.
- Ask the user to confirm before applying. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the summary and wait for the user's reply.
If confirmed, apply:
@@ -107,19 +114,27 @@ If the current branch from the context above is empty, the repository is in deta
- If the user agrees, derive a descriptive branch name from the change content, create it with `git checkout -b <branch-name>`, then run `git branch --show-current` again and use that result as the current branch name for the rest of the workflow.
- If the user declines, stop.
If the git status from the context above shows a clean working tree (no staged, modified, or untracked files), check whether there are unpushed commits or a missing PR before stopping. The current branch and existing PR check are already available from the context above. Additionally:
If the git status from the context above shows a clean working tree (no staged, modified, or untracked files), determine the next action based on upstream state and PR status. The current branch and existing PR check are already available from the context above. Additionally:
1. Run `git rev-parse --abbrev-ref --symbolic-full-name @{u}` to check whether an upstream is configured.
2. If the command succeeds, run `git log <upstream>..HEAD --oneline` using the upstream name from the previous command.
- If the current branch is `main`, `master`, or the resolved default branch from Step 1 and there is **no upstream** or there are **unpushed commits**, explain that pushing now would use the default branch directly. Ask whether to create a feature branch first. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the options and wait for the user's reply.
- If the user agrees, derive a descriptive branch name from the change content, create it with `git checkout -b <branch-name>`, then continue from Step 5 (push).
- If the user declines, report that this workflow cannot open a PR from the default branch directly and stop.
- If there is **no upstream**, treat the branch as needing its first push. Skip Step 4 (commit) and continue from Step 5 (push).
- If there are **unpushed commits**, skip Step 4 (commit) and continue from Step 5 (push).
- If all commits are pushed but **no open PR exists** and the current branch is `main`, `master`, or the resolved default branch from Step 1, report that there is no feature branch work to open as a PR and stop.
- If all commits are pushed but **no open PR exists**, skip Steps 4-5 and continue from Step 6 (write the PR description) and Step 7 (create the PR).
- If all commits are pushed **and an open PR exists**, report that and stop -- there is nothing to do.
Then follow this decision tree:
- **On default branch** (`main`, `master`, or the resolved default branch) with no upstream or unpushed commits:
- Ask whether to create a feature branch first (pushing the default branch directly is not supported by this workflow). Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the options and wait for the user's reply.
- If yes -> create branch with `git checkout -b <branch-name>`, continue from Step 5 (push)
- If no -> stop
- **On default branch**, all commits pushed, no open PR:
- Report there is no feature branch work to open as a PR. Stop.
- **No upstream configured** (feature branch, never pushed):
- Skip Step 4 (commit), continue from Step 5 (push)
- **Unpushed commits exist** (feature branch, upstream configured):
- Skip Step 4 (commit), continue from Step 5 (push)
- **All commits pushed, no open PR** (feature branch):
- Skip Steps 4-5, continue from Step 6 (PR description) and Step 7 (create PR)
- **All commits pushed, open PR exists**:
- Report that everything is up to date. Stop.
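The two probe commands feeding this decision tree can be run together. A sketch of the gathering step (run inside the repo; nothing beyond `git` is assumed, and the branching on the result stays with the tree above):

```shell
# Gather the inputs for the clean-tree decision tree
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}' 2>/dev/null || true)
if [ -n "$upstream" ]; then
  unpushed=$(git log "$upstream"..HEAD --oneline | wc -l | tr -d ' ')
else
  unpushed="n/a"   # no upstream configured: the branch needs its first push
fi
echo "upstream=${upstream:-none} unpushed=$unpushed"
```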
### Step 2: Determine conventions
@@ -204,6 +219,27 @@ MERGE_BASE=$(git merge-base <base-remote>/<base-branch> HEAD) && echo "MERGE_BAS
Use the full branch diff and commit list as the basis for the PR description -- not the working-tree diff from Step 1.
#### Evidence for PR descriptions
Decide whether evidence capture is possible from the full branch diff before writing the PR description.
**Evidence is possible** when the final diff changes observable product behavior that can be demonstrated from the workspace: UI rendering or interactions, CLI commands and output, API/library behavior with runnable example code, generated artifacts, or workflow behavior with visible output.
**Evidence is not possible** for docs-only, markdown-only, changelog-only, release metadata, CI/config-only, test-only, or pure internal refactors with no observable output change. It is also not possible when the behavior requires unavailable credentials, paid/cloud services, bot tokens, deploy-only infrastructure, or hardware the user has not provided. In those cases, do not ask about capturing evidence; omit the evidence section and, if relevant, mention the skip reason in the final user report.
When evidence is possible, ask whether to include it in the PR description. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the options and wait for the user's reply.
**Question:** "This PR has observable behavior. Capture evidence for the PR description?"
**Options:**
1. **Capture evidence now**: load the `ce-demo-reel` skill with a target description as the argument (e.g., "the new settings page" or "CLI output of the migrate command"). Infer the target from the branch diff. ce-demo-reel returns `Tier`, `Description`, and `URL`. Use the URL and description to build a `## Demo` or `## Screenshots` section (browser-reel/terminal-recording/screenshot-reel use "Demo", static uses "Screenshots").
2. **Use existing evidence**: ask the user for the URL or markdown embed, then include it in the PR body.
3. **Skip evidence**: write the PR description without an evidence section.
If the user chooses capture, check ce-demo-reel's output for failure: `Tier: skipped` or `URL: "none"` means no evidence was captured. Do not add a placeholder section. Summarize the skip reason in the final user report.
Place the evidence embed before the Compound Engineering badge, typically after the summary or within the changes section. Do not label test output as "Demo" or "Screenshots".
#### Classify commits before writing
Before writing the description, scan the commit list and classify each commit:
@@ -213,10 +249,25 @@ Before writing the description, scan the commit list and classify each commit:
Only feature commits inform the description. Fix-up commits are noise -- they describe the iteration process, not the end result. The full diff already includes whatever the fix-up commits changed, so their intent is captured without narrating them. When sizing and writing the description, mentally subtract fix-up commits: a branch with 12 commits but 9 fix-ups is a 3-commit PR in terms of description weight.
This is the most important step. The description must be **adaptive** -- its depth should match the complexity of the change. A one-line bugfix does not need a table of performance results. A large architectural change should not be a bullet list.
#### Frame the narrative before sizing
After classifying commits, articulate the PR's narrative frame:
1. **Before**: What was broken, limited, or impossible? (One sentence.)
2. **After**: What's now possible or improved? (One sentence.)
3. **Scope rationale** (only if the PR touches 2+ separable-looking concerns): Why do these ship together? (One sentence.)
This frame becomes the opening of the description. For small+simple PRs (the sizing table routes to 1-2 sentences), the "after" sentence alone may be the entire description -- that's fine.
Example:
- Before: "CLI and library PRs got no visual evidence because the capture flow assumed a web app with a dev server."
- After: "Evidence capture now works for any project type -- CLI tools, libraries, desktop apps."
- Scope: "Shipped with git-commit-push-pr restructuring because ce-demo-reel integrates into the PR description flow."
#### Sizing the change
The description must be **adaptive** -- its depth should match the complexity of the change. A one-line bugfix does not need a table of performance results. A large architectural change should not be a bullet list.
Assess the PR along two axes before writing, based on the full branch diff:
- **Size**: How many files changed? How large is the diff?
@@ -228,15 +279,26 @@ Use this to select the right description depth:
|---|---|
| Small + simple (typo, config, dep bump) | 1-2 sentences, no headers. Total body under ~300 characters. |
| Small + non-trivial (targeted bugfix, behavioral change) | Short "Problem / Fix" narrative, ~3-5 sentences. Enough for a reviewer to understand *why* without reading the diff. No headers needed unless there are two distinct concerns. |
| Medium feature or refactor | Summary paragraph, then a section explaining what changed and why. Call out design decisions. |
| Medium feature or refactor | Open with the narrative frame (before/after/scope), then a section explaining what changed and why. Call out design decisions. |
| Large or architecturally significant | Full narrative: problem context, approach chosen (and why), key decisions, migration notes or rollback considerations if relevant. |
| Performance improvement | Include before/after measurements if available. A markdown table is effective here. |
**Brevity matters for small changes.** A 3-line bugfix with a 20-line PR description signals the author didn't calibrate. Match the weight of the description to the weight of the change. When in doubt, shorter is better -- reviewers can read the diff.
#### Writing voice
If the user has documented writing style preferences (in CLAUDE.md, project instructions, or prior feedback), follow those. Otherwise, apply these defaults:
- Use active voice throughout. No em dashes or double-hyphen (`--`) substitutes. Use periods, commas, colons, or parentheses instead.
- Vary sentence length deliberately -- mix short punchy sentences with longer ones. Never write three sentences of similar length in a row.
- Do not make a claim and then immediately explain it in the next sentence. Trust the reader.
- Write in plain English. If there's a simpler word, that's preferable. Never use business jargon when a common word will do. Technical jargon is fine when it's the clearest term for a developer audience.
- No filler phrases: "it's worth noting", "importantly", "essentially", "in order to", "leverage", "utilize."
- Use digits for numbers ("3 files", "7 subcommands"), not words ("three files", "seven subcommands").
#### Writing principles
- **Lead with value**: The first sentence should tell the reviewer *what's now possible or fixed*, not what was moved around. "Fixes timeout errors during batch exports" beats "Updated export_handler.py and config.yaml." The subtler failure is leading with the mechanism: "Replace the hardcoded capture block with a tiered skill" is technically purposeful but still doesn't tell the reviewer what changed for users. "Evidence capture now works for CLI tools and libraries, not just web apps" does.
- **No orphaned opening paragraphs**: If the description uses `##` section headings anywhere, the opening summary must also be under a heading (e.g., `## Summary`). An untitled paragraph followed by titled sections looks like a missing heading. For short descriptions with no sections, a bare paragraph is fine.
- **Describe the net result, not the journey**: The PR description is about the end state -- what changed and why. Do not include work-product details like bugs found and fixed during development, intermediate failures, debugging steps, iteration history, or refactoring done along the way. Those are part of getting the work done, not part of the result. If a bug fix happened during development, the fix is already in the diff -- mentioning it in the description implies it's a separate concern the reviewer should evaluate, when really it's just part of the final implementation. Exception: include process details only when they are critical for a reviewer to understand a design choice (e.g., "tried approach X first but it caused Y, so went with Z instead").
- **When commits conflict, trust the final diff**: The commit list is supporting context, not the source of truth for the final PR description. If commit messages describe intermediate steps that were later revised or reverted (for example, "switch to gh pr list" followed by a later change back to `gh pr view`), describe the end state shown by the full branch diff. Do not narrate contradictory commit history as if all of it shipped.
- **No empty sections**: If a section (like "Breaking Changes" or "Migration Guide") doesn't apply, omit it entirely. Do not include it with "N/A" or "None".
- **Test plan -- only when it adds value**: Include a test plan section when the testing approach is non-obvious: edge cases the reviewer might not think of, verification steps for behavior that's hard to see in the diff, or scenarios that require specific setup. Omit it for straightforward changes where the tests are self-explanatory or where "run the tests" is the only useful guidance. A test plan for "verify the typo is fixed" is noise. When the branch adds new test files, name them with what they cover -- "`tests/capture-evidence.test.ts` -- 8 tests covering arg validation and ffmpeg stitch integration" is more useful than "bun test passes."
#### Visual communication
The new commits are already on the PR from the push in Step 5. Report the PR URL, then ask the user whether they want the PR description updated to reflect the new changes. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, present the option and wait for the user's reply before proceeding.
- If **yes**:
1. Classify commits per "Classify commits before writing" -- the new commits since the last push are often fix-up work (code review fixes, lint fixes) and should not appear as distinct items
2. Size the full PR (not just the new commits) using the sizing table in Step 6
3. Write the description as if fresh, following Step 6's writing principles -- describe the PR's net result
4. Include the Compound Engineering badge unless one is already present
5. Apply:
```bash
gh pr edit --body "$(cat <<'EOF'
<updated PR description>
EOF
)"
```
CRITICAL: You MUST execute every step below IN ORDER. Do NOT skip any required steps.
6. `/compound-engineering:test-browser`
7. Output `<promise>DONE</promise>` when complete
Start with step 2 now (or step 1 if ralph-loop is available). Remember: plan FIRST, then work. Never skip the plan.
tests/ce-demo-reel.test.ts (new file, 402 lines)
import { describe, expect, test, beforeAll, afterAll } from "bun:test"
import { promises as fs } from "fs"
import path from "path"
import os from "os"
const SCRIPT = path.join(
process.cwd(),
"plugins",
"compound-engineering",
"skills",
"ce-demo-reel",
"scripts",
"capture-demo.py",
)
async function run(
...args: string[]
): Promise<{ exitCode: number; stdout: string; stderr: string }> {
const proc = Bun.spawn(["python3", SCRIPT, ...args], {
stdout: "pipe",
stderr: "pipe",
})
const exitCode = await proc.exited
const stdout = await new Response(proc.stdout).text()
const stderr = await new Response(proc.stderr).text()
return { exitCode, stdout, stderr }
}
/** Create a minimal valid PNG (1x1 pixel, solid color). */
function createTestPng(color: [number, number, number]): Buffer {
const [r, g, b] = color
// Raw RGB pixel data: 1 row, filter byte 0, then RGB
const rawData = Buffer.from([0, r, g, b])
// Raw DEFLATE stream (Bun.deflateSync emits no zlib wrapper); the zlib header and Adler-32 are added manually below
const compressed = Bun.deflateSync(rawData, { level: 0 })
const cmf = 0x78
const flg = 0x01
let s1 = 1
let s2 = 0
for (const byte of rawData) {
s1 = (s1 + byte) % 65521
s2 = (s2 + s1) % 65521
}
const adler32 = Buffer.alloc(4)
adler32.writeUInt32BE((s2 << 16) | s1)
const zlibData = Buffer.concat([Buffer.from([cmf, flg]), compressed, adler32])
const signature = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10])
function chunk(type: string, data: Buffer): Buffer {
const len = Buffer.alloc(4)
len.writeUInt32BE(data.length)
const typeB = Buffer.from(type, "ascii")
const body = Buffer.concat([typeB, data])
const crc = crc32(body)
const crcB = Buffer.alloc(4)
crcB.writeUInt32BE(crc >>> 0)
return Buffer.concat([len, body, crcB])
}
// IHDR: 1x1, 8-bit RGB (color type 2)
const ihdr = Buffer.alloc(13)
ihdr.writeUInt32BE(1, 0)
ihdr.writeUInt32BE(1, 4)
ihdr[8] = 8 // bit depth
ihdr[9] = 2 // color type: RGB
ihdr[10] = 0
ihdr[11] = 0
ihdr[12] = 0
return Buffer.concat([
signature,
chunk("IHDR", ihdr),
chunk("IDAT", zlibData),
chunk("IEND", Buffer.alloc(0)),
])
}
function crc32(data: Buffer): number {
let crc = 0xffffffff
for (const byte of data) {
crc ^= byte
for (let j = 0; j < 8; j++) {
crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1
}
}
return (crc ^ 0xffffffff) >>> 0
}
// --- Preflight ---
describe("capture-demo.py", () => {
describe("preflight", () => {
test("returns JSON with tool availability", async () => {
const { exitCode, stdout } = await run("preflight")
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result).toHaveProperty("agent_browser")
expect(result).toHaveProperty("vhs")
expect(result).toHaveProperty("silicon")
expect(result).toHaveProperty("ffmpeg")
expect(result).toHaveProperty("ffprobe")
expect(typeof result.ffmpeg).toBe("boolean")
})
})
// --- Detect ---
describe("detect", () => {
let tmpDir: string
beforeAll(async () => {
tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "evidence-detect-"))
})
afterAll(async () => {
if (tmpDir) await fs.rm(tmpDir, { recursive: true, force: true })
})
test("detects web-app from package.json with react", async () => {
const dir = path.join(tmpDir, "webapp")
await fs.mkdir(dir)
await fs.writeFile(
path.join(dir, "package.json"),
JSON.stringify({ dependencies: { react: "^18.0.0" } }),
)
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("web-app")
})
test("detects cli-tool from package.json with bin field", async () => {
const dir = path.join(tmpDir, "clitool")
await fs.mkdir(dir)
await fs.writeFile(
path.join(dir, "package.json"),
JSON.stringify({ bin: { mycli: "./cli.js" } }),
)
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("cli-tool")
})
test("detects desktop-app from electron dependency", async () => {
const dir = path.join(tmpDir, "electron")
await fs.mkdir(dir)
await fs.writeFile(
path.join(dir, "package.json"),
JSON.stringify({ devDependencies: { electron: "^28.0.0", react: "^18.0.0" } }),
)
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("desktop-app")
})
test("detects library when manifest exists but no web/CLI signals", async () => {
const dir = path.join(tmpDir, "lib")
await fs.mkdir(dir)
await fs.writeFile(
path.join(dir, "package.json"),
JSON.stringify({ name: "my-utils", version: "1.0.0" }),
)
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("library")
})
test("detects text-only when no manifest exists", async () => {
const dir = path.join(tmpDir, "textonly")
await fs.mkdir(dir)
await fs.writeFile(path.join(dir, "README.md"), "# Hello")
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("text-only")
})
test("electron takes priority over web-app", async () => {
const dir = path.join(tmpDir, "electron-react")
await fs.mkdir(dir)
await fs.writeFile(
path.join(dir, "package.json"),
JSON.stringify({ dependencies: { react: "^18.0.0" }, devDependencies: { electron: "^28.0.0" } }),
)
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("desktop-app")
})
test("detects web-app from Gemfile with rails", async () => {
const dir = path.join(tmpDir, "rails")
await fs.mkdir(dir)
await fs.writeFile(path.join(dir, "Gemfile"), 'gem "rails", "~> 7.0"')
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("web-app")
})
test("detects cli-tool from go.mod with cmd/ directory", async () => {
const dir = path.join(tmpDir, "gocli")
await fs.mkdir(dir)
await fs.writeFile(path.join(dir, "go.mod"), "module example.com/mycli\n\ngo 1.21")
await fs.mkdir(path.join(dir, "cmd"))
const { exitCode, stdout } = await run("detect", "--repo-root", dir)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.type).toBe("cli-tool")
})
})
// --- Recommend ---
describe("recommend", () => {
const allTools = '{"agent_browser":true,"vhs":true,"silicon":true,"ffmpeg":true,"ffprobe":true}'
const noTools = '{"agent_browser":false,"vhs":false,"silicon":false,"ffmpeg":false,"ffprobe":false}'
test("web-app with browser + ffmpeg recommends browser-reel", async () => {
const { exitCode, stdout } = await run(
"recommend", "--project-type", "web-app", "--change-type", "states", "--tools", allTools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.recommended).toBe("browser-reel")
})
test("cli-tool with motion + vhs recommends terminal-recording", async () => {
const { exitCode, stdout } = await run(
"recommend", "--project-type", "cli-tool", "--change-type", "motion", "--tools", allTools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.recommended).toBe("terminal-recording")
})
test("cli-tool with states + silicon recommends screenshot-reel", async () => {
const tools = '{"agent_browser":false,"vhs":false,"silicon":true,"ffmpeg":true,"ffprobe":true}'
const { exitCode, stdout } = await run(
"recommend", "--project-type", "cli-tool", "--change-type", "states", "--tools", tools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.recommended).toBe("screenshot-reel")
})
test("library always recommends static-screenshots", async () => {
const { exitCode, stdout } = await run(
"recommend", "--project-type", "library", "--change-type", "states", "--tools", allTools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.recommended).toBe("static-screenshots")
})
test("no tools always falls back to static-screenshots", async () => {
const { exitCode, stdout } = await run(
"recommend", "--project-type", "cli-tool", "--change-type", "motion", "--tools", noTools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.recommended).toBe("static-screenshots")
})
test("available list includes only tiers with tools present", async () => {
const tools = '{"agent_browser":false,"vhs":true,"silicon":false,"ffmpeg":true,"ffprobe":true}'
const { exitCode, stdout } = await run(
"recommend", "--project-type", "cli-tool", "--change-type", "motion", "--tools", tools,
)
expect(exitCode).toBe(0)
const result = JSON.parse(stdout.trim())
expect(result.available).toContain("terminal-recording")
expect(result.available).toContain("static-screenshots")
expect(result.available).not.toContain("browser-reel")
expect(result.available).not.toContain("screenshot-reel")
})
})
// --- Stitch arg validation ---
describe("stitch arg validation", () => {
test("stitch with no args fails", async () => {
const { exitCode } = await run("stitch")
expect(exitCode).not.toBe(0)
})
test("stitch fails on missing frame file", async () => {
const { exitCode, stderr } = await run(
"stitch", "out.gif", "/tmp/nonexistent-frame-abc123.png",
)
expect(exitCode).toBe(1)
expect(stderr).toContain("Frame not found")
})
test("upload fails on missing file", async () => {
const { exitCode, stderr } = await run(
"upload", "/tmp/nonexistent-file-abc123.gif",
)
expect(exitCode).toBe(1)
expect(stderr).toContain("File not found")
})
})
// --- Stitch integration (requires ffmpeg) ---
describe("stitch integration", () => {
let tmpDir: string
let hasFFmpeg: boolean
beforeAll(async () => {
const proc = Bun.spawn(["which", "ffmpeg"], {
stdout: "pipe",
stderr: "pipe",
})
hasFFmpeg = (await proc.exited) === 0
if (!hasFFmpeg) return
tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "evidence-test-"))
const red = createTestPng([255, 0, 0])
const green = createTestPng([0, 255, 0])
const blue = createTestPng([0, 0, 255])
await fs.writeFile(path.join(tmpDir, "frame1.png"), red)
await fs.writeFile(path.join(tmpDir, "frame2.png"), green)
await fs.writeFile(path.join(tmpDir, "frame3.png"), blue)
})
afterAll(async () => {
if (tmpDir) await fs.rm(tmpDir, { recursive: true, force: true })
})
test("stitches frames into a GIF", async () => {
if (!hasFFmpeg) {
console.log("Skipping: ffmpeg not available")
return
}
const output = path.join(tmpDir, "output.gif")
const { exitCode, stdout } = await run(
"stitch", "--duration", "0.5", output,
path.join(tmpDir, "frame1.png"),
path.join(tmpDir, "frame2.png"),
)
expect(exitCode).toBe(0)
expect(stdout).toContain("Stitching 2 frames")
expect(stdout).toContain("Created:")
const stat = await fs.stat(output)
expect(stat.size).toBeGreaterThan(0)
const header = Buffer.alloc(6)
const fh = await fs.open(output, "r")
await fh.read(header, 0, 6)
await fh.close()
expect(header.toString("ascii").startsWith("GIF")).toBe(true)
})
test("stitches 3 frames into a GIF", async () => {
if (!hasFFmpeg) {
console.log("Skipping: ffmpeg not available")
return
}
const output = path.join(tmpDir, "output3.gif")
const { exitCode, stdout } = await run(
"stitch", "--duration", "0.5", output,
path.join(tmpDir, "frame1.png"),
path.join(tmpDir, "frame2.png"),
path.join(tmpDir, "frame3.png"),
)
expect(exitCode).toBe(0)
expect(stdout).toContain("Stitching 3 frames")
})
test("default duration is used when --duration not specified", async () => {
if (!hasFFmpeg) {
console.log("Skipping: ffmpeg not available")
return
}
const output = path.join(tmpDir, "output-default-dur.gif")
const { exitCode, stdout } = await run(
"stitch", output,
path.join(tmpDir, "frame1.png"),
path.join(tmpDir, "frame2.png"),
)
expect(exitCode).toBe(0)
expect(stdout).toContain("Created:")
})
})
})