feat(ce-demo-reel): add demo reel skill with Python capture pipeline (#541)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Trevin Chow
2026-04-09 21:29:51 -07:00
committed by GitHub
parent f3cc7545e5
commit b979143ad0
18 changed files with 1796 additions and 751 deletions


@@ -0,0 +1,123 @@
---
title: "Prefer Python over bash for multi-step pipeline scripts"
date: 2026-04-09
category: best-practices
module: "skill scripting / ce-demo-reel"
problem_type: best_practice
component: tooling
severity: medium
applies_when:
- Script orchestrates 2+ external CLI tools (ffmpeg, curl, silicon, vhs)
- Script needs retry logic or graceful degradation on tool failure
- Script will run on macOS where bash 3.2 is the default
- Script needs to be tested from a non-shell test runner (Bun, Jest, pytest)
- Script has conditional failure paths where some errors should be caught and others should abort
tags:
- bash-vs-python
- pipeline-scripts
- skill-scripting
- set-e-footguns
- error-handling
- ce-demo-reel
---
# Prefer Python over bash for multi-step pipeline scripts
## Context
When building the `ce-demo-reel` skill, the initial implementation used a bash script (`capture-evidence.sh`) to orchestrate ffmpeg stitching, frame normalization, and catbox.moe upload. Over 4 review rounds, the script hit 4 distinct bug classes that are inherent to bash's execution model rather than simple coding mistakes.
## Guidance
Use Python for agent pipeline scripts that chain multiple CLI tools with error handling. Bash `set -euo pipefail` works for simple sequential scripts but becomes a footgun when you need controlled failure paths.
**Python subprocess model (explicit error handling):**
```python
import subprocess

attempts, max_attempts = 0, 3
while attempts < max_attempts:
    result = subprocess.run(
        ["curl", "-s", "-F", f"fileToUpload=@{file_path}", url],
        capture_output=True, text=True, timeout=30, check=False
    )
    if result.returncode != 0:
        # Failure is a return value, not an exit -- retry logic runs normally
        attempts += 1
        continue
    break
```
**Python timeout handling (explicit catch):**
```python
try:
result = subprocess.run(cmd, timeout=60)
except subprocess.TimeoutExpired:
# Controlled failure, not a crash
return subprocess.CompletedProcess(cmd, returncode=1, stdout="", stderr="Timed out")
```
**Bash equivalent (the footgun):**
```bash
set -euo pipefail
# Exits the entire script before retry logic runs
url=$(curl -s -F "fileToUpload=@${file}" "$endpoint")
# Never reaches here on curl failure
# Workaround: || true on every line that might fail
url=$(curl -s -F "fileToUpload=@${file}" "$endpoint") || true
# Works but fragile and easy to forget
```
## Why This Matters
Agent pipeline scripts run in environments the skill author does not control: different macOS versions (bash 3.2 vs 5.x), CI containers, worktrees. Each bash portability issue requires a non-obvious workaround that reviewers must catch. Python's subprocess model makes error handling explicit and testable rather than implicit and version-dependent.
The 4 bugs found were not unusual. They are the predictable consequence of using bash for scripts that exceed its sweet spot.
## When to Apply
Use Python when:
- The script orchestrates 2+ external CLI tools
- The script needs retry logic or graceful degradation on tool failure
- The script will run on macOS where bash 3.2 is the default
- The script needs to be tested from a non-shell test runner
- The script has more than ~3 subcommands
Bash is still the right choice for:
- Simple sequential scripts with no error recovery (`set -e` is fine)
- One-liner wrappers around a single tool
- Scripts using only POSIX features with no array manipulation
- Git hooks and CI steps where the only failure mode is "abort the pipeline"
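The recommended shape can be sketched in a few lines. This is a minimal illustration, not the skill's actual pipeline: the frame pattern, output name, and upload endpoint are placeholders, and only the two-tier failure handling (hard abort vs. graceful degradation) is the point.

```python
import shutil
import subprocess
import sys

UPLOAD_URL = "https://example.invalid/upload"  # placeholder endpoint


def run_step(cmd, timeout=60):
    """Run one pipeline step; failure is a return value, not a crash."""
    try:
        return subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=False
        )
    except subprocess.TimeoutExpired:
        return subprocess.CompletedProcess(
            cmd, returncode=1, stdout="", stderr="Timed out"
        )


def pipeline():
    # Tool detection is a regular function call
    if not shutil.which("ffmpeg"):
        sys.exit("ffmpeg not found")
    # Hard failure: abort with a useful message
    stitch = run_step(["ffmpeg", "-y", "-i", "frames/%03d.png", "out.mp4"])
    if stitch.returncode != 0:
        sys.exit(f"stitch failed: {stitch.stderr.strip()}")
    # Soft failure: degrade gracefully instead of aborting the whole run
    upload = run_step(["curl", "-s", "-F", "fileToUpload=@out.mp4", UPLOAD_URL])
    if upload.returncode != 0:
        print("upload failed; keeping local file", file=sys.stderr)
```

Every step gets to choose its own failure policy at the call site, which is exactly what `set -e` takes away.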
## Examples
**Before (bash, 4 bugs across 4 review rounds):**
| Bug | Cause | Workaround needed |
|---|---|---|
| `url=$(curl ...)` exits on network failure | `set -e` + command substitution | `\|\| true` on every line |
| `${array[-1]}` fails | Bash 3.2 lacks negative indexing | `${array[${#array[@]}-1]}` |
| Frame reduction keeps all frames for n=3,4 | Integer math: `step=(n-1)/2` with min 1 | Minimum step of 2 |
| `command -v ffmpeg` in Bun tests | `command` is a shell builtin, not spawnable | Use `which` instead |
**After (Python, all 4 bug classes eliminated):**
```python
# Negative indexing just works
last = frames[-1]
# Timeout handling is explicit
try:
result = subprocess.run(cmd, timeout=30)
except subprocess.TimeoutExpired:
return None
# Tool detection is a regular function
if not shutil.which("ffmpeg"):
sys.exit("ffmpeg not found")
# Math is straightforward
step = max(2, (len(frames) - 1) // 2)
```
## Related
- `docs/solutions/skill-design/script-first-skill-architecture.md`: covers when to use scripts vs agent logic (complementary: that doc answers "should a script do this?", this doc answers "which language?")
- `docs/solutions/agent-friendly-cli-principles.md`: CLI design from the consumer side (overlaps on exit code and stderr patterns)


@@ -1,147 +0,0 @@
---
title: "Persistent GitHub authentication for agent-browser using named sessions"
category: integrations
date: 2026-03-22
tags:
- agent-browser
- github
- authentication
- chrome
- session-persistence
- lightpanda
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
- plugins/compound-engineering/skills/agent-browser/references/session-management.md
---
# agent-browser Chrome Authentication for GitHub
## Problem
agent-browser needs authenticated access to GitHub for workflows like the native video
upload in the feature-video skill. Multiple authentication approaches were evaluated
before finding one that works reliably with 2FA, SSO, and OAuth.
## Investigation
| Approach | Result |
|---|---|
| `--profile` flag | Lightpanda (default engine on some installs) throws "Profiles are not supported with Lightpanda". Must use `--engine chrome`. |
| Fresh Chrome profile | No GitHub cookies. Shows "Sign up for free" instead of comment form. |
| `--auto-connect` | Requires Chrome pre-launched with `--remote-debugging-port`. Error: "No running Chrome instance found" in normal use. Impractical. |
| Auth vault (`auth save`/`auth login`) | Cannot handle 2FA, SSO, or OAuth redirects. Only works for simple username/password forms. |
| `--session-name` with Chrome engine | Cookies auto-save/restore. One-time headed login handles any auth method. **This works.** |
## Working Solution
### One-time setup (headed, user logs in manually)
```bash
# Close any running daemon (a reused daemon ignores engine/option changes)
agent-browser close
# Open GitHub login in headed Chrome with a named session
agent-browser --engine chrome --headed --session-name github open https://github.com/login
# User logs in manually -- handles 2FA, SSO, OAuth, any method
# Verify auth
agent-browser open https://github.com/settings/profile
# If profile page loads, auth is confirmed
```
### Session validity check (before each workflow)
```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
# Title contains username or "Profile" -> session valid, proceed
# Title contains "Sign in" or URL is github.com/login -> session expired, re-auth
```
### All subsequent runs (headless, cookies persist)
```bash
agent-browser --engine chrome --session-name github open https://github.com/...
```
## Key Findings
### Engine requirement
MUST use `--engine chrome`. Lightpanda does not support profiles, session persistence,
or state files. Any workflow that uses `--session-name`, `--profile`, `--state`, or
`state save/load` requires the Chrome engine.
Include `--engine chrome` explicitly in every command that uses an authenticated session.
Do not rely on environment defaults -- `AGENT_BROWSER_ENGINE` may be set to `lightpanda`
in some environments.
### Daemon restart
Must run `agent-browser close` before switching engine or session options. A running
daemon ignores new flags like `--engine`, `--headed`, or `--session-name`.
### Session lifetime
Cookies expire when GitHub invalidates them (typically weeks). Periodic re-authentication
is required. The feature-video skill handles this by checking session validity before
the upload step and prompting for re-auth only when needed.
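When a skill drives agent-browser from a Python script, the validity check above can be factored so the title parsing is testable on its own. A sketch, assuming agent-browser is on PATH; the helper names are illustrative, and the command sequence mirrors the one shown earlier:

```python
import subprocess


def session_valid_from_title(title: str) -> bool:
    """Expired sessions land on the login page, whose title contains 'Sign in'."""
    return bool(title.strip()) and "Sign in" not in title


def check_github_session(session="github"):
    """Run the session-validity sequence from this doc and parse the title."""
    # A running daemon ignores new flags, so close it first
    subprocess.run(["agent-browser", "close"], check=False)
    subprocess.run(
        ["agent-browser", "--engine", "chrome", "--session-name", session,
         "open", "https://github.com/settings/profile"],
        check=False,
    )
    title = subprocess.run(
        ["agent-browser", "get", "title"],
        capture_output=True, text=True, check=False,
    ).stdout
    return session_valid_from_title(title)
```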
### Auth vault limitations
The auth vault (`agent-browser auth save`/`auth login`) can only handle login forms with
visible username and password fields. It cannot handle:
- 2FA (TOTP, SMS, push notification)
- SSO with identity provider redirect
- OAuth consent flows
- CAPTCHA
- Device verification prompts
For GitHub and most modern services, use the one-time headed login approach instead.
### `--auto-connect` viability
Impractical for automated workflows. Requires Chrome to be pre-launched with
`--remote-debugging-port=9222`, which is not how users normally run Chrome.
## Prevention
### Skills requiring auth must declare engine
State the engine requirement in the Prerequisites section of any skill that needs
browser auth. Include `--engine chrome` in every `agent-browser` command that touches
an authenticated session.
### Session check timing
Perform the session check immediately before the step that needs auth, not at skill
start. A session valid at start may expire during a long workflow (video encoding can
take minutes).
### Recovery without restart
When expiry is detected at upload time, the video file is already encoded. Recovery:
re-authenticate, then retry only the upload step. Do not restart from the beginning.
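The recovery shape can be sketched as follows. The function names are hypothetical; the point is that re-auth wraps only the upload step, never the encoding:

```python
def upload_with_reauth(upload, reauth, max_attempts=2):
    """Retry only the upload step; the encoded video is reused, not re-rendered."""
    for _ in range(max_attempts):
        if upload():
            return True
        # Expired session is the expected failure mode: re-auth, then retry
        reauth()
    return False
```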
### Concurrent sessions
Use `--session-name` with a semantically descriptive name (e.g., `github`) when multiple
skills or agents may run concurrently. Two concurrent runs sharing the default session
will interfere with each other.
### State file security
Session state files in `~/.agent-browser/sessions/` contain cookies in plaintext.
Do not commit to repositories. Add to `.gitignore` if the session directory is inside
a repo tree.
## Integration Points
This pattern is used by:
- `feature-video` skill (GitHub native video upload)
- Any future skill requiring authenticated GitHub browser access
- Potential use for other OAuth-protected services (same pattern, different session name)


@@ -1,141 +0,0 @@
---
title: "GitHub inline video embedding via programmatic browser upload"
category: integrations
date: 2026-03-22
tags:
- github
- video-embedding
- agent-browser
- playwright
- feature-video
- pr-description
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
---
# GitHub Native Video Upload for PRs
## Problem
Embedding video demos in GitHub PR descriptions required external storage (R2/rclone)
or GitHub Release assets. Release asset URLs render as plain download links, not inline
video players. Only `user-attachments/assets/` URLs render with GitHub's native inline
video player -- the same result as pasting a video into the PR editor manually.
The distinction is absolute:
| URL namespace | Rendering |
|---|---|
| `github.com/releases/download/...` | Plain download link (bad UX, triggers download on mobile) |
| `github.com/user-attachments/assets/...` | Native inline `<video>` player with controls |
## Investigation
1. **Public upload API** -- No public API exists. The `/upload/policies/assets` endpoint
requires browser session cookies and is not exposed via REST or GraphQL. GitHub CLI
(`gh`) has no support; issues cli/cli#1895, #4228, and #4465 are all closed as
"not planned". GitHub keeps this private to limit abuse surface (malware hosting,
spam CDN, DMCA liability).
2. **Release asset approach (Strategy B)** -- URLs render as download links, not video
players. Clickable GIF previews trigger downloads on mobile. Unacceptable UX.
3. **Claude-in-Chrome JavaScript injection with base64** -- Blocked by CSP/mixed-content
policy. HTTPS github.com cannot fetch from HTTP localhost. Base64 chunking is possible
but does not scale for larger videos.
4. **`tonkotsuboy/github-upload-image-to-pr`** -- Open-source reference confirming
browser automation is the only working approach for producing native URLs.
5. **agent-browser `upload` command** -- Works. Playwright sets files directly on hidden
file inputs without base64 encoding or fetch requests. CSP is not a factor because
Playwright's `setInputFiles` operates at the browser engine level, not via JavaScript.
## Working Solution
### Upload flow
```bash
# Navigate to PR page (authenticated Chrome session)
agent-browser --engine chrome --session-name github \
open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000
# Upload video via the hidden file input
agent-browser upload '#fc-new_comment_field' tmp/videos/feature-demo.mp4
# Wait for GitHub to process the upload (typically 3-5 seconds)
agent-browser wait 5000
# Extract the URL GitHub injected into the textarea
agent-browser eval "document.getElementById('new_comment_field').value"
# Returns: https://github.com/user-attachments/assets/[uuid]
# Clear the textarea without submitting (upload already persisted server-side)
agent-browser eval "const ta = document.getElementById('new_comment_field'); \
ta.value = ''; ta.dispatchEvent(new Event('input', { bubbles: true }))"
# Embed in PR description (URL on its own line renders as inline video player)
gh pr edit [number] --body "[body with video URL on its own line]"
```
### Key selectors (validated March 2026)
| Selector | Element | Purpose |
|---|---|---|
| `#fc-new_comment_field` | Hidden `<input type="file">` | Target for `agent-browser upload`. Accepts `.mp4`, `.mov`, `.webm` and many other types. |
| `#new_comment_field` | `<textarea>` | GitHub injects the `user-attachments/assets/` URL here after processing the upload. |
GitHub's comment form contains the hidden file input. After Playwright sets the file,
GitHub uploads it server-side and injects a markdown URL into the textarea. The upload
is persisted even if the form is never submitted.
## What Was Removed
The following approaches were removed from the feature-video skill:
- R2/rclone setup and configuration
- Release asset upload flow (`gh release upload`)
- GIF preview generation (unnecessary with native inline video player)
- Strategy B fallback logic
Total: approximately 100 lines of SKILL.md content removed. The skill is now simpler
and has zero external storage dependencies.
## Prevention
### URL validation
After any upload step, confirm the extracted URL contains `user-attachments/assets/`
before writing it into the PR description. If the URL does not match, the upload failed
or used the wrong method.
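A sketch of the check, assuming the URL was read back from the textarea as in the flow above:

```python
def is_native_video_url(url: str) -> bool:
    """Only user-attachments/assets/ URLs render as the inline video player."""
    return "github.com/user-attachments/assets/" in url
```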
### Upload failure handling
If the textarea is empty after the wait, check:
1. Session validity (did GitHub redirect to login?)
2. Wait time (processing can be slow under load -- retry after 3-5 more seconds)
3. File size (10MB free, 100MB paid accounts)
Do not silently substitute a release asset URL. Report the failure and offer to retry.
### DOM selector fragility
`#fc-new_comment_field` and `#new_comment_field` are GitHub's internal element IDs and
may change in future UI updates. If the upload stops working, snapshot the PR page and
inspect the current comment form structure for updated selectors.
### Size limits
- Free accounts: 10MB per file
- Paid (Pro, Team, Enterprise): 100MB per file
Check file size before attempting upload. Re-encode at lower quality if needed.
## References
- GitHub CLI issues: cli/cli#1895, #4228, #4465 (all closed "not planned")
- `tonkotsuboy/github-upload-image-to-pr` -- reference implementation
- GitHub Community Discussions: #29993, #46951, #28219