feat: improve feature-video skill with GitHub native video upload (#344)

This commit is contained in:
Trevin Chow
2026-03-22 21:27:59 -07:00
committed by GitHub
parent 423e692726
commit 4aa50e1bad
3 changed files with 520 additions and 201 deletions

View File

@@ -0,0 +1,147 @@
---
title: "Persistent GitHub authentication for agent-browser using named sessions"
category: integrations
date: 2026-03-22
tags:
- agent-browser
- github
- authentication
- chrome
- session-persistence
- lightpanda
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
- plugins/compound-engineering/skills/agent-browser/references/session-management.md
---
# agent-browser Chrome Authentication for GitHub
## Problem
agent-browser needs authenticated access to GitHub for workflows like the native video
upload in the feature-video skill. Multiple authentication approaches were evaluated
before finding one that works reliably with 2FA, SSO, and OAuth.
## Investigation
| Approach | Result |
|---|---|
| `--profile` flag | Lightpanda (default engine on some installs) throws "Profiles are not supported with Lightpanda". Must use `--engine chrome`. |
| Fresh Chrome profile | No GitHub cookies. Shows "Sign up for free" instead of comment form. |
| `--auto-connect` | Requires Chrome pre-launched with `--remote-debugging-port`. Error: "No running Chrome instance found" in normal use. Impractical. |
| Auth vault (`auth save`/`auth login`) | Cannot handle 2FA, SSO, or OAuth redirects. Only works for simple username/password forms. |
| `--session-name` with Chrome engine | Cookies auto-save/restore. One-time headed login handles any auth method. **This works.** |
## Working Solution
### One-time setup (headed, user logs in manually)
```bash
# Close any running daemon (ignores engine/option changes when reused)
agent-browser close
# Open GitHub login in headed Chrome with a named session
agent-browser --engine chrome --headed --session-name github open https://github.com/login
# User logs in manually -- handles 2FA, SSO, OAuth, any method
# Verify auth
agent-browser open https://github.com/settings/profile
# If profile page loads, auth is confirmed
```
### Session validity check (before each workflow)
```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
# Title contains username or "Profile" -> session valid, proceed
# Title contains "Sign in" or URL is github.com/login -> session expired, re-auth
```
### All subsequent runs (headless, cookies persist)
```bash
agent-browser --engine chrome --session-name github open https://github.com/...
```
## Key Findings
### Engine requirement
MUST use `--engine chrome`. Lightpanda does not support profiles, session persistence,
or state files. Any workflow that uses `--session-name`, `--profile`, `--state`, or
`state save/load` requires the Chrome engine.
Include `--engine chrome` explicitly in every command that uses an authenticated session.
Do not rely on environment defaults -- `AGENT_BROWSER_ENGINE` may be set to `lightpanda`
in some environments.
### Daemon restart
Must run `agent-browser close` before switching engine or session options. A running
daemon ignores new flags like `--engine`, `--headed`, or `--session-name`.
### Session lifetime
Cookies expire when GitHub invalidates them (typically weeks). Periodic re-authentication
is required. The feature-video skill handles this by checking session validity before
the upload step and prompting for re-auth only when needed.
### Auth vault limitations
The auth vault (`agent-browser auth save`/`auth login`) can only handle login forms with
visible username and password fields. It cannot handle:
- 2FA (TOTP, SMS, push notification)
- SSO with identity provider redirect
- OAuth consent flows
- CAPTCHA
- Device verification prompts
For GitHub and most modern services, use the one-time headed login approach instead.
### `--auto-connect` viability
Impractical for automated workflows. Requires Chrome to be pre-launched with
`--remote-debugging-port=9222`, which is not how users normally run Chrome.
## Prevention
### Skills requiring auth must declare engine
State the engine requirement in the Prerequisites section of any skill that needs
browser auth. Include `--engine chrome` in every `agent-browser` command that touches
an authenticated session.
### Session check timing
Perform the session check immediately before the step that needs auth, not at skill
start. A session valid at start may expire during a long workflow (video encoding can
take minutes).
### Recovery without restart
When expiry is detected at upload time, the video file is already encoded. Recovery:
re-authenticate, then retry only the upload step. Do not restart from the beginning.
### Concurrent sessions
Use `--session-name` with a semantically descriptive name (e.g., `github`) when multiple
skills or agents may run concurrently. Two concurrent runs sharing the default session
will interfere with each other.
### State file security
Session state files in `~/.agent-browser/sessions/` contain cookies in plaintext.
Do not commit to repositories. Add to `.gitignore` if the session directory is inside
a repo tree.
## Integration Points
This pattern is used by:
- `feature-video` skill (GitHub native video upload)
- Any future skill requiring authenticated GitHub browser access
- Potential use for other OAuth-protected services (same pattern, different session name)

View File

@@ -0,0 +1,141 @@
---
title: "GitHub inline video embedding via programmatic browser upload"
category: integrations
date: 2026-03-22
tags:
- github
- video-embedding
- agent-browser
- playwright
- feature-video
- pr-description
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
---
# GitHub Native Video Upload for PRs
## Problem
Embedding video demos in GitHub PR descriptions required external storage (R2/rclone)
or GitHub Release assets. Release asset URLs render as plain download links, not inline
video players. Only `user-attachments/assets/` URLs render with GitHub's native inline
video player -- the same result as pasting a video into the PR editor manually.
The distinction is absolute:
| URL namespace | Rendering |
|---|---|
| `github.com/releases/download/...` | Plain download link (bad UX, triggers download on mobile) |
| `github.com/user-attachments/assets/...` | Native inline `<video>` player with controls |
## Investigation
1. **Public upload API** -- No public API exists. The `/upload/policies/assets` endpoint
requires browser session cookies and is not exposed via REST or GraphQL. GitHub CLI
(`gh`) has no support; issues cli/cli#1895, #4228, and #4465 are all closed as
"not planned". GitHub keeps this private to limit abuse surface (malware hosting,
spam CDN, DMCA liability).
2. **Release asset approach (Strategy B)** -- URLs render as download links, not video
players. Clickable GIF previews trigger downloads on mobile. Unacceptable UX.
3. **Claude-in-Chrome JavaScript injection with base64** -- Blocked by CSP/mixed-content
policy. HTTPS github.com cannot fetch from HTTP localhost. Base64 chunking is possible
but does not scale for larger videos.
4. **`tonkotsuboy/github-upload-image-to-pr`** -- Open-source reference confirming
browser automation is the only working approach for producing native URLs.
5. **agent-browser `upload` command** -- Works. Playwright sets files directly on hidden
file inputs without base64 encoding or fetch requests. CSP is not a factor because
Playwright's `setInputFiles` operates at the browser engine level, not via JavaScript.
## Working Solution
### Upload flow
```bash
# Navigate to PR page (authenticated Chrome session)
agent-browser --engine chrome --session-name github \
open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000
# Upload video via the hidden file input
agent-browser upload '#fc-new_comment_field' tmp/videos/feature-demo.mp4
# Wait for GitHub to process the upload (typically 3-5 seconds)
agent-browser wait 5000
# Extract the URL GitHub injected into the textarea
agent-browser eval "document.getElementById('new_comment_field').value"
# Returns: https://github.com/user-attachments/assets/[uuid]
# Clear the textarea without submitting (upload already persisted server-side)
agent-browser eval "const ta = document.getElementById('new_comment_field'); \
ta.value = ''; ta.dispatchEvent(new Event('input', { bubbles: true }))"
# Embed in PR description (URL on its own line renders as inline video player)
gh pr edit [number] --body "[body with video URL on its own line]"
```
### Key selectors (validated March 2026)
| Selector | Element | Purpose |
|---|---|---|
| `#fc-new_comment_field` | Hidden `<input type="file">` | Target for `agent-browser upload`. Accepts `.mp4`, `.mov`, `.webm` and many other types. |
| `#new_comment_field` | `<textarea>` | GitHub injects the `user-attachments/assets/` URL here after processing the upload. |
GitHub's comment form contains the hidden file input. After Playwright sets the file,
GitHub uploads it server-side and injects a markdown URL into the textarea. The upload
is persisted even if the form is never submitted.
## What Was Removed
The following approaches were removed from the feature-video skill:
- R2/rclone setup and configuration
- Release asset upload flow (`gh release upload`)
- GIF preview generation (unnecessary with native inline video player)
- Strategy B fallback logic
Total: approximately 100 lines of SKILL.md content removed. The skill is now simpler
and has zero external storage dependencies.
## Prevention
### URL validation
After any upload step, confirm the extracted URL contains `user-attachments/assets/`
before writing it into the PR description. If the URL does not match, the upload failed
or used the wrong method.
### Upload failure handling
If the textarea is empty after the wait, check:
1. Session validity (did GitHub redirect to login?)
2. Wait time (processing can be slow under load -- retry after 3-5 more seconds)
3. File size (10MB free, 100MB paid accounts)
Do not silently substitute a release asset URL. Report the failure and offer to retry.
### DOM selector fragility
`#fc-new_comment_field` and `#new_comment_field` are GitHub's internal element IDs and
may change in future UI updates. If the upload stops working, snapshot the PR page and
inspect the current comment form structure for updated selectors.
### Size limits
- Free accounts: 10MB per file
- Paid (Pro, Team, Enterprise): 100MB per file
Check file size before attempting upload. Re-encode at lower quality if needed.
## References
- GitHub CLI issues: cli/cli#1895, #4228, #4465 (all closed "not planned")
- `tonkotsuboy/github-upload-image-to-pr` -- reference implementation
- GitHub Community Discussions: #29993, #46951, #28219