feat: improve feature-video skill with GitHub native video upload (#344)
@@ -0,0 +1,147 @@
---
title: "Persistent GitHub authentication for agent-browser using named sessions"
category: integrations
date: 2026-03-22
tags:
- agent-browser
- github
- authentication
- chrome
- session-persistence
- lightpanda
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
- plugins/compound-engineering/skills/agent-browser/references/session-management.md
---

# agent-browser Chrome Authentication for GitHub

## Problem

agent-browser needs authenticated access to GitHub for workflows like the native video
upload in the feature-video skill. Multiple authentication approaches were evaluated
before finding one that works reliably with 2FA, SSO, and OAuth.

## Investigation

| Approach | Result |
|---|---|
| `--profile` flag | Lightpanda (default engine on some installs) throws "Profiles are not supported with Lightpanda". Must use `--engine chrome`. |
| Fresh Chrome profile | No GitHub cookies. Shows "Sign up for free" instead of comment form. |
| `--auto-connect` | Requires Chrome pre-launched with `--remote-debugging-port`. Error: "No running Chrome instance found" in normal use. Impractical. |
| Auth vault (`auth save`/`auth login`) | Cannot handle 2FA, SSO, or OAuth redirects. Only works for simple username/password forms. |
| `--session-name` with Chrome engine | Cookies auto-save/restore. One-time headed login handles any auth method. **This works.** |

## Working Solution

### One-time setup (headed, user logs in manually)

```bash
# Close any running daemon first (a reused daemon ignores engine/option changes)
agent-browser close

# Open GitHub login in headed Chrome with a named session
agent-browser --engine chrome --headed --session-name github open https://github.com/login
# User logs in manually -- handles 2FA, SSO, OAuth, any method

# Verify auth (reuses the daemon started above, so its engine and session still apply)
agent-browser open https://github.com/settings/profile
# If profile page loads, auth is confirmed
```

### Session validity check (before each workflow)

```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
# Title contains username or "Profile" -> session valid, proceed
# Title contains "Sign in" or URL is github.com/login -> session expired, re-auth
```
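
The decision rule in the comments above can be sketched as a small pure function. This is an illustration only: `classify_session`, its optional username argument, and the `unknown` result are hypothetical names introduced here, not part of agent-browser or the feature-video skill.

```bash
# Classify a GitHub session from a page title and URL.
# Prints "valid", "expired", or "unknown".
#   $1 = page title, $2 = page URL, $3 = GitHub username (optional)
classify_session() {
  title=$1
  url=$2
  user=${3:-}
  # A redirect to the login page means the cookies were rejected.
  case "$url" in
    *github.com/login*) echo expired; return 0 ;;
  esac
  case "$title" in
    *"Sign in"*) echo expired; return 0 ;;
    *Profile*)   echo valid;   return 0 ;;
  esac
  # Only match on the username when one was supplied.
  if [ -n "$user" ]; then
    case "$title" in
      *"$user"*) echo valid; return 0 ;;
    esac
  fi
  echo unknown
}
```

One possible call is `classify_session "$(agent-browser get title)" "$url"`, where `$url` is obtained however the workflow reads the current URL (for example `agent-browser get url`, assuming that subcommand is available in the installed version). Treating `unknown` the same as `expired` is the conservative choice.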

### All subsequent runs (headless, cookies persist)

```bash
agent-browser --engine chrome --session-name github open https://github.com/...
```

## Key Findings

### Engine requirement

MUST use `--engine chrome`. Lightpanda does not support profiles, session persistence,
or state files. Any workflow that uses `--session-name`, `--profile`, `--state`, or
`state save/load` requires the Chrome engine.

Include `--engine chrome` explicitly in every command that uses an authenticated session.
Do not rely on environment defaults -- `AGENT_BROWSER_ENGINE` may be set to `lightpanda`
in some environments.
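
One way to avoid forgetting the flags is a thin wrapper that pins them on every call. The sketch below is an assumption-laden convenience, not something the skill ships: `ab_github`, `AB_BIN`, and `GH_SESSION` are hypothetical names, and `AB_BIN` exists only so the wrapper can be dry-run with `echo`.

```bash
# Hypothetical wrapper: always pass the Chrome engine and the named session,
# regardless of what AGENT_BROWSER_ENGINE is set to.
#   AB_BIN     = binary to run (defaults to agent-browser)
#   GH_SESSION = session name (defaults to github)
ab_github() {
  "${AB_BIN:-agent-browser}" --engine chrome \
    --session-name "${GH_SESSION:-github}" "$@"
}
```

With this in place, `ab_github open https://github.com/settings/profile` expands to the same command used in the session validity check, and `AB_BIN=echo ab_github open ...` prints the arguments without launching a browser.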

### Daemon restart

Run `agent-browser close` before switching engine or session options. A running
daemon ignores new flags like `--engine`, `--headed`, or `--session-name`.

### Session lifetime

Cookies expire when GitHub invalidates them (typically after weeks). Periodic
re-authentication is required. The feature-video skill handles this by checking session
validity before the upload step and prompting for re-auth only when needed.

### Auth vault limitations

The auth vault (`agent-browser auth save`/`auth login`) can only handle login forms with
visible username and password fields. It cannot handle:

- 2FA (TOTP, SMS, push notification)
- SSO with identity provider redirect
- OAuth consent flows
- CAPTCHA
- Device verification prompts

For GitHub and most modern services, use the one-time headed login approach instead.

### `--auto-connect` viability

Impractical for automated workflows. Requires Chrome to be pre-launched with
`--remote-debugging-port=9222`, which is not how users normally run Chrome.

## Prevention

### Skills requiring auth must declare engine

State the engine requirement in the Prerequisites section of any skill that needs
browser auth. Include `--engine chrome` in every `agent-browser` command that touches
an authenticated session.

### Session check timing

Perform the session check immediately before the step that needs auth, not at skill
start. A session valid at start may expire during a long workflow (video encoding can
take minutes).

### Recovery without restart

When expiry is detected at upload time, the video file is already encoded. Recovery:
re-authenticate, then retry only the upload step. Do not restart from the beginning.
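
The check-late, retry-narrowly flow described in the two subsections above might look like the following sketch. The function and its three callback arguments are hypothetical; the caller supplies the actual session check, headed re-auth, and upload commands.

```bash
# Run the session check right before the upload, and on expiry re-authenticate
# and continue with the upload only (the encoded video is reused as-is).
#   $1 = name of a command that exits 0 when the session is valid
#   $2 = name of a command that performs the headed re-authentication
#   $3 = name of a command that performs the upload
upload_with_reauth() {
  check=$1
  reauth=$2
  upload=$3
  if ! "$check"; then
    "$reauth" || return 1
    # Confirm the new login actually produced a valid session.
    "$check" || return 1
  fi
  "$upload"
}
```

Because encoding is not one of the callbacks, nothing upstream of the upload is repeated when the session turns out to be expired.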

### Concurrent sessions

Use `--session-name` with a semantically descriptive name (e.g., `github`) when multiple
skills or agents may run concurrently. Two concurrent runs sharing the default session
will interfere with each other.

### State file security

Session state files in `~/.agent-browser/sessions/` contain cookies in plaintext.
Do not commit them to repositories. Add the session directory to `.gitignore` if it is
inside a repo tree.

## Integration Points

This pattern is used by:
- `feature-video` skill (GitHub native video upload)
- Any future skill requiring authenticated GitHub browser access
- Potential use for other OAuth-protected services (same pattern, different session name)
@@ -0,0 +1,141 @@
---
title: "GitHub inline video embedding via programmatic browser upload"
category: integrations
date: 2026-03-22
tags:
- github
- video-embedding
- agent-browser
- playwright
- feature-video
- pr-description
related_to:
- plugins/compound-engineering/skills/feature-video/SKILL.md
- plugins/compound-engineering/skills/agent-browser/SKILL.md
- plugins/compound-engineering/skills/agent-browser/references/authentication.md
---

# GitHub Native Video Upload for PRs

## Problem

Embedding video demos in GitHub PR descriptions required external storage (R2/rclone)
or GitHub Release assets. Release asset URLs render as plain download links, not inline
video players. Only `user-attachments/assets/` URLs render with GitHub's native inline
video player -- the same result as pasting a video into the PR editor manually.

The distinction is absolute:

| URL namespace | Rendering |
|---|---|
| `github.com/releases/download/...` | Plain download link (bad UX, triggers download on mobile) |
| `github.com/user-attachments/assets/...` | Native inline `<video>` player with controls |

## Investigation

1. **Public upload API** -- No public API exists. The `/upload/policies/assets` endpoint
   requires browser session cookies and is not exposed via REST or GraphQL. GitHub CLI
   (`gh`) has no support; issues cli/cli#1895, #4228, and #4465 are all closed as
   "not planned". GitHub keeps this private to limit abuse surface (malware hosting,
   spam CDN, DMCA liability).

2. **Release asset approach (Strategy B)** -- URLs render as download links, not video
   players. Clickable GIF previews trigger downloads on mobile. Unacceptable UX.

3. **Claude-in-Chrome JavaScript injection with base64** -- Blocked by CSP/mixed-content
   policy. HTTPS github.com cannot fetch from HTTP localhost. Base64 chunking is possible
   but does not scale for larger videos.

4. **`tonkotsuboy/github-upload-image-to-pr`** -- Open-source reference confirming
   browser automation is the only working approach for producing native URLs.

5. **agent-browser `upload` command** -- Works. Playwright sets files directly on hidden
   file inputs without base64 encoding or fetch requests. CSP is not a factor because
   Playwright's `setInputFiles` operates at the browser engine level, not via JavaScript.

## Working Solution

### Upload flow

```bash
# Navigate to PR page (authenticated Chrome session)
agent-browser --engine chrome --session-name github \
  open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000

# Upload video via the hidden file input
agent-browser upload '#fc-new_comment_field' tmp/videos/feature-demo.mp4

# Wait for GitHub to process the upload (typically 3-5 seconds)
agent-browser wait 5000

# Extract the URL GitHub injected into the textarea
agent-browser eval "document.getElementById('new_comment_field').value"
# Returns: https://github.com/user-attachments/assets/[uuid]

# Clear the textarea without submitting (upload already persisted server-side)
agent-browser eval "const ta = document.getElementById('new_comment_field'); \
  ta.value = ''; ta.dispatchEvent(new Event('input', { bubbles: true }))"

# Embed in PR description (URL on its own line renders as inline video player)
gh pr edit [number] --body "[body with video URL on its own line]"
```
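
The last step depends on the URL sitting on a line by itself. A minimal sketch of assembling such a body is shown here; `body_with_video` is a hypothetical helper name, and how the existing body is obtained is left to the caller.

```bash
# Append a video URL to a PR body so the URL ends up on its own line,
# separated from the preceding text by a blank line.
#   $1 = existing body text, $2 = video URL
body_with_video() {
  printf '%s\n\n%s\n' "$1" "$2"
}
```

One possible use, assuming `$PR` holds the PR number and `$url` the extracted asset URL, is `gh pr edit "$PR" --body "$(body_with_video "$(gh pr view "$PR" --json body --jq .body)" "$url")"`.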

### Key selectors (validated March 2026)

| Selector | Element | Purpose |
|---|---|---|
| `#fc-new_comment_field` | Hidden `<input type="file">` | Target for `agent-browser upload`. Accepts `.mp4`, `.mov`, `.webm` and many other types. |
| `#new_comment_field` | `<textarea>` | GitHub injects the `user-attachments/assets/` URL here after processing the upload. |

GitHub's comment form contains the hidden file input. After Playwright sets the file,
GitHub uploads it server-side and injects a markdown URL into the textarea. The upload
is persisted even if the form is never submitted.

## What Was Removed

The following approaches were removed from the feature-video skill:

- R2/rclone setup and configuration
- Release asset upload flow (`gh release upload`)
- GIF preview generation (unnecessary with native inline video player)
- Strategy B fallback logic

Total: approximately 100 lines of SKILL.md content removed. The skill is now simpler
and has zero external storage dependencies.

## Prevention

### URL validation

After any upload step, confirm the extracted URL contains `user-attachments/assets/`
before writing it into the PR description. If the URL does not match, the upload failed
or used the wrong method.
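
A sketch of that check, combined with pulling the URL out of whatever text the textarea returned, is given below. `extract_attachment_url` is a hypothetical helper, and the character class used for the asset ID is an assumption about its format rather than a documented GitHub guarantee.

```bash
# Read text on stdin, print the first user-attachments asset URL found,
# and exit non-zero when there is none (for example a release download URL).
extract_attachment_url() {
  grep -oE 'https://github\.com/user-attachments/assets/[0-9A-Za-z-]+' \
    | head -n 1 \
    | grep .
}
```

Because the match is a substring search, it also works when the textarea value wraps the URL in markdown or when the evaluated value comes back surrounded by quotes. A non-zero exit is the signal to report the failure instead of editing the PR description.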

### Upload failure handling

If the textarea is empty after the wait, check:
1. Session validity (did GitHub redirect to login?)
2. Wait time (processing can be slow under load -- retry after 3-5 more seconds)
3. File size (10MB on free accounts, 100MB on paid accounts)

Do not silently substitute a release asset URL. Report the failure and offer to retry.
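
The "retry after a few more seconds" step can be sketched as a bounded polling loop. `poll_nonempty` and its arguments are hypothetical; the reader command is supplied by the caller and is expected to print the current textarea value.

```bash
# Call a reader command until it prints something, up to a fixed number of
# attempts. Prints the value and exits 0 on success, exits 1 when every
# attempt came back empty.
#   $1 = name of a command that prints the textarea value
#   $2 = maximum attempts, $3 = seconds to sleep between attempts
poll_nonempty() {
  reader=$1
  attempts=$2
  delay=$3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    value=$("$reader")
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A reader could be a small function wrapping the `agent-browser eval` call from the upload flow; if the installed version prints an empty string as `""`, the reader would need to strip the quotes first. When the loop exits non-zero, the remaining checks (session validity, file size) apply before reporting the failure.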

### DOM selector fragility

`#fc-new_comment_field` and `#new_comment_field` are GitHub's internal element IDs and
may change in future UI updates. If the upload stops working, snapshot the PR page and
inspect the current comment form structure for updated selectors.

### Size limits

- Free accounts: 10MB per file
- Paid (Pro, Team, Enterprise): 100MB per file

Check file size before attempting upload. Re-encode at lower quality if needed.
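
A pre-upload size check could be as small as the sketch below. `fits_upload_limit` is a hypothetical helper, and counting 1 MB as 1024 * 1024 bytes is an assumption; the limits above are only stated as "10MB" and "100MB".

```bash
# Succeed when the file is no larger than the given limit in megabytes.
# Exits 2 when the file does not exist.
#   $1 = path to the video, $2 = limit in MB (10 or 100 per the list above)
fits_upload_limit() {
  file=$1
  limit_mb=$2
  [ -f "$file" ] || return 2
  # tr strips the padding some wc implementations add around the count.
  size=$(wc -c < "$file" | tr -d ' ')
  [ "$size" -le $((limit_mb * 1024 * 1024)) ]
}
```

For example, `fits_upload_limit tmp/videos/feature-demo.mp4 10 || echo "re-encode at lower quality"` guards the upload for a free account.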

## References

- GitHub CLI issues: cli/cli#1895, #4228, #4465 (all closed "not planned")
- `tonkotsuboy/github-upload-image-to-pr` -- reference implementation
- GitHub Community Discussions: #29993, #46951, #28219