[2.14.0] Add /playwright-test command for browser testing

- New `/playwright-test` command for end-to-end browser tests on PR-affected pages
- Uses Playwright MCP to navigate, snapshot, check console errors
- Supports human-in-the-loop for OAuth/email/payment flows
- Creates P1 todos for failures and retries until passing
- Added Section 7 to `/workflows:review` - optional Playwright testing as subagent

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-18 10:26:43 -08:00
parent d8ea046bd9
commit f619e261c4
5 changed files with 292 additions and 4 deletions
--- a/plugins/compound-engineering/.claude-plugin/plugin.json
+++ b/plugins/compound-engineering/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
  "name": "compound-engineering",
-  "version": "2.13.0",
-  "description": "AI-powered development tools. 27 agents, 17 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
+  "version": "2.14.0",
+  "description": "AI-powered development tools. 27 agents, 18 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
  "author": {
    "name": "Kieran Klaassen",
    "email": "kieran@every.to",
--- a/plugins/compound-engineering/CHANGELOG.md
+++ b/plugins/compound-engineering/CHANGELOG.md
@@ -5,6 +5,16 @@ All notable changes to the compound-engineering plugin will be documented in thi
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [2.14.0] - 2025-12-18
+
+### Added
+
+- **`/playwright-test` command** - Run end-to-end browser tests on pages affected by a PR or branch. Uses Playwright MCP to navigate pages, capture snapshots, check console errors, test interactions, and pause for human verification on OAuth/email/payment flows. Creates P1 todos for failures and retries until passing.
+
+### Changed
+
+- **`/workflows:review` command** - Added optional Playwright testing phase (Section 7). After review agents complete, offers to spawn `/playwright-test` as a subagent to verify affected pages in a real browser.
+
 ## [2.13.0] - 2025-12-15

 ### Added
--- a/plugins/compound-engineering/README.md
+++ b/plugins/compound-engineering/README.md
@@ -7,8 +7,8 @@ AI-powered development tools that get smarter with every use. Make each unit of
 | Component | Count |
 |-----------|-------|
 | Agents | 27 |
-| Commands | 17 |
-| Skills | 12 |
+| Commands | 18 |
+| Skills | 13 |
 | MCP Servers | 2 |

 ## Agents
@@ -95,6 +95,7 @@ Core workflow commands use `workflows:` prefix to avoid collisions with built-in
 | `/resolve_pr_parallel` | Resolve PR comments in parallel |
 | `/resolve_todo_parallel` | Resolve todos in parallel |
 | `/triage` | Triage and prioritize issues |
+| `/playwright-test` | Run browser tests on PR-affected pages |

 ## Skills

--- a/plugins/compound-engineering/commands/playwright-test.md
+++ b/plugins/compound-engineering/commands/playwright-test.md
@@ -0,0 +1,248 @@
+---
+name: playwright-test
+description: Run Playwright browser tests on pages affected by current PR or branch
+argument-hint: "[PR number, branch name, or 'current' for current branch]"
+---
+
+# Playwright Test Command
+
+<command_purpose>Run end-to-end browser tests on pages affected by a PR or branch using Playwright MCP.</command_purpose>
+
+## Introduction
+
+<role>QA Engineer specializing in browser-based end-to-end testing</role>
+
+This command tests affected pages in a real browser, catching issues that unit tests miss:
+- JavaScript integration bugs
+- CSS/layout regressions
+- User workflow breakages
+- Console errors
+
+## Prerequisites
+
+<requirements>
+- Local development server running (e.g., `bin/dev`, `rails server`)
+- Playwright MCP server connected
+- Git repository with changes to test
+</requirements>
+
+## Main Tasks
+
+### 1. Determine Test Scope
+
+<test_target> $ARGUMENTS </test_target>
+
+<determine_scope>
+
+**If PR number provided:**
+```bash
+gh pr view [number] --json files -q '.files[].path'
+```
+
+**If 'current' or empty:**
+```bash
+git diff --name-only main...HEAD
+```
+
+**If branch name provided:**
+```bash
+git diff --name-only main...[branch]
+```
+
+</determine_scope>
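The three cases above can be folded into one dispatch step. The following is a minimal sketch, not part of the command file: `scope_command` is a hypothetical helper name, and it assumes that a digits-only argument is a PR number and that the base branch is `main`, as in the commands above.

```shell
# Hypothetical helper (not part of the command file): print the command that
# lists changed files for a given /playwright-test argument.
scope_command() {
  target="${1:-current}"   # an empty or missing argument means the current branch
  case "$target" in
    current)
      echo "git diff --name-only main...HEAD" ;;
    *[!0-9]*)
      # Contains a non-digit character: treat it as a branch name.
      echo "git diff --name-only main...$target" ;;
    *)
      # Digits only: treat it as a PR number.
      echo "gh pr view $target --json files -q '.files[].path'" ;;
  esac
}
```

For example, `scope_command feature/new-dashboard` prints the `git diff` form, while `scope_command 847` prints the `gh pr view` form. A branch whose name is purely numeric would be misread as a PR number under this assumption.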
+
+### 2. Map Files to Routes
+
+<file_to_route_mapping>
+
+Map changed files to testable routes:
+
+| File Pattern | Route(s) |
+|-------------|----------|
+| `app/views/users/*` | `/users`, `/users/:id`, `/users/new` |
+| `app/controllers/settings_controller.rb` | `/settings` |
+| `app/javascript/controllers/*_controller.js` | Pages using that Stimulus controller |
+| `app/components/*_component.rb` | Pages rendering that component |
+| `app/views/layouts/*` | All pages (test homepage at minimum) |
+| `app/assets/stylesheets/*` | Visual regression on key pages |
+| `app/helpers/*_helper.rb` | Pages using that helper |
+
+Build a list of URLs to test based on the mapping.
+
+</file_to_route_mapping>
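As a rough illustration of the mapping table, the sketch below covers only the rows that name concrete routes. The helper names `routes_for_file` and `routes_for_changes` are hypothetical, and mapping layout changes to `/` is an assumption based on "test homepage at minimum". Rows that depend on the application (Stimulus controllers, components, helpers, stylesheets) are reported on stderr for manual mapping rather than guessed.

```shell
# Hypothetical helper: print the routes to test for one changed file.
routes_for_file() {
  case "$1" in
    app/views/users/*)
      printf '%s\n' /users /users/:id /users/new ;;
    app/controllers/settings_controller.rb)
      printf '%s\n' /settings ;;
    app/views/layouts/*)
      # Assumption: the homepage stands in for "all pages".
      printf '%s\n' / ;;
    *)
      # No fixed route in the table: flag the file instead of guessing.
      echo "needs manual mapping: $1" >&2 ;;
  esac
}

# Read changed file paths on stdin and print a de-duplicated route list.
routes_for_changes() {
  while IFS= read -r file; do
    routes_for_file "$file"
  done | LC_ALL=C sort -u
}
```

A possible use is `git diff --name-only main...HEAD | routes_for_changes`, which yields each route once even when several changed files map to it.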
+
+### 3. Verify Server is Running
+
+<check_server>
+
+Before testing, verify the local server is accessible:
+
+```
+mcp__playwright__browser_navigate({ url: "http://localhost:3000" })
+mcp__playwright__browser_snapshot({})
+```
+
+If the server is not running, inform the user:
+```markdown
+**Server not running**
+
+Please start your development server:
+- Rails: `bin/dev` or `rails server`
+- Node: `npm run dev`
+
+Then run `/playwright-test` again.
+```
+
+</check_server>
+
+### 4. Test Each Affected Page
+
+<test_pages>
+
+For each affected route:
+
+**Step 1: Navigate and capture snapshot**
+```
+mcp__playwright__browser_navigate({ url: "http://localhost:3000/[route]" })
+mcp__playwright__browser_snapshot({})
+```
+
+**Step 2: Check for errors**
+```
+mcp__playwright__browser_console_messages({ level: "error" })
+```
+
+**Step 3: Verify key elements**
+- Page title/heading present
+- Primary content rendered
+- No error messages visible
+- Forms have expected fields
+
+**Step 4: Test critical interactions (if applicable)**
+```
+mcp__playwright__browser_click({ element: "[description]", ref: "[ref]" })
+mcp__playwright__browser_snapshot({})
+```
+
+</test_pages>
+
+### 5. Human Verification (When Required)
+
+<human_verification>
+
+Pause for human input when testing touches:
+
+| Flow Type | What to Ask |
+|-----------|-------------|
+| OAuth | "Please sign in with [provider] and confirm it works" |
+| Email | "Check your inbox for the test email and confirm receipt" |
+| Payments | "Complete a test purchase in sandbox mode" |
+| SMS | "Verify you received the SMS code" |
+| External APIs | "Confirm the [service] integration is working" |
+
+Use AskUserQuestion:
+```markdown
+**Human Verification Needed**
+
+This test touches the [flow type]. Please:
+1. [Action to take]
+2. [What to verify]
+
+Did it work correctly?
+1. Yes - continue testing
+2. No - describe the issue
+```
+
+</human_verification>
+
+### 6. Handle Failures
+
+<failure_handling>
+
+When a test fails:
+
+1. **Document the failure:**
+   - Screenshot the error state
+   - Capture console errors
+   - Note the exact reproduction steps
+
+2. **Ask user how to proceed:**
+   ```markdown
+   **Test Failed: [route]**
+
+   Issue: [description]
+   Console errors: [if any]
+
+   How to proceed?
+   1. Fix now - I'll help debug and fix
+   2. Create todo - Add to todos/ for later
+   3. Skip - Continue testing other pages
+   ```
+
+3. **If "Fix now":**
+   - Investigate the issue
+   - Propose a fix
+   - Apply fix
+   - Re-run the failing test
+
+4. **If "Create todo":**
+   - Create `{id}-pending-p1-playwright-{description}.md`
+   - Continue testing
+
+5. **If "Skip":**
+   - Log as skipped
+   - Continue testing
+
+</failure_handling>
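The todo filename pattern `{id}-pending-p1-playwright-{description}.md` can be sketched as a small helper. `todo_filename` is a hypothetical name, and the three-digit zero padding and hyphenated lowercase description are assumptions inferred from the sample name `005-pending-p1-playwright-dashboard-error.md` in the summary template.

```shell
# Hypothetical helper: build a todo filename from a numeric id and a description.
# Pass the id as a plain integer (5, not 005) so printf does not read it as octal.
todo_filename() {
  # Lowercase the description and collapse runs of other characters into hyphens.
  slug=$(printf '%s' "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  printf '%03d-pending-p1-playwright-%s.md\n' "$1" "$slug"
}
```

For instance, `todo_filename 5 "Dashboard error"` produces `005-pending-p1-playwright-dashboard-error.md`.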
+
+### 7. Test Summary
+
+<test_summary>
+
+After all tests complete, present a summary:
+
+```markdown
+## 🎭 Playwright Test Results
+
+**Test Scope:** PR #[number] / [branch name]
+**Server:** http://localhost:3000
+
+### Pages Tested: [count]
+
+| Route | Status | Notes |
+|-------|--------|-------|
+| `/users` | ✅ Pass | |
+| `/settings` | ✅ Pass | |
+| `/dashboard` | ❌ Fail | Console error: [msg] |
+| `/checkout` | ⏭️ Skip | Requires payment credentials |
+
+### Console Errors: [count]
+- [List any errors found]
+
+### Human Verifications: [count]
+- OAuth flow: ✅ Confirmed
+- Email delivery: ✅ Confirmed
+
+### Failures: [count]
+- `/dashboard` - [issue description]
+
+### Created Todos: [count]
+- `005-pending-p1-playwright-dashboard-error.md`
+
+### Result: [PASS / FAIL / PARTIAL]
+```
+
+</test_summary>
+
+## Quick Usage Examples
+
+```bash
+# Test current branch changes
+/playwright-test
+
+# Test specific PR
+/playwright-test 847
+
+# Test specific branch
+/playwright-test feature/new-dashboard
+```
--- a/plugins/compound-engineering/commands/workflows/review.md
+++ b/plugins/compound-engineering/commands/workflows/review.md
@@ -425,6 +425,35 @@ After creating all todo files, present comprehensive summary:

 ```

+### 7. Playwright Testing (Optional)
+
+<offer_testing>
+After presenting the Summary Report, ask the user:
+
+**"Want to run Playwright tests on the affected pages?"**
+1. Yes - run `/playwright-test`
+2. No - skip to next steps
+</offer_testing>
+
+#### If User Accepts:
+
+Spawn a subagent to run the tests (preserves main context):
+
+```
+Task general-purpose("Run /playwright-test for PR #[number]. Test all affected pages, check for console errors, handle failures by creating todos and fixing.")
+```
+
+The subagent will:
+1. Identify pages affected by the PR
+2. Navigate to each page and capture snapshots
+3. Check for console errors
+4. Test critical interactions
+5. Pause for human verification on OAuth/email/payment flows
+6. Create P1 todos for any failures
+7. Fix and retry until all tests pass
+
+**Alternatively**, the user can run the command standalone: `/playwright-test [PR number]`
+
 ### Important: P1 Findings Block Merge

 Any **🔴 P1 (CRITICAL)** findings must be addressed before merging the PR. Present these prominently and ensure they're resolved before accepting the PR.