diff --git a/plugins/compound-engineering/AGENTS.md b/plugins/compound-engineering/AGENTS.md index f778ddd..f393b5b 100644 --- a/plugins/compound-engineering/AGENTS.md +++ b/plugins/compound-engineering/AGENTS.md @@ -84,6 +84,18 @@ When adding or modifying skills, verify compliance with the skill spec: - [ ] When a skill needs to ask the user a question, instruct use of the platform's blocking question tool and name the known equivalents (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini) - [ ] Include a fallback for environments without a question tool (e.g., present numbered options and wait for the user's reply before proceeding) +### Cross-Platform Task Tracking + +- [ ] When a skill needs to create or track tasks, describe the intent (e.g., "create a task list") and name the known equivalents (`TaskCreate`/`TaskUpdate`/`TaskList` in Claude Code, `update_plan` in Codex) +- [ ] Do not reference `TodoWrite` or `TodoRead` — these are legacy Claude Code tools replaced by `TaskCreate`/`TaskUpdate`/`TaskList` +- [ ] When a skill dispatches sub-agents, prefer parallel execution but include a sequential fallback for platforms that do not support parallel dispatch + +### Script Path References in Skills + +- [ ] In bash code blocks, reference co-located scripts using relative paths (e.g., `bash scripts/my-script ARG`) — not `${CLAUDE_PLUGIN_ROOT}` or other platform-specific variables +- [ ] All platforms resolve script paths relative to the skill's directory; no env var prefix is needed +- [ ] Always also include a markdown link to the script (e.g., `[scripts/my-script](scripts/my-script)`) so the agent can locate and read it + ### Cross-Platform Reference Rules This plugin is authored once, then converted for other agent platforms. Commands and agents are transformed during that conversion, but `plugin.skills` are usually copied almost exactly as written. 
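The relative-path rule in the checklist above can be sketched end to end. This is a hypothetical layout ("my-skill", "my-script", and the argument are illustrative names, not from the plugin); it models the stated behavior that every platform resolves script paths relative to the skill's directory, so no `${CLAUDE_PLUGIN_ROOT}`-style prefix is needed:

```shell
# Hypothetical skill layout; "my-skill" and "my-script" are illustrative names.
demo="$(mktemp -d)/my-skill"
mkdir -p "$demo/scripts"
printf '#!/bin/sh\necho "resolved relative to skill dir, arg: $1"\n' > "$demo/scripts/my-script"
chmod +x "$demo/scripts/my-script"

# Platforms execute skill steps relative to the skill's own directory,
# so a SKILL.md bash block can use a bare relative path:
cd "$demo"
bash scripts/my-script PR_123   # no platform-specific env-var prefix needed
```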
diff --git a/plugins/compound-engineering/README.md b/plugins/compound-engineering/README.md index 46348a6..10c4b5f 100644 --- a/plugins/compound-engineering/README.md +++ b/plugins/compound-engineering/README.md @@ -94,16 +94,15 @@ Core workflow commands use `ce:` prefix to unambiguously identify them as compou | `/changelog` | Create engaging changelogs for recent merges | | `/create-agent-skill` | Create or edit Claude Code skills | | `/generate_command` | Generate new slash commands | -| `/heal-skill` | Fix skill documentation issues | | `/sync` | Sync Claude Code config across machines | -| `/report-bug` | Report a bug in the plugin | +| `/report-bug-ce` | Report a bug in the compound-engineering plugin | | `/reproduce-bug` | Reproduce bugs using logs and console | | `/resolve_parallel` | Resolve TODO comments in parallel | | `/resolve_pr_parallel` | Resolve PR comments in parallel | | `/resolve-todo-parallel` | Resolve todos in parallel | | `/triage` | Triage and prioritize issues | | `/test-browser` | Run browser tests on PR-affected pages | -| `/xcode-test` | Build and test iOS apps on simulator | +| `/test-xcode` | Build and test iOS apps on simulator | | `/feature-video` | Record video walkthroughs and add to PR description | ## Skills diff --git a/plugins/compound-engineering/skills/heal-skill/SKILL.md b/plugins/compound-engineering/skills/heal-skill/SKILL.md deleted file mode 100644 index a021f31..0000000 --- a/plugins/compound-engineering/skills/heal-skill/SKILL.md +++ /dev/null @@ -1,143 +0,0 @@ ---- -name: heal-skill -description: Fix incorrect SKILL.md files when a skill has wrong instructions or outdated API references -argument-hint: "[optional: specific issue to fix]" -allowed-tools: [Read, Edit, Bash(ls:*), Bash(git:*)] -disable-model-invocation: true ---- - - -Update a skill's SKILL.md and related files based on corrections discovered during execution. 
- -Analyze the conversation to detect which skill is running, reflect on what went wrong, propose specific fixes, get user approval, then apply changes with optional commit. - - - -Skill detection: !`ls -1 ./skills/*/SKILL.md | head -5` - - - - -1. **Detect skill** from conversation context (invocation messages, recent SKILL.md references) -2. **Reflect** on what went wrong and how you discovered the fix -3. **Present** proposed changes with before/after diffs -4. **Get approval** before making any edits -5. **Apply** changes and optionally commit - - - - - -Identify the skill from conversation context: - -- Look for skill invocation messages -- Check which SKILL.md was recently referenced -- Examine current task context - -Set: `SKILL_NAME=[skill-name]` and `SKILL_DIR=./skills/$SKILL_NAME` - -If unclear, ask the user. - - - -Focus on $ARGUMENTS if provided, otherwise analyze broader context. - -Determine: -- **What was wrong**: Quote specific sections from SKILL.md that are incorrect -- **Discovery method**: Context7, error messages, trial and error, documentation lookup -- **Root cause**: Outdated API, incorrect parameters, wrong endpoint, missing context -- **Scope of impact**: Single section or multiple? Related files affected? 
-- **Proposed fix**: Which files, which sections, before/after for each - - - -```bash -ls -la $SKILL_DIR/ -ls -la $SKILL_DIR/references/ 2>/dev/null -ls -la $SKILL_DIR/scripts/ 2>/dev/null -``` - - - -Present changes in this format: - -``` -**Skill being healed:** [skill-name] -**Issue discovered:** [1-2 sentence summary] -**Root cause:** [brief explanation] - -**Files to be modified:** -- [ ] SKILL.md -- [ ] references/[file].md -- [ ] scripts/[file].py - -**Proposed changes:** - -### Change 1: SKILL.md - [Section name] -**Location:** Line [X] in SKILL.md - -**Current (incorrect):** -``` -[exact text from current file] -``` - -**Corrected:** -``` -[new text] -``` - -**Reason:** [why this fixes the issue] - -[repeat for each change across all files] - -**Impact assessment:** -- Affects: [authentication/API endpoints/parameters/examples/etc.] - -**Verification:** -These changes will prevent: [specific error that prompted this] -``` - - - -``` -Should I apply these changes? - -1. Yes, apply and commit all changes -2. Apply but don't commit (let me review first) -3. Revise the changes (I'll provide feedback) -4. Cancel (don't make changes) - -Choose (1-4): -``` - -**Wait for user response. Do not proceed without approval.** - - - -Only after approval (option 1 or 2): - -1. Use Edit tool for each correction across all files -2. Read back modified sections to verify -3. If option 1, commit with structured message showing what was healed -4. 
Confirm completion with file list - - - - -- Skill correctly detected from conversation context -- All incorrect sections identified with before/after -- User approved changes before application -- All edits applied across SKILL.md and related files -- Changes verified by reading back -- Commit created if user chose option 1 -- Completion confirmed with file list - - - -Before completing: - -- Read back each modified section to confirm changes applied -- Ensure cross-file consistency (SKILL.md examples match references/) -- Verify git commit created if option 1 was selected -- Check no unintended files were modified - diff --git a/plugins/compound-engineering/skills/report-bug/SKILL.md b/plugins/compound-engineering/skills/report-bug-ce/SKILL.md similarity index 64% rename from plugins/compound-engineering/skills/report-bug/SKILL.md rename to plugins/compound-engineering/skills/report-bug-ce/SKILL.md index 2e7ba48..3da76e6 100644 --- a/plugins/compound-engineering/skills/report-bug/SKILL.md +++ b/plugins/compound-engineering/skills/report-bug-ce/SKILL.md @@ -1,17 +1,17 @@ --- -name: report-bug +name: report-bug-ce description: Report a bug in the compound-engineering plugin argument-hint: "[optional: brief description of the bug]" disable-model-invocation: true --- -# Report a Compounding Engineering Plugin Bug +# Report a Compound Engineering Plugin Bug -Report bugs encountered while using the compound-engineering plugin. This command gathers structured information and creates a GitHub issue for the maintainer. +Report bugs encountered while using the compound-engineering plugin. This skill gathers structured information and creates a GitHub issue for the maintainer. 
## Step 1: Gather Bug Information -Use the AskUserQuestion tool to collect the following information: +Ask the user the following questions (using the platform's blocking question tool — e.g., `AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini — or present numbered options and wait for a reply): **Question 1: Bug Category** - What type of issue are you experiencing? @@ -39,18 +39,25 @@ Use the AskUserQuestion tool to collect the following information: ## Step 2: Collect Environment Information -Automatically gather: +Automatically gather environment details. Detect the coding agent platform and collect what is available: + +**OS info (all platforms):** ```bash -# Get plugin version -cat ~/.claude/plugins/installed_plugins.json 2>/dev/null | grep -A5 "compound-engineering" | head -10 || echo "Plugin info not found" - -# Get Claude Code version -claude --version 2>/dev/null || echo "Claude CLI version unknown" - -# Get OS info uname -a ``` +**Plugin version:** Read the plugin manifest or installed plugin metadata. Common locations: +- Claude Code: `~/.claude/plugins/installed_plugins.json` +- Codex: `.codex/plugins/` or project config +- Other platforms: check the platform's plugin registry + +**Agent CLI version:** Run the platform's version command: +- Claude Code: `claude --version` +- Codex: `codex --version` +- Other platforms: use the appropriate CLI version flag + +If any of these fail, note "unknown" and continue — do not block the report. 
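The gather-what-you-can rule above can be sketched as a single snippet; the version commands for agents other than the one actually installed are assumptions, and any failure simply degrades to "unknown" rather than blocking the report:

```shell
# Collect environment details without ever blocking the report.
# A missing or failing command degrades to "unknown".
OS_INFO="$(uname -a 2>/dev/null || echo unknown)"
AGENT_VERSION="$( { claude --version || codex --version; } 2>/dev/null || echo unknown)"
echo "OS: ${OS_INFO}"
echo "Agent: ${AGENT_VERSION}"
```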
+ ## Step 3: Format the Bug Report Create a well-structured bug report with: @@ -63,8 +70,9 @@ Create a well-structured bug report with: ## Environment -- **Plugin Version:** [from installed_plugins.json] -- **Claude Code Version:** [from claude --version] +- **Plugin Version:** [from plugin manifest/registry] +- **Agent Platform:** [e.g., Claude Code, Codex, Copilot, Pi, Kilo] +- **Agent Version:** [from CLI version command] - **OS:** [from uname] ## What Happened @@ -83,16 +91,14 @@ Create a well-structured bug report with: ## Error Messages -``` [Any error output] -``` ## Additional Context [Any other relevant information] --- -*Reported via `/report-bug` command* +*Reported via `/report-bug-ce` skill* ``` ## Step 4: Create GitHub Issue @@ -125,7 +131,7 @@ After the issue is created: ## Output Format ``` -✅ Bug report submitted successfully! +Bug report submitted successfully! Issue: https://github.com/EveryInc/compound-engineering-plugin/issues/[NUMBER] Title: [compound-engineering] Bug: [description] @@ -136,16 +142,16 @@ The maintainer will review your report and respond as soon as possible. ## Error Handling -- If `gh` CLI is not authenticated: Prompt user to run `gh auth login` first -- If issue creation fails: Display the formatted report so user can manually create the issue -- If required information is missing: Re-prompt for that specific field +- If `gh` CLI is not installed or not authenticated: prompt the user to install/authenticate first +- If issue creation fails: display the formatted report so the user can manually create the issue +- If required information is missing: re-prompt for that specific field ## Privacy Notice -This command does NOT collect: +This skill does NOT collect: - Personal information - API keys or credentials -- Private code from your projects +- Private code from projects - File paths beyond basic OS info Only technical information about the bug is included in the report. 
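The `gh` error-handling rules above amount to a precondition gate before issue creation. A minimal sketch (the repo slug comes from the issue URL shown in the output format; the report file path and contents are illustrative):

```shell
# Check gh is installed AND authenticated before attempting to file the
# issue; otherwise fall back to printing the report for manual filing.
report_file="$(mktemp)"
printf '## Summary\n[bug description here]\n' > "$report_file"

if command -v gh >/dev/null 2>&1 && gh auth status >/dev/null 2>&1; then
  echo "gh ready: file with 'gh issue create --repo EveryInc/compound-engineering-plugin --body-file $report_file'"
else
  echo "gh unavailable: create the issue manually from the report below"
  cat "$report_file"
fi
```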
diff --git a/plugins/compound-engineering/skills/resolve-pr-parallel/SKILL.md b/plugins/compound-engineering/skills/resolve-pr-parallel/SKILL.md index e040fba..36f60fd 100644 --- a/plugins/compound-engineering/skills/resolve-pr-parallel/SKILL.md +++ b/plugins/compound-engineering/skills/resolve-pr-parallel/SKILL.md @@ -12,7 +12,7 @@ Resolve all unresolved PR review comments by spawning parallel agents for each t ## Context Detection -Claude Code automatically detects git context: +Detect git context from the current working directory: - Current branch and associated PR - All PR comments and review threads - Works with any PR by specifying the number @@ -21,10 +21,10 @@ Claude Code automatically detects git context: ### 1. Analyze -Fetch unresolved review threads using the GraphQL script: +Fetch unresolved review threads using the GraphQL script at [scripts/get-pr-comments](scripts/get-pr-comments): ```bash -bash ${CLAUDE_PLUGIN_ROOT}/skills/resolve-pr-parallel/scripts/get-pr-comments PR_NUMBER +bash scripts/get-pr-comments PR_NUMBER ``` This returns only **unresolved, non-outdated** threads with file paths, line numbers, and comment bodies. @@ -37,7 +37,7 @@ gh api repos/{owner}/{repo}/pulls/PR_NUMBER/comments ### 2. Plan -Create a TodoWrite list of all unresolved items grouped by type: +Create a task list of all unresolved items grouped by type (e.g., `TaskCreate` in Claude Code, `update_plan` in Codex): - Code changes requested - Questions to answer - Style/convention fixes @@ -45,23 +45,17 @@ Create a TodoWrite list of all unresolved items grouped by type: ### 3. Implement (PARALLEL) -Spawn a `pr-comment-resolver` agent for each unresolved item in parallel. +Spawn a `compound-engineering:workflow:pr-comment-resolver` agent for each unresolved item. -If there are 3 comments, spawn 3 agents: - -1. Task pr-comment-resolver(comment1) -2. Task pr-comment-resolver(comment2) -3. 
Task pr-comment-resolver(comment3) - -Always run all in parallel subagents/Tasks for each Todo item. +If there are 3 comments, spawn 3 agents — one per comment. Prefer running all agents in parallel; if the platform does not support parallel dispatch, run them sequentially. ### 4. Commit & Resolve - Commit changes with a clear message referencing the PR feedback -- Resolve each thread programmatically: +- Resolve each thread programmatically using [scripts/resolve-pr-thread](scripts/resolve-pr-thread): ```bash -bash ${CLAUDE_PLUGIN_ROOT}/skills/resolve-pr-parallel/scripts/resolve-pr-thread THREAD_ID +bash scripts/resolve-pr-thread THREAD_ID ``` - Push to remote @@ -71,7 +65,7 @@ bash ${CLAUDE_PLUGIN_ROOT}/skills/resolve-pr-parallel/scripts/resolve-pr-thread Re-fetch comments to confirm all threads are resolved: ```bash -bash ${CLAUDE_PLUGIN_ROOT}/skills/resolve-pr-parallel/scripts/get-pr-comments PR_NUMBER +bash scripts/get-pr-comments PR_NUMBER ``` Should return an empty array `[]`. If threads remain, repeat from step 1. diff --git a/plugins/compound-engineering/skills/resolve-todo-parallel/SKILL.md b/plugins/compound-engineering/skills/resolve-todo-parallel/SKILL.md index bd7a660..07f238b 100644 --- a/plugins/compound-engineering/skills/resolve-todo-parallel/SKILL.md +++ b/plugins/compound-engineering/skills/resolve-todo-parallel/SKILL.md @@ -16,19 +16,13 @@ If any todo recommends deleting, removing, or gitignoring files in `docs/brainst ### 2. Plan -Create a TodoWrite list of all unresolved items grouped by type. Make sure to look at dependencies that might occur and prioritize the ones needed by others. For example, if you need to change a name, you must wait to do the others. Output a mermaid flow diagram showing how we can do this. Can we do everything in parallel? Do we need to do one first that leads to others in parallel? I'll put the to-dos in the mermaid diagram flow-wise so the agent knows how to proceed in order. 
+Create a task list of all unresolved items grouped by type (e.g., `TaskCreate` in Claude Code, `update_plan` in Codex). Analyze dependencies and prioritize items that others depend on. For example, if a rename is needed, it must complete before dependent items. Output a mermaid flow diagram showing execution order — what can run in parallel, and what must run first. ### 3. Implement (PARALLEL) -Spawn a pr-comment-resolver agent for each unresolved item in parallel. +Spawn a `compound-engineering:workflow:pr-comment-resolver` agent for each unresolved item. -So if there are 3 comments, it will spawn 3 pr-comment-resolver agents in parallel. Like this: - -1. Task pr-comment-resolver(comment1) -2. Task pr-comment-resolver(comment2) -3. Task pr-comment-resolver(comment3) - -Always run all in parallel subagents/Tasks for each Todo item. +If there are 3 items, spawn 3 agents — one per item. Prefer running all agents in parallel; if the platform does not support parallel dispatch, run them sequentially respecting the dependency order from step 2. ### 4. Commit & Resolve @@ -40,7 +34,7 @@ GATE: STOP. Verify that todos have been resolved and changes committed. Do NOT p ### 5. Compound on Lessons Learned -Run the `ce:compound` skill to document what was learned from resolving the todos. +Load the `ce:compound` skill to document what was learned from resolving the todos. The todo resolutions often surface patterns, recurring issues, or architectural insights worth capturing. This step ensures that knowledge compounds rather than being lost. diff --git a/plugins/compound-engineering/skills/setup/SKILL.md b/plugins/compound-engineering/skills/setup/SKILL.md index 73fc0fb..189995f 100644 --- a/plugins/compound-engineering/skills/setup/SKILL.md +++ b/plugins/compound-engineering/skills/setup/SKILL.md @@ -8,26 +8,20 @@ disable-model-invocation: true ## Interaction Method -If `AskUserQuestion` is available, use it for all prompts below. 
+Ask the user each question below using the platform's blocking question tool (e.g., `AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no structured question tool is available, present each question as a numbered list and wait for a reply before proceeding. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure. -If not, present each question as a numbered list and wait for a reply before proceeding to the next step. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure. - -Interactive setup for `compound-engineering.local.md` — configures which agents run during `/ce:review` and `/ce:work`. +Interactive setup for `compound-engineering.local.md` — configures which agents run during `ce:review` and `ce:work`. ## Step 1: Check Existing Config -Read `compound-engineering.local.md` in the project root. If it exists, display current settings summary and use AskUserQuestion: +Read `compound-engineering.local.md` in the project root. If it exists, display current settings and ask: ``` -question: "Settings file already exists. What would you like to do?" -header: "Config" -options: - - label: "Reconfigure" - description: "Run the interactive setup again from scratch" - - label: "View current" - description: "Show the file contents, then stop" - - label: "Cancel" - description: "Keep current settings" +Settings file already exists. What would you like to do? + +1. Reconfigure - Run the interactive setup again from scratch +2. View current - Show the file contents, then stop +3. Cancel - Keep current settings ``` If "View current": read and display the file, then stop. @@ -47,16 +41,13 @@ test -f requirements.txt && echo "python" || \ echo "general" ``` -Use AskUserQuestion: +Ask: ``` -question: "Detected {type} project. How would you like to configure?" 
-header: "Setup" -options: - - label: "Auto-configure (Recommended)" - description: "Use smart defaults for {type}. Done in one click." - - label: "Customize" - description: "Choose stack, focus areas, and review depth." +Detected {type} project. How would you like to configure? + +1. Auto-configure (Recommended) - Use smart defaults for {type}. Done in one click. +2. Customize - Choose stack, focus areas, and review depth. ``` ### If Auto-configure → Skip to Step 4 with defaults: @@ -73,50 +64,35 @@ options: **a. Stack** — confirm or override: ``` -question: "Which stack should we optimize for?" -header: "Stack" -options: - - label: "{detected_type} (Recommended)" - description: "Auto-detected from project files" - - label: "Rails" - description: "Ruby on Rails — adds DHH-style and Rails-specific reviewers" - - label: "Python" - description: "Python — adds Pythonic pattern reviewer" - - label: "TypeScript" - description: "TypeScript — adds type safety reviewer" +Which stack should we optimize for? + +1. {detected_type} (Recommended) - Auto-detected from project files +2. Rails - Ruby on Rails, adds DHH-style and Rails-specific reviewers +3. Python - Adds Pythonic pattern reviewer +4. TypeScript - Adds type safety reviewer ``` Only show options that differ from the detected type. -**b. Focus areas** — multiSelect: +**b. Focus areas** — multiSelect (user picks one or more): ``` -question: "Which review areas matter most?" -header: "Focus" -multiSelect: true -options: - - label: "Security" - description: "Vulnerability scanning, auth, input validation (security-sentinel)" - - label: "Performance" - description: "N+1 queries, memory leaks, complexity (performance-oracle)" - - label: "Architecture" - description: "Design patterns, SOLID, separation of concerns (architecture-strategist)" - - label: "Code simplicity" - description: "Over-engineering, YAGNI violations (code-simplicity-reviewer)" +Which review areas matter most? (comma-separated, e.g. 1, 3) + +1. 
Security - Vulnerability scanning, auth, input validation (security-sentinel) +2. Performance - N+1 queries, memory leaks, complexity (performance-oracle) +3. Architecture - Design patterns, SOLID, separation of concerns (architecture-strategist) +4. Code simplicity - Over-engineering, YAGNI violations (code-simplicity-reviewer) ``` **c. Depth:** ``` -question: "How thorough should reviews be?" -header: "Depth" -options: - - label: "Thorough (Recommended)" - description: "Stack reviewers + all selected focus agents." - - label: "Fast" - description: "Stack reviewers + code simplicity only. Less context, quicker." - - label: "Comprehensive" - description: "All above + git history, data integrity, agent-native checks." +How thorough should reviews be? + +1. Thorough (Recommended) - Stack reviewers + all selected focus agents. +2. Fast - Stack reviewers + code simplicity only. Less context, quicker. +3. Comprehensive - All above + git history, data integrity, agent-native checks. ``` ## Step 4: Build Agent List and Write File @@ -151,7 +127,7 @@ plan_review_agents: [{computed plan agent list}] # Review Context Add project-specific review instructions here. -These notes are passed to all review agents during /ce:review and /ce:work. +These notes are passed to all review agents during ce:review and ce:work. Examples: - "We use Turbo Frames heavily — check for frame-busting issues" diff --git a/plugins/compound-engineering/skills/test-browser/SKILL.md b/plugins/compound-engineering/skills/test-browser/SKILL.md index be25139..a32a29e 100644 --- a/plugins/compound-engineering/skills/test-browser/SKILL.md +++ b/plugins/compound-engineering/skills/test-browser/SKILL.md @@ -4,56 +4,45 @@ description: Run browser tests on pages affected by current PR or branch argument-hint: "[PR number, branch name, 'current', or --port PORT]" --- -# Browser Test Command +# Browser Test Skill -Run end-to-end browser tests on pages affected by a PR or branch changes using agent-browser CLI. 
+Run end-to-end browser tests on pages affected by a PR or branch changes using the `agent-browser` CLI. -## CRITICAL: Use agent-browser CLI Only +## Use `agent-browser` Only for Browser Automation -**DO NOT use Chrome MCP tools (mcp__claude-in-chrome__*).** +This workflow uses the `agent-browser` CLI exclusively. Do not use any alternative browser automation system, browser MCP integration, or built-in browser-control tool. If the platform offers multiple ways to control a browser, always choose `agent-browser`. -This command uses the `agent-browser` CLI exclusively. The agent-browser CLI is a Bash-based tool from Vercel that runs headless Chromium. It is NOT the same as Chrome browser automation via MCP. +Use `agent-browser` for: opening pages, clicking elements, filling forms, taking screenshots, and scraping rendered content. -If you find yourself calling `mcp__claude-in-chrome__*` tools, STOP. Use `agent-browser` Bash commands instead. - -## Introduction - -QA Engineer specializing in browser-based end-to-end testing - -This command tests affected pages in a real browser, catching issues that unit tests miss: -- JavaScript integration bugs -- CSS/layout regressions -- User workflow breakages -- Console errors +Platform-specific hints: +- In Claude Code, do not use Chrome MCP tools (`mcp__claude-in-chrome__*`). +- In Codex, do not substitute unrelated browsing tools. ## Prerequisites - - Local development server running (e.g., `bin/dev`, `rails server`, `npm run dev`) -- agent-browser CLI installed (see Setup below) +- `agent-browser` CLI installed (see Setup below) - Git repository with changes to test - ## Setup -**Check installation:** ```bash command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED" ``` -**Install if needed:** +Install if needed: ```bash npm install -g agent-browser -agent-browser install # Downloads Chromium (~160MB) +agent-browser install ``` See the `agent-browser` skill for detailed usage.
-## Main Tasks +## Workflow -### 0. Verify agent-browser Installation +### 1. Verify Installation -Before starting ANY browser testing, verify agent-browser is installed: +Before starting, verify `agent-browser` is available: ```bash command -v agent-browser >/dev/null 2>&1 && echo "Ready" || (echo "Installing..." && npm install -g agent-browser && agent-browser install) @@ -61,27 +50,20 @@ command -v agent-browser >/dev/null 2>&1 && echo "Ready" || (echo "Installing... If installation fails, inform the user and stop. -### 1. Ask Browser Mode +### 2. Ask Browser Mode - +Ask the user whether to run headed or headless (using the platform's question tool — e.g., `AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini — or present options and wait for a reply): -Before starting tests, ask user if they want to watch the browser: +``` +Do you want to watch the browser tests run? -Use AskUserQuestion with: -- Question: "Do you want to watch the browser tests run?" -- Options: - 1. **Headed (watch)** - Opens visible browser window so you can see tests run - 2. **Headless (faster)** - Runs in background, faster but invisible +1. Headed (watch) - Opens visible browser window so you can see tests run +2. Headless (faster) - Runs in background, faster but invisible +``` -Store the choice and use `--headed` flag when user selects "Headed". +Store the choice and use the `--headed` flag when the user selects option 1. - - -### 2. Determine Test Scope - - $ARGUMENTS - - +### 3. Determine Test Scope **If PR number provided:** ```bash @@ -98,11 +80,7 @@ git diff --name-only main...HEAD git diff --name-only main...[branch] ``` - - -### 3. Map Files to Routes - - +### 4. Map Files to Routes Map changed files to testable routes: @@ -120,43 +98,17 @@ Map changed files to testable routes: Build a list of URLs to test based on the mapping. - +### 5. Detect Dev Server Port -### 4. 
Detect Dev Server Port +Determine the dev server port using this priority: - - -Determine the dev server port using this priority order: - -**Priority 1: Explicit argument** -If the user passed a port number (e.g., `/test-browser 5000` or `/test-browser --port 5000`), use that port directly. - -**Priority 2: AGENTS.md / project instructions** -```bash -# Check AGENTS.md first for port references, then CLAUDE.md as compatibility fallback -grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' AGENTS.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1 -grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' CLAUDE.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1 -``` - -**Priority 3: package.json scripts** -```bash -# Check dev/start scripts for --port flags -grep -Eo '\-\-port[= ]+[0-9]{4,5}' package.json 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1 -``` - -**Priority 4: Environment files** -```bash -# Check .env, .env.local, .env.development for PORT= -grep -h '^PORT=' .env .env.local .env.development 2>/dev/null | tail -1 | cut -d= -f2 -``` - -**Priority 5: Default fallback** -If none of the above yields a port, default to `3000`. - -Store the result in a `PORT` variable for use in all subsequent steps. +1. **Explicit argument** — if the user passed `--port 5000`, use that directly +2. **Project instructions** — check `AGENTS.md`, `CLAUDE.md`, or other instruction files for port references +3. **package.json** — check dev/start scripts for `--port` flags +4. **Environment files** — check `.env`, `.env.local`, `.env.development` for `PORT=` +5. **Default** — fall back to `3000` ```bash -# Combined detection (run this) PORT="${EXPLICIT_PORT:-}" if [ -z "$PORT" ]; then PORT=$(grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' AGENTS.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1) @@ -174,77 +126,64 @@ PORT="${PORT:-3000}" echo "Using dev server port: $PORT" ``` - - -### 5. 
Verify Server is Running
-
-
-
-Before testing, verify the local server is accessible using the detected port:
+### 6. Verify Server is Running
 
 ```bash
 agent-browser open http://localhost:${PORT}
 agent-browser snapshot -i
 ```
 
-If server is not running, inform user:
-```markdown
-**Server not running on port ${PORT}**
+If the server is not running, inform the user:
+
+```
+Server not running on port ${PORT}
 
 Please start your development server:
 - Rails: `bin/dev` or `rails server`
 - Node/Next.js: `npm run dev`
-- Custom port: `/test-browser --port `
+- Custom port: run this skill again with `--port `
 
-Then run `/test-browser` again.
+Then re-run this skill.
 ```
-
+### 7. Test Each Affected Page
 
-### 6. Test Each Affected Page
+For each affected route:
-
-
-For each affected route, use agent-browser CLI commands (NOT Chrome MCP):
-
-**Step 1: Navigate and capture snapshot**
+**Navigate and capture snapshot:**
 
 ```bash
 agent-browser open "http://localhost:${PORT}/[route]"
 agent-browser snapshot -i
 ```
 
-**Step 2: For headed mode (visual debugging)**
+**For headed mode:**
 
 ```bash
 agent-browser --headed open "http://localhost:${PORT}/[route]"
 agent-browser --headed snapshot -i
 ```
 
-**Step 3: Verify key elements**
+**Verify key elements:**
 
 - Use `agent-browser snapshot -i` to get interactive elements with refs
 - Page title/heading present
 - Primary content rendered
 - No error messages visible
 - Forms have expected fields
 
-**Step 4: Test critical interactions**
+**Test critical interactions:**
 
 ```bash
-agent-browser click @e1  # Use ref from snapshot
+agent-browser click @e1
 agent-browser snapshot -i
 ```
 
-**Step 5: Take screenshots**
+**Take screenshots:**
 
 ```bash
 agent-browser screenshot page-name.png
-agent-browser screenshot --full page-name-full.png  # Full page
+agent-browser screenshot --full page-name-full.png
 ```
-
+### 8. Human Verification (When Required)
 
-### 7. Human Verification (When Required)
-
-
-
-Pause for human input when testing touches:
+Pause for human input when testing touches flows that require external interaction:
 
 | Flow Type | What to Ask |
 |-----------|-------------|
@@ -254,11 +193,12 @@ Pause for human input when testing touches:
 | SMS | "Verify you received the SMS code" |
 | External APIs | "Confirm the [service] integration is working" |
 
-Use AskUserQuestion:
-```markdown
-**Human Verification Needed**
+Ask the user (using the platform's question tool, or present numbered options and wait):
 
-This test touches the [flow type]. Please:
+```
+Human Verification Needed
+
+This test touches [flow type]. Please:
 1. [Action to take]
 2. [What to verify]
 
@@ -267,11 +207,7 @@ Did it work correctly?
 2. No - describe the issue
 ```
 
-
-### 8. Handle Failures
-
-
+### 9. Handle Failures
 
 When a test fails:
 
@@ -279,9 +215,10 @@ When a test fails:
 - Screenshot the error state: `agent-browser screenshot error.png`
 - Note the exact reproduction steps
 
-2. **Ask user how to proceed:**
-   ```markdown
-   **Test Failed: [route]**
+2. **Ask the user how to proceed:**
+
+   ```
+   Test Failed: [route]
 
    Issue: [description]
    Console errors: [if any]
@@ -292,27 +229,13 @@ When a test fails:
    3. Skip - Continue testing other pages
    ```
 
-3. **If "Fix now":**
-   - Investigate the issue
-   - Propose a fix
-   - Apply fix
-   - Re-run the failing test
+3. **If "Fix now":** investigate, propose a fix, apply, re-run the failing test
+4. **If "Create todo":** create `{id}-pending-p1-browser-test-{description}.md`, continue
+5. **If "Skip":** log as skipped, continue
 
-4. **If "Create todo":**
-   - Create `{id}-pending-p1-browser-test-{description}.md`
-   - Continue testing
+### 10. Test Summary
 
-5. **If "Skip":**
-   - Log as skipped
-   - Continue testing
-
-
-
-### 9. Test Summary
-
-
-
-After all tests complete, present summary:
+After all tests complete, present a summary:
 
 ```markdown
 ## Browser Test Results
@@ -345,8 +268,6 @@ After all tests complete, present summary:
 ### Result: [PASS / FAIL / PARTIAL]
 ```
 
-
-
 ## Quick Usage Examples
 
 ```bash
@@ -365,8 +286,6 @@ After all tests complete, present summary:
 
 ## agent-browser CLI Reference
 
-**ALWAYS use these Bash commands. NEVER use mcp__claude-in-chrome__* tools.**
-
 ```bash
 # Navigation
 agent-browser open   # Navigate to URL
diff --git a/plugins/compound-engineering/skills/test-xcode/SKILL.md b/plugins/compound-engineering/skills/test-xcode/SKILL.md
index 10cba1b..9ccc3ee 100644
--- a/plugins/compound-engineering/skills/test-xcode/SKILL.md
+++ b/plugins/compound-engineering/skills/test-xcode/SKILL.md
@@ -5,167 +5,81 @@ argument-hint: "[scheme name or 'current' to use default]"
 disable-model-invocation: true
 ---
 
-# Xcode Test Command
+# Xcode Test Skill
 
-Build, install, and test iOS apps on the simulator using XcodeBuildMCP. Captures screenshots, logs, and verifies app behavior.
-
-## Introduction
-
-iOS QA Engineer specializing in simulator-based testing
-
-This command tests iOS/macOS apps by:
-- Building for simulator
-- Installing and launching the app
-- Taking screenshots of key screens
-- Capturing console logs for errors
-- Supporting human verification for external flows
+Build, install, and test iOS apps on the simulator using XcodeBuildMCP. Captures screenshots, logs, and verifies app behavior.
 
 ## Prerequisites
 
-
 - Xcode installed with command-line tools
-- XcodeBuildMCP server connected
+- XcodeBuildMCP MCP server connected
 - Valid Xcode project or workspace
 - At least one iOS Simulator available
-
-## Main Tasks
+## Workflow
 
-### 0. Verify XcodeBuildMCP is Installed
+### 0. Verify XcodeBuildMCP is Available
 
-
+Check that the XcodeBuildMCP MCP server is connected by calling its `list_simulators` tool.
-**First, check if XcodeBuildMCP tools are available.**
+MCP tool names vary by platform:
+- Claude Code: `mcp__xcodebuildmcp__list_simulators`
+- Other platforms: use the equivalent MCP tool call for the `XcodeBuildMCP` server's `list_simulators` method
+
+If the tool is not found or errors, inform the user they need to add the XcodeBuildMCP MCP server:
 
-Try calling:
 ```
-mcp__xcodebuildmcp__list_simulators({})
+XcodeBuildMCP not installed
+
+Install via Homebrew:
+  brew tap getsentry/xcodebuildmcp && brew install xcodebuildmcp
+
+Or via npx (no global install needed):
+  npx -y xcodebuildmcp@latest mcp
+
+Then add "XcodeBuildMCP" as an MCP server in your agent configuration
+and restart your agent.
 ```
 
-**If the tool is not found or errors:**
-
-Tell the user:
-```markdown
-**XcodeBuildMCP not installed**
-
-Please install the XcodeBuildMCP server first:
-
-\`\`\`bash
-claude mcp add XcodeBuildMCP -- npx xcodebuildmcp@latest
-\`\`\`
-
-Then restart Claude Code and run `/xcode-test` again.
-```
-
-**Do NOT proceed** until XcodeBuildMCP is confirmed working.
-
-
+Do NOT proceed until XcodeBuildMCP is confirmed working.
 
 ### 1. Discover Project and Scheme
 
-
+Call XcodeBuildMCP's `discover_projs` tool to find available projects, then `list_schemes` with the project path to get available schemes.
 
-**Find available projects:**
-```
-mcp__xcodebuildmcp__discover_projs({})
-```
-
-**List schemes for the project:**
-```
-mcp__xcodebuildmcp__list_schemes({ project_path: "/path/to/Project.xcodeproj" })
-```
-
-**If argument provided:**
-- Use the specified scheme name
-- Or "current" to use the default/last-used scheme
-
-
+If an argument was provided, use that scheme name. If "current", use the default/last-used scheme.
 
 ### 2. Boot Simulator
 
-
+Call `list_simulators` to find available simulators. Boot the preferred simulator (iPhone 15 Pro recommended) using `boot_simulator` with the simulator's UUID.
-**List available simulators:**
-```
-mcp__xcodebuildmcp__list_simulators({})
-```
-
-**Boot preferred simulator (iPhone 15 Pro recommended):**
-```
-mcp__xcodebuildmcp__boot_simulator({ simulator_id: "[uuid]" })
-```
-
-**Wait for simulator to be ready:**
-Check simulator state before proceeding with installation.
-
-
+Wait for the simulator to be ready before proceeding.
 
 ### 3. Build the App
 
-
+Call `build_ios_sim_app` with the project path and scheme name.
 
-**Build for iOS Simulator:**
-```
-mcp__xcodebuildmcp__build_ios_sim_app({
-  project_path: "/path/to/Project.xcodeproj",
-  scheme: "[scheme_name]"
-})
-```
-
-**Handle build failures:**
+**On failure:**
 - Capture build errors
-- Create P1 todo for each build error
+- Create a P1 todo for each build error
 - Report to user with specific error details
 
 **On success:**
 - Note the built app path for installation
-- Proceed to installation step
-
-
+- Proceed to step 4
 
 ### 4. Install and Launch
 
-
-
-**Install app on simulator:**
-```
-mcp__xcodebuildmcp__install_app_on_simulator({
-  app_path: "/path/to/built/App.app",
-  simulator_id: "[uuid]"
-})
-```
-
-**Launch the app:**
-```
-mcp__xcodebuildmcp__launch_app_on_simulator({
-  bundle_id: "[app.bundle.id]",
-  simulator_id: "[uuid]"
-})
-```
-
-**Start capturing logs:**
-```
-mcp__xcodebuildmcp__capture_sim_logs({
-  simulator_id: "[uuid]",
-  bundle_id: "[app.bundle.id]"
-})
-```
-
-
+1. Call `install_app_on_simulator` with the built app path and simulator UUID
+2. Call `launch_app_on_simulator` with the bundle ID and simulator UUID
+3. Call `capture_sim_logs` with the simulator UUID and bundle ID to start log capture
 
 ### 5. Test Key Screens
 
-
-
 For each key screen in the app:
 
 **Take screenshot:**
-```
-mcp__xcodebuildmcp__take_screenshot({
-  simulator_id: "[uuid]",
-  filename: "screen-[name].png"
-})
-```
+Call `take_screenshot` with the simulator UUID and a descriptive filename (e.g., `screen-home.png`).
 **Review screenshot for:**
 - UI elements rendered correctly
@@ -174,23 +88,15 @@ mcp__xcodebuildmcp__take_screenshot({
 - Layout looks correct
 
 **Check logs for errors:**
-```
-mcp__xcodebuildmcp__get_sim_logs({ simulator_id: "[uuid]" })
-```
-
-Look for:
+Call `get_sim_logs` with the simulator UUID. Look for:
 - Crashes
 - Exceptions
 - Error-level log messages
 - Failed network requests
-
-
 ### 6. Human Verification (When Required)
-
-
-Pause for human input when testing touches:
+Pause for human input when testing touches flows that require device interaction.
 
 | Flow Type | What to Ask |
 |-----------|-------------|
@@ -200,9 +106,10 @@ Pause for human input when testing touches:
 | Camera/Photos | "Grant permissions and verify camera works" |
 | Location | "Allow location access and verify map updates" |
 
-Use AskUserQuestion:
-```markdown
-**Human Verification Needed**
+Ask the user (using the platform's question tool — e.g., `AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini — or present numbered options and wait):
+
+```
+Human Verification Needed
 
 This test requires [flow type]. Please:
 1. [Action to take on simulator]
@@ -213,12 +120,8 @@ Did it work correctly?
 2. No - describe the issue
 ```
 
-
-
 ### 7. Handle Failures
 
-
-
 When a test fails:
 
 1. **Document the failure:**
@@ -226,9 +129,10 @@ When a test fails:
 - Capture console logs
 - Note reproduction steps
 
-2. **Ask user how to proceed:**
-   ```markdown
-   **Test Failed: [screen/feature]**
+2. **Ask the user how to proceed:**
+
+   ```
+   Test Failed: [screen/feature]
 
    Issue: [description]
    Logs: [relevant error messages]
@@ -239,47 +143,38 @@ When a test fails:
    3. Skip - Continue testing other screens
    ```
 
-3. **If "Fix now":**
-   - Investigate the issue in code
-   - Propose a fix
-   - Rebuild and retest
-
-
+3. **If "Fix now":** investigate, propose a fix, rebuild and retest
+4. **If "Create todo":** create `{id}-pending-p1-xcode-{description}.md`, continue
+5. **If "Skip":** log as skipped, continue
 
 ### 8. Test Summary
 
-
-
-After all tests complete, present summary:
+After all tests complete, present a summary:
 
 ```markdown
-## 📱 Xcode Test Results
+## Xcode Test Results
 
 **Project:** [project name]
 **Scheme:** [scheme name]
 **Simulator:** [simulator name]
 
-### Build: ✅ Success / ❌ Failed
+### Build: Success / Failed
 
 ### Screens Tested: [count]
 
 | Screen | Status | Notes |
 |--------|--------|-------|
-| Launch | ✅ Pass | |
-| Home | ✅ Pass | |
-| Settings | ❌ Fail | Crash on tap |
-| Profile | ⏭️ Skip | Requires login |
+| Launch | Pass | |
+| Home | Pass | |
+| Settings | Fail | Crash on tap |
+| Profile | Skip | Requires login |
 
 ### Console Errors: [count]
 - [List any errors found]
 
 ### Human Verifications: [count]
-- Sign in with Apple: ✅ Confirmed
-- Push notifications: ✅ Confirmed
+- Sign in with Apple: Confirmed
+- Push notifications: Confirmed
 
 ### Failures: [count]
 - Settings screen - crash on navigation
@@ -290,43 +185,26 @@ After all tests complete, present summary:
 ### Result: [PASS / FAIL / PARTIAL]
 ```
 
-
-
 ### 9. Cleanup
 
-
-
 After testing:
 
-**Stop log capture:**
-```
-mcp__xcodebuildmcp__stop_log_capture({ simulator_id: "[uuid]" })
-```
-
-**Optionally shut down simulator:**
-```
-mcp__xcodebuildmcp__shutdown_simulator({ simulator_id: "[uuid]" })
-```
-
-
+1. Call `stop_log_capture` with the simulator UUID
+2. Optionally call `shutdown_simulator` with the simulator UUID
 
 ## Quick Usage Examples
 
 ```bash
 # Test with default scheme
-/xcode-test
+/test-xcode
 
 # Test specific scheme
-/xcode-test MyApp-Debug
+/test-xcode MyApp-Debug
 
 # Test after making changes
-/xcode-test current
+/test-xcode current
 ```
 
-## Integration with /ce:review
+## Integration with ce:review
 
-When reviewing PRs that touch iOS code, the `/ce:review` command can spawn this as a subagent:
-
-```
-Task general-purpose("Run /xcode-test for scheme [name]. Build, install on simulator, test key screens, check for crashes.")
-```
+When reviewing PRs that touch iOS code, the `ce:review` workflow can spawn an agent to run this skill, build on the simulator, test key screens, and check for crashes.