John Lamb aabad4a75a
Merge step (i): reference sweep — fix stray old-convention refs
Applies triage doc 022. Most of the sweep was a no-op because upstream already
renamed everything in place when we branched from origin/main. Remaining
fixes:

- ce-pr-comments-review/SKILL.md: replace agent refs carrying the old
  compound-engineering:review: / workflow: prefixes with the bare ce-<name>
  form (self-correction from step g, where I accidentally used the old
  convention)
- ce-essay-edit/SKILL.md: update /essay-outline reference to /ce-essay-outline

Prose references to generic words ("onboarding", "changelog") are left
alone — they are natural-language uses, not skill references.

Known follow-up: tests/review-skill-contract.test.ts and
tests/legacy-cleanup.test.ts still reference locally-removed agents
(ce-dhh-rails-reviewer, ce-kieran-rails-reviewer, ce-data-migration-expert,
ce-security-sentinel). These will need test updates + registration of
removed names in src/utils/legacy-cleanup.ts and
src/data/plugin-legacy-artifacts.ts — flagged for step (j).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:04:22 -05:00


---
name: ce-pr-comments-review
description: "Fetch PR review comments, pressure-test each for validity against the actual codebase, then walk through a decision-per-finding (Fix now / Clarify / Push back / Skip) and apply fixes for accepted items. Every comment is scrutinized before any code changes."
argument-hint: "[PR number, GitHub URL, or 'current' for current branch PR]"
---
# PR Comments Review
Reviewer comments are **suggestions, not orders**. This skill fetches them, pressure-tests each one against the actual codebase, and asks you to accept, clarify, push back, or skip — one decision per finding. Only accepted items change code.
The flow mirrors `ce-code-review`'s walk-through pattern: validate first, decide per finding, resolve in parallel where safe.
## When to use
- After a human reviewer leaves comments on an open PR and you want an agent pass that **doesn't blindly apply everything**.
- When a zip-agent or bot reviewer has left many comments and you need the validator pressure-test before acting on any of them.
- Before declaring a PR "responded to" — as a forcing function that no comment got silently ignored.
## Input
<pr_target> #$ARGUMENTS </pr_target>
If the input above is empty, ask: "Which PR should we review? (number, URL, or 'current')". Default to `current` when unambiguous.
## Workflow
### Phase 1: Fetch
Resolve the PR reference:
- Numeric → PR number.
- GitHub URL → extract PR number.
- `current` or empty → the current branch's open PR (`gh pr view --json number`).
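A minimal sketch of that resolution logic as a shell helper (`resolve_pr` is an illustrative name for this sketch, not a `gh` built-in):

```shell
# Sketch: normalize a <pr_target> argument to a bare PR number.
resolve_pr() {
  local target="$1"
  case "$target" in
    ''|current)
      # Current-branch PR; requires gh auth and an open PR
      gh pr view --json number --jq .number
      ;;
    *github.com/*/pull/*)
      # Keep only the number that follows /pull/
      target="${target##*/pull/}"
      printf '%s\n' "${target%%[!0-9]*}"
      ;;
    *)
      printf '%s\n' "${target#\#}"   # strip a leading '#', if any
      ;;
  esac
}
```

The URL branch tolerates trailing path segments (`/files`) and query strings, since reviewers often paste those.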
Then gather:
```bash
gh pr view <PR> --json number,title,url,author,headRefName,baseRefName
gh api repos/{owner}/{repo}/pulls/<PR>/comments
gh pr view <PR> --json reviews,reviewDecision
```
Group comments by file + line + thread for context. Skip:
- Bare acknowledgments ("LGTM", "Looks good").
- Questions that were answered inline in later replies.
- Threads already marked resolved.
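One way to build that grouping with `jq`, assuming the comment list from `gh api` is piped in. The `location` key is a convention of this sketch, not a GitHub field, and outdated comments can carry a null `line`:

```shell
# Sketch: fold the REST comment list into per-location threads.
group_threads() {
  jq 'group_by([.path, .line])
      | map({location: "\(.[0].path):\(.[0].line)",
             comments: map(.body)})'
}

# e.g. gh api "repos/{owner}/{repo}/pulls/$PR/comments" | group_threads
```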
### Phase 2: Pressure-test (validate each comment)
**Do not create tasks or prompt the user yet.** First validate.
For each remaining comment, dispatch a validator agent. In Claude Code:
```
Agent(
  subagent_type: "ce-zip-agent-validator",
  description: "Validate PR comment",
  prompt: "<comment context + file + line + surrounding code>"
)
```
The validator's job is to read the actual code and decide whether the comment applies. It returns a result of the shape:
| Field | Values |
|-------|--------|
| `assessment` | `Clear & Correct` / `Unclear` / `Likely Incorrect` / `YAGNI` |
| `recommended_action` | `Fix now` / `Clarify` / `Push back` / `Skip` |
| `verified_code` | Yes/No plus what was checked |
| `verified_tests` | Yes/No plus existing coverage |
| `verified_usage` | Yes/No plus how the code is used |
| `prior_decisions` | Yes/No plus any intentional design |
| `technical_justification` | Prose reasoning |
Prefer parallel dispatch for 2+ comments. For 5+ comments, batch in groups of 4 to keep context bounded.
Preserve every validator result — do not drop the `YAGNI` / `Likely Incorrect` outputs. The user sees all of them in Phase 4.
### Phase 3: Track as tasks
Use the platform's task tool (`TaskCreate` in Claude Code, `update_plan` in Codex, etc.) to register one task per validated comment. Task subject: `<priority>: <short title>`. Task description: include file:line, the validator's assessment, recommended action, and technical justification.
This replaces the old file-based `todos/*.md` layout. No files are written to disk.
### Phase 4: Walk-through (one decision per comment)
Present decisions one at a time using the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if the schema is not loaded), `request_user_input` in Codex, and `ask_user` in Gemini and Pi (via the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool is available.
For each task, show:
- The reviewer's original comment (verbatim).
- File + line + a code snippet around the location.
- The validator's assessment, technical justification, and recommended action.
Then ask:
**Stem:** `How should the agent handle this comment?`
**Options** (the four match the validator's `recommended_action` vocabulary, so the user can accept the recommendation with one click or override it):
1. `Fix now` — apply the reviewer's suggestion (or a concrete variant).
2. `Clarify` — reply to the reviewer asking for specifics; do not change code.
3. `Push back` — reply disagreeing with the comment's premise; do not change code.
4. `Skip` — take no action; mark the task complete with a skip reason.
Order findings: `Clear & Correct` and `Likely Incorrect` first (they drive the biggest decisions), then `Unclear`, then `YAGNI`. Within each, newer threads before older ones.
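That ordering can be expressed as a `jq` sort over the validator results. The `findings` array shape and the numeric `updated` epoch field are assumptions of this sketch:

```shell
# Sketch: sort findings by assessment rank, then newest thread first.
order_findings() {
  jq 'def rank: {"Clear & Correct": 0, "Likely Incorrect": 0,
                 "Unclear": 1, "YAGNI": 2}[.assessment];
      sort_by([rank, (0 - .updated)])'
}
```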
**Group related findings** when the same thread touches multiple lines or files — present them together in a single decision step, not sequentially.
### Phase 5: Resolve
Apply decisions in parallel where possible.
- **Fix now** → dispatch `ce-pr-comment-resolver` per finding (or grouped set). Prefer parallel; fall back to sequential if dependencies exist. When the resolver is done, post a reply on the thread confirming the fix and include the commit SHA.
- **Clarify** → post the clarifying reply on the thread via `gh api`. Do not mark the thread resolved.
- **Push back** → post the disagreement (pulling the technical justification from the validator result) on the thread. Do not mark the thread resolved.
- **Skip** → do nothing on the thread. Note the skip reason in the task summary.
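The Clarify and Push back replies go through GitHub's comment-reply endpoint. `build_reply`, `post_reply`, and the bold decision label are conventions of this sketch, not part of `gh`:

```shell
# Sketch: format and post a threaded reply for one finding.
build_reply() {   # $1 = decision label, $2 = validator justification
  printf '**%s:** %s' "$1" "$2"
}

post_reply() {    # $1 = PR number, $2 = id of the thread's top comment, $3 = body
  gh api --method POST \
    "repos/{owner}/{repo}/pulls/$1/comments/$2/replies" \
    -f body="$3"
}
```

Replying to the thread's top comment keeps the response attached to the original conversation rather than opening a new one.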
If 5+ `Fix now` decisions are made, batch the resolvers in groups of 4, returning only short status summaries (file, line, fix applied, tests run).
### Phase 6: Summarize
Print a concise report:
- Count by decision: `N fixed | N clarified | N pushed back | N skipped`.
- One-line per fix with file:line.
- Any validator disagreements (user overrode the recommendation) flagged for future review.
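The count line can be produced from a local record of decisions. The piped-in JSON array and its `decision` field are assumptions of this sketch:

```shell
# Sketch: collapse per-finding decisions into the summary count line.
tally() {
  jq -r 'group_by(.decision)
         | map("\(length) \(.[0].decision)")
         | join(" | ")'
}
```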
If any `Fix now` commits landed, remind the user to push and mark the threads resolved on GitHub when appropriate:
```bash
# `gh pr view` exposes no reviewThreads JSON field; list thread IDs via GraphQL:
gh api graphql -f query='{ repository(owner: "<owner>", name: "<repo>") {
  pullRequest(number: <PR>) { reviewThreads(first: 100) { nodes { id isResolved } } } } }'
# Resolve each thread with the resolveReviewThread mutation as needed
```
## Important rules
- **Every comment is pressure-tested.** No comment leaves Phase 2 without a validator assessment.
- **User makes the call on every finding.** The validator recommends; the user decides.
- **No file-based state.** Tasks live in the platform's task tracker; validator outputs live in task descriptions. Nothing writes to `todos/*.md`.
- **Parallel where safe, sequential when required.** Fixes touching the same file must be serialized; independent fixes can run in parallel.
- **Preserve reviewer quotes verbatim.** When replying on threads, the reviewer's original wording must be traceable in the context.