Merge step (f/g/h): retire todo system, retool pr-comments-review, rewrite ce-upstream-merge
Applies triage doc 021 in full. Legacy file-based todo system is gone; the two local workflows that depended on it are retooled around platform- native task tracking and ce-code-review-style walkthrough patterns. Deletions: - commands/resolve_todo_parallel.md — obsolete; ce-pr-comments-review now orchestrates resolution inline - commands/pr-comments-to-todos.md — replaced by ce-pr-comments-review skill - skills/ce-upstream-merge/assets/merge-triage-template.md — template now lives inline in SKILL.md since decisions live in the task tracker Retool: - New skill skills/ce-pr-comments-review/SKILL.md: fetch PR comments, pressure-test each via validator agent, walk through with AskUserQuestion (Fix now / Clarify / Push back / Skip), dispatch ce-pr-comment-resolver for accepted fixes. Uses TaskCreate instead of todos/*.md. Rewrite: - skills/ce-upstream-merge/SKILL.md: drops the "create todos in todos/" pattern entirely. Phase 2 builds a platform-task decision set instead. Phase 3 walks through with AskUserQuestion. Phase 4 splits execution into 7 named checkpoints (b-h) so the user can inspect and course- correct mid-flight. Adds explicit "hidden conflicts" detection in Phase 1 (upstream rename of file local deleted, upstream deletion of file local depends on, structural refactors). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,140 @@
|
||||
---
|
||||
name: ce-pr-comments-review
|
||||
description: "Fetch PR review comments, pressure-test each for validity against the actual codebase, then walk through a decision-per-finding (Fix now / Clarify / Push back / Skip) and apply fixes for accepted items. Every comment is scrutinized before any code changes."
|
||||
argument-hint: "[PR number, GitHub URL, or 'current' for current branch PR]"
|
||||
---
|
||||
|
||||
# PR Comments Review
|
||||
|
||||
Reviewer comments are **suggestions, not orders**. This skill fetches them, pressure-tests each one against the actual codebase, and asks you to accept, clarify, push back, or skip — one decision per finding. Only accepted items change code.
|
||||
|
||||
The flow mirrors `ce-code-review`'s walk-through pattern: validate first, decide per finding, resolve in parallel where safe.
|
||||
|
||||
## When to use
|
||||
|
||||
- After a human reviewer leaves comments on an open PR and you want an agent pass that **doesn't blindly apply everything**.
|
||||
- When a zip-agent or bot reviewer has left many comments and you need the validator pressure-test before acting on any of them.
|
||||
- Before declaring a PR "responded to" — as a forcing function that no comment got silently ignored.
|
||||
|
||||
## Input
|
||||
|
||||
<pr_target> #$ARGUMENTS </pr_target>
|
||||
|
||||
If the input above is empty, ask: "Which PR should we review? (number, URL, or 'current')". Default to `current` when unambiguous.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase 1: Fetch
|
||||
|
||||
Resolve the PR reference:
|
||||
|
||||
- Numeric → PR number.
|
||||
- GitHub URL → extract PR number.
|
||||
- `current` or empty → `gh pr status` on the current branch.
|
||||
|
||||
Then gather:
|
||||
|
||||
```bash
|
||||
gh pr view <PR> --json number,title,url,author,headRefName,baseRefName
|
||||
gh api repos/{owner}/{repo}/pulls/<PR>/comments
|
||||
gh pr view <PR> --json reviews,reviewDecision
|
||||
```
|
||||
|
||||
Group comments by file + line + thread for context. Skip:
|
||||
|
||||
- Bare acknowledgments ("LGTM", "Looks good").
|
||||
- Questions that were answered inline in later replies.
|
||||
- Threads already marked resolved.
|
||||
|
||||
### Phase 2: Pressure-test (validate each comment)
|
||||
|
||||
**Do not create tasks or prompt the user yet.** First validate.
|
||||
|
||||
For each remaining comment, dispatch a validator agent. In Claude Code:
|
||||
|
||||
```
|
||||
Agent(subagent_type: "compound-engineering:review:ce-zip-agent-validator",
|
||||
description: "Validate PR comment",
|
||||
prompt: "<comment context + file + line + surrounding code>")
|
||||
```
|
||||
|
||||
The validator's job is to read the actual code and decide whether the comment applies. It returns a result of the shape:
|
||||
|
||||
| Field | Values |
|
||||
|-------|--------|
|
||||
| `assessment` | `Clear & Correct` / `Unclear` / `Likely Incorrect` / `YAGNI` |
|
||||
| `recommended_action` | `Fix now` / `Clarify` / `Push back` / `Skip` |
|
||||
| `verified_code` | Yes/No plus what was checked |
|
||||
| `verified_tests` | Yes/No plus existing coverage |
|
||||
| `verified_usage` | Yes/No plus how the code is used |
|
||||
| `prior_decisions` | Yes/No plus any intentional design |
|
||||
| `technical_justification` | Prose reasoning |
|
||||
|
||||
Prefer parallel dispatch for 2+ comments. For 5+ comments, batch in groups of 4 to keep context bounded.
|
||||
|
||||
Preserve every validator result — do not drop the `YAGNI` / `Likely Incorrect` outputs. The user sees all of them in Phase 4.
|
||||
|
||||
### Phase 3: Track as tasks
|
||||
|
||||
Use the platform's task tool (`TaskCreate` in Claude Code, `update_plan` in Codex, etc.) to register one task per validated comment. Task subject: `<priority>: <short title>`. Task description: include file:line, the validator's assessment, recommended action, and technical justification.
|
||||
|
||||
This replaces the old file-based `todos/*.md` layout. No files are written to disk.
|
||||
|
||||
### Phase 4: Walk-through (one decision per comment)
|
||||
|
||||
Present decisions one at a time using the platform's blocking question tool (call `ToolSearch` with `select:AskUserQuestion` first in Claude Code if the schema is not loaded; `request_user_input` in Codex; `ask_user` in Gemini and Pi via the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool is available.
|
||||
|
||||
For each task, show:
|
||||
|
||||
- The reviewer's original comment (verbatim).
|
||||
- File + line + a code snippet around the location.
|
||||
- The validator's assessment, technical justification, and recommended action.
|
||||
|
||||
Then ask:
|
||||
|
||||
**Stem:** `How should the agent handle this comment?`
|
||||
|
||||
**Options (the four match the validator's `recommended_action` vocabulary so the user can accept the recommendation with one click or override it):**
|
||||
|
||||
1. `Fix now` — apply the reviewer's suggestion (or a concrete variant).
|
||||
2. `Clarify` — reply to the reviewer asking for specifics; do not change code.
|
||||
3. `Push back` — reply disagreeing with the comment's premise; do not change code.
|
||||
4. `Skip` — take no action; mark the task complete with a skip reason.
|
||||
|
||||
Order findings: `Clear & Correct` and `Likely Incorrect` first (they drive the biggest decisions), then `Unclear`, then `YAGNI`. Within each, newer threads before older ones.
|
||||
|
||||
**Group related findings** when the same thread touches multiple lines or files — present them together in a single decision step, not sequentially.
|
||||
|
||||
### Phase 5: Resolve
|
||||
|
||||
Apply decisions in parallel where possible.
|
||||
|
||||
- **Fix now** → dispatch `compound-engineering:workflow:ce-pr-comment-resolver` per finding (or grouped set). Prefer parallel; fall back to sequential if dependencies exist. When the resolver is done, post a reply on the thread confirming the fix and include the commit SHA.
|
||||
- **Clarify** → post the clarifying reply on the thread via `gh api`. Do not mark the thread resolved.
|
||||
- **Push back** → post the disagreement (pulling the technical justification from the validator result) on the thread. Do not mark the thread resolved.
|
||||
- **Skip** → do nothing on the thread. Note the skip reason in the task summary.
|
||||
|
||||
If 5+ `Fix now` decisions are made, batch the resolvers in groups of 4, returning only short status summaries (file, line, fix applied, tests run).
|
||||
|
||||
### Phase 6: Summarize
|
||||
|
||||
Print a concise report:
|
||||
|
||||
- Count by decision: `N fixed | N clarified | N pushed back | N skipped`.
|
||||
- One-line per fix with file:line.
|
||||
- Any validator disagreements (user overrode the recommendation) flagged for future review.
|
||||
|
||||
If any `Fix now` commits landed, remind the user to push and mark the threads resolved on GitHub when appropriate:
|
||||
|
||||
```bash
|
||||
gh pr view <PR> --json reviewThreads
|
||||
# Resolve threads via GraphQL mutation as needed
|
||||
```
|
||||
|
||||
## Important rules
|
||||
|
||||
- **Every comment is pressure-tested.** No comment leaves Phase 2 without a validator assessment.
|
||||
- **User makes the call on every finding.** The validator recommends; the user decides.
|
||||
- **No file-based state.** Tasks live in the platform's task tracker; validator outputs live in task descriptions. Nothing writes to `todos/*.md`.
|
||||
- **Parallel where safe, sequential when required.** Fixes touching the same file must be serialized; independent fixes can run in parallel.
|
||||
- **Preserve reviewer quotes verbatim.** When replying on threads, the reviewer's original wording must be traceable in the context.
|
||||
Reference in New Issue
Block a user