diff --git a/plugins/compound-engineering/skills/ce-review/SKILL.md b/plugins/compound-engineering/skills/ce-review/SKILL.md index 261d2ce..b2b409c 100644 --- a/plugins/compound-engineering/skills/ce-review/SKILL.md +++ b/plugins/compound-engineering/skills/ce-review/SKILL.md @@ -435,10 +435,10 @@ Convert multiple reviewer JSON payloads into one deduplicated, confidence-gated ### Stage 6: Synthesize and present -Assemble the final report using the review output template included below: +Assemble the final report using **pipe-delimited markdown tables for findings** from the review output template included below. The table format is mandatory for finding rows in interactive mode — do not render findings as freeform text blocks or horizontal-rule-separated prose. Other report sections (Applied Fixes, Learnings, Coverage, etc.) use bullet lists and the `---` separator before the verdict, as shown in the template. 1. **Header.** Scope, intent, mode, reviewer team with per-conditional justifications. -2. **Findings.** Grouped by severity (P0, P1, P2, P3). Each finding shows file, issue, reviewer(s), confidence, and synthesized route. +2. **Findings.** Rendered as pipe-delimited tables grouped by severity (`### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`). Each finding row shows `#`, file, issue, reviewer(s), confidence, and synthesized route. Omit empty severity levels. Never render findings as freeform text blocks or numbered lists. 3. **Requirements Completeness.** Include only when a plan was found in Stage 2b. For each requirement (R1, R2, etc.) and implementation unit in the plan, report whether corresponding work appears in the diff. Use a simple checklist: met / not addressed / partially addressed. Routing depends on `plan_source`: - **`explicit`** (caller-provided or PR body): Flag unaddressed requirements as P1 findings with `autofix_class: manual`, `owner: downstream-resolver`. These enter the residual actionable queue and can become todos. - **`inferred`** (auto-discovered): Flag unaddressed requirements as P3 findings with `autofix_class: advisory`, `owner: human`. These stay in the report only — no todos, no autonomous follow-up. An inferred plan match is a hint, not a contract. @@ -455,6 +455,8 @@ Assemble the final report using the review output template included below: Do not include time estimates. +**Format verification:** Before delivering the report, verify the findings sections use pipe-delimited table rows (`| # | File | Issue | ... |`) not freeform text. If you catch yourself rendering findings as prose blocks separated by horizontal rules or bullet points, stop and reformat into tables. + ### Headless output format In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The envelope follows the same structural pattern as document-review's headless output (completion header, metadata block, findings grouped by autofix_class, trailing sections) while using ce:review's own section headings and per-finding fields. @@ -565,8 +567,9 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R **Interactive mode** - Apply `safe_auto -> review-fixer` findings automatically without asking. These are safe by definition. -- Ask a policy question only when `gated_auto` or `manual` findings remain after safe fixes: +- Ask a policy question **using the platform's blocking question tool** (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini) only when `gated_auto` or `manual` findings remain after safe fixes. Do not replace with a conversational open-ended question. Adapt the options to match what actually remains: + **When `gated_auto` findings are present** (with or without `manual`): ``` Safe fixes have been applied. What should I do with the remaining findings? 1. Review and approve specific gated fixes (Recommended) @@ -574,6 +577,14 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R 3. Report only -- no further action ``` + **When only `manual` findings remain** (no `gated_auto`): + ``` + Safe fixes have been applied. The remaining findings need manual resolution. What should I do? + 1. Leave as residual work (Recommended) + 2. Report only -- no further action + ``` + + If no blocking question tool is available, present the applicable numbered options as text and wait for the user's selection before proceeding. - If no `gated_auto` or `manual` findings remain after safe fixes, skip the policy question entirely — report what was fixed and proceed to next steps. - Only include `gated_auto` findings in the fixer queue after the user explicitly approves the specific items. Do not widen the queue based on severity alone. diff --git a/plugins/compound-engineering/skills/ce-review/references/findings-schema.json b/plugins/compound-engineering/skills/ce-review/references/findings-schema.json index 611912f..445d7ec 100644 --- a/plugins/compound-engineering/skills/ce-review/references/findings-schema.json +++ b/plugins/compound-engineering/skills/ce-review/references/findings-schema.json @@ -114,10 +114,10 @@ "P3": "Low-impact, narrow scope, minor improvement. User's discretion." }, "autofix_classes": { - "safe_auto": "Local, deterministic code or test fix suitable for the in-skill fixer in autonomous mode.", - "gated_auto": "Concrete fix exists, but it changes behavior, permissions, contracts, or other sensitive areas that deserve explicit approval.", - "manual": "Actionable issue that should become residual work rather than an in-skill autofix.", - "advisory": "Informational or operational item that should be surfaced in the report only." + "safe_auto": "Local, deterministic code or test fix suitable for the in-skill fixer. Examples: extract duplicated helper, add missing nil check, fix off-by-one, add missing test, remove dead code. Do not default to advisory when a concrete safe fix exists.", + "gated_auto": "Concrete fix exists, but it changes behavior, permissions, contracts, or other sensitive areas that deserve explicit approval. Examples: add auth to unprotected endpoint, change API response shape.", + "manual": "Actionable issue that requires design decisions or cross-cutting changes. Examples: redesign data model, add pagination strategy, choose between architectural approaches.", + "advisory": "Informational or operational item that should be surfaced in the report only. Examples: design asymmetry the PR improves but does not fully resolve, residual risk notes, deployment considerations." }, "owners": { "review-fixer": "The in-skill fixer can own this when policy allows.", diff --git a/plugins/compound-engineering/skills/ce-review/references/review-output-template.md b/plugins/compound-engineering/skills/ce-review/references/review-output-template.md index ec1a5d5..ee6cfd7 100644 --- a/plugins/compound-engineering/skills/ce-review/references/review-output-template.md +++ b/plugins/compound-engineering/skills/ce-review/references/review-output-template.md @@ -92,9 +92,30 @@ Use this **exact format** when presenting synthesized review findings. Findings > **Fix order:** P0 auth bypass -> P1 memory/pagination -> P2 error handling if straightforward ``` +## Anti-patterns + +Do NOT produce output like this. The following is wrong: + +```markdown +Findings + +Sev: P1 +File: foo.go:42 +Issue: Some problem description +Reviewer(s): adversarial +Confidence: 0.85 +Route: advisory -> human +──────────────────────────────────────── +Sev: P2 +File: bar.go:99 +Issue: Another problem +``` + +This fails because: no pipe-delimited tables, no severity-grouped `###` headers, uses box-drawing horizontal rules, no numbered findings, no `## Code Review Results` title, and the verdict is not in a blockquote. Always use the table format from the example above. + ## Formatting Rules -- **Pipe-delimited markdown tables** -- never ASCII box-drawing characters +- **Pipe-delimited markdown tables** for findings -- never ASCII box-drawing characters or per-finding horizontal-rule separators between entries (the report-level `---` before the verdict is still required) - **Severity-grouped sections** -- `### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`. Omit empty severity levels. - **Always include file:line location** for code review issues - **Reviewer column** shows which persona(s) flagged the issue. Multiple reviewers = cross-reviewer agreement. diff --git a/plugins/compound-engineering/skills/ce-review/references/subagent-template.md b/plugins/compound-engineering/skills/ce-review/references/subagent-template.md index d68fb00..57d10fa 100644 --- a/plugins/compound-engineering/skills/ce-review/references/subagent-template.md +++ b/plugins/compound-engineering/skills/ce-review/references/subagent-template.md @@ -44,7 +44,12 @@ Rules: - Every finding MUST include at least one evidence item grounded in the actual code. - Set pre_existing to true ONLY for issues in unchanged code that are unrelated to this diff. If the diff makes the issue newly relevant, it is NOT pre-existing. - You are operationally read-only. You may use non-mutating inspection commands, including read-oriented `git` / `gh` commands, to gather evidence. Do not edit files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state. -- Set `autofix_class` conservatively. Use `safe_auto` only when the fix is local, deterministic, and low-risk. Use `gated_auto` when a concrete fix exists but changes behavior/contracts/permissions. Use `manual` for actionable residual work. Use `advisory` for report-only items that should not become code-fix work. +- Set `autofix_class` accurately -- not every finding is `advisory`. Use this decision guide: + - `safe_auto`: The fix is local and deterministic — the fixer can apply it mechanically without design judgment. Examples: extracting a duplicated helper, adding a missing nil/null check, fixing an off-by-one, adding a missing test for an untested code path, removing dead code. + - `gated_auto`: A concrete fix exists but it changes contracts, permissions, or crosses a module boundary in a way that deserves explicit approval. Examples: adding authentication to an unprotected endpoint, changing a public API response shape, switching from soft-delete to hard-delete. + - `manual`: Actionable work that requires design decisions or cross-cutting changes. Examples: redesigning a data model, choosing between two valid architectural approaches, adding pagination to an unbounded query. + - `advisory`: Report-only items that should not become code-fix work. Examples: noting a design asymmetry the PR improves but doesn't fully resolve, flagging a residual risk, deployment notes. + Do not default to `advisory` when uncertain -- if a concrete fix is obvious, classify it as `safe_auto` or `gated_auto`. - Set `owner` to the default next actor for this finding: `review-fixer`, `downstream-resolver`, `human`, or `release`. - Set `requires_verification` to true whenever the likely fix needs targeted tests, a focused re-review, or operational validation before it should be trusted. - suggested_fix is optional. Only include it when the fix is obvious and correct. A bad suggestion is worse than none.