diff --git a/plugins/compound-engineering/skills/document-review/SKILL.md b/plugins/compound-engineering/skills/document-review/SKILL.md
index 468ced8..4d8c5a6 100644
--- a/plugins/compound-engineering/skills/document-review/SKILL.md
+++ b/plugins/compound-engineering/skills/document-review/SKILL.md
@@ -121,6 +121,7 @@ Fingerprint each finding using `normalize(section) + normalize(title)`. Normaliz
 When fingerprints match across personas:
 - If the findings recommend **opposing actions** (e.g., one says cut, the other says keep), do not merge -- preserve both for contradiction resolution in 3.5
 - Otherwise merge: keep the highest severity, keep the highest confidence, union all evidence arrays, note all agreeing reviewers (e.g., "coherence, feasibility")
+- **Coverage attribution:** Attribute the merged finding to the persona with the highest confidence. Decrement the losing persona's Findings count *and* the corresponding route bucket (Auto, Batch, or Present) so `Findings = Auto + Batch + Present` stays exact.
 
 ### 3.4 Promote Residual Concerns
 
@@ -143,15 +144,17 @@ Specific conflict patterns:
 
 ### 3.6 Route by Autofix Class
 
+**Severity and autofix_class are independent.** A P1 finding can be `auto` if the correct fix is deterministic. The test is not "how important?" but "can the fix be derived from the document's own content without judgment?"
+
 | Autofix Class | Route |
 |---------------|-------|
-| `auto` | Apply automatically -- local deterministic fix (terminology, formatting, cross-references, completeness corrections where the correct value is verifiable from the document itself) |
-| `batch_confirm` | Group for single batch approval -- obvious fixes that touch meaning but have one clear correct answer |
+| `auto` | Apply automatically -- fix is derivable from the document itself. One part of the document is clearly authoritative over another; reconcile toward the authority. |
+| `batch_confirm` | Group for single batch approval -- one clear correct answer, but authors new content where exact wording needs verification |
 | `present` | Present individually for user judgment |
 
 Demote any `auto` finding that lacks a `suggested_fix` to `batch_confirm`. Demote any `batch_confirm` finding that lacks a `suggested_fix` to `present`.
 
-**Completeness corrections eligible for `auto`:** A finding qualifies when the correct fix is deterministically derivable from other content in the document. Examples: a count says "6 units" but the document lists 7, a summary omits an item that appears in the detailed list, a cross-reference points to a renamed section. If the fix requires judgment about *what* to add (not just *that* something is missing), it belongs in `batch_confirm` or `present`.
+**Auto-eligible patterns:** summary/detail mismatch (body is authoritative over overview), wrong counts, missing list entries derivable from elsewhere in the document, stale internal cross-references, terminology drift, prose/diagram contradictions where prose is more detailed. If the fix requires judgment about *what* to write (not just *that* something needs updating), it belongs in `batch_confirm` or `present`.
 
 ### 3.7 Sort
 
@@ -168,12 +171,19 @@ Apply all `auto` findings to the document in a **single pass**:
 
 ### Batch Confirm
 
-If any `batch_confirm` findings exist, present them as a group for a single approval:
-- List the proposed fixes in a numbered table
-- Use the platform's blocking question tool (AskUserQuestion in Claude Code, request_user_input in Codex, ask_user in Gemini) to ask: "Apply these N fixes? (yes/no/select)". If no blocking question tool is available, present the table with numbered options and wait for the user's reply before proceeding.
-- If approved, apply all in a single pass
-- If "select", let the user pick which to apply
-- If rejected, demote remaining to the `present` findings list
+If any `batch_confirm` findings exist:
+
+1. Present the proposed fixes in a numbered table (see template)
+2. **Ask for approval using the platform's interactive question tool** -- do not print the question as plain text output:
+   - Claude Code: `AskUserQuestion`
+   - Codex: `request_user_input`
+   - Gemini: `ask_user`
+   - Fallback (no question tool available): present numbered options and stop; wait for the user's next message before proceeding
+3. Question text: "Apply these N fixes? (yes/no/select)"
+4. Handle the response:
+   - **yes**: Apply all in a single pass
+   - **select**: Let the user pick which to apply
+   - **no**: Demote remaining to the `present` findings list
 
 This turns N obvious-but-meaning-touching fixes into 1 interaction instead of N.
 
@@ -198,7 +208,11 @@ These are pipeline artifacts and must not be flagged for removal.
 
 ## Phase 5: Next Action
 
-Use the platform's blocking question tool when available (AskUserQuestion in Claude Code, request_user_input in Codex, ask_user in Gemini). Otherwise present numbered options and wait for the user's reply.
+**Ask using the platform's interactive question tool** -- do not print the question as plain text output:
+- Claude Code: `AskUserQuestion`
+- Codex: `request_user_input`
+- Gemini: `ask_user`
+- Fallback (no question tool available): present numbered options and stop; wait for the user's next message
 
 Offer:
 
diff --git a/plugins/compound-engineering/skills/document-review/references/findings-schema.json b/plugins/compound-engineering/skills/document-review/references/findings-schema.json
index 52515e4..1346e97 100644
--- a/plugins/compound-engineering/skills/document-review/references/findings-schema.json
+++ b/plugins/compound-engineering/skills/document-review/references/findings-schema.json
@@ -97,9 +97,10 @@
     "P3": "Minor improvement. User's discretion."
   },
   "autofix_classes": {
-    "auto": "Local, deterministic document fix: terminology consistency, formatting, cross-reference correction, completeness corrections (wrong counts, missing list entries, undefined values where the correct value is verifiable from elsewhere in the document). Must be unambiguous.",
-    "batch_confirm": "Obvious fix with a clear correct answer, but requires authoring new content where exact wording needs verification. Grouped with other batch_confirm findings for a single approval rather than individual review. Examples: adding a missing implementation step that is mechanically implied, rewording a section to resolve an ambiguity. Note: fixes that are deterministically derivable from existing document content (updating a summary to match its own list, correcting a count, adding group headers to requirements) belong in auto, not here.",
-    "present": "Requires individual user judgment -- strategic questions, design tradeoffs, meaning-changing fixes, or findings where reasonable people could disagree on the right action."
+    "_principle": "Autofix class is independent of severity. A P1 finding can be auto if the fix is deterministic. The test: can the correct fix be derived from the document's own content without judgment?",
+    "auto": "Fix is derivable from the document itself -- one part is clearly authoritative over another, reconcile toward the authority. Includes: summary/detail mismatches, wrong counts, missing list entries, stale cross-references, terminology drift, prose/diagram contradictions where prose is authoritative. Must include suggested_fix.",
+    "batch_confirm": "One clear correct answer, but authors new content where exact wording needs verification. Grouped for single approval. Examples: adding a missing step mechanically implied by other content, defining an implied-but-unstated threshold. Must include suggested_fix.",
+    "present": "Requires individual user judgment -- strategic questions, design tradeoffs, or findings where reasonable people could disagree on the right action."
   },
   "finding_types": {
     "error": "Something the document says that is wrong -- contradictions, incorrect statements, design tensions, incoherent tradeoffs. These are mistakes in what exists.",
diff --git a/plugins/compound-engineering/skills/document-review/references/review-output-template.md b/plugins/compound-engineering/skills/document-review/references/review-output-template.md
index 4e585d9..8bf8eb9 100644
--- a/plugins/compound-engineering/skills/document-review/references/review-output-template.md
+++ b/plugins/compound-engineering/skills/document-review/references/review-output-template.md
@@ -94,4 +94,4 @@ These fixes have one clear correct answer but touch document meaning. Apply all?
 - **P0-P3 sections**: Only include sections that have findings. Omit empty severity levels. Within each severity, separate into **Errors** and **Omissions** sub-headers. Omit a sub-header if that severity has none of that type.
 - **Residual Concerns**: Findings below confidence threshold that were promoted by cross-persona corroboration, plus unpromoted residual risks. Omit if none.
 - **Deferred Questions**: Questions for later workflow stages. Omit if none.
-- **Coverage**: Always include. Shows which personas ran and their output counts broken down by route (Auto, Batch, Present).
+- **Coverage**: Always include. All counts are **post-synthesis**. **Findings** must equal Auto + Batch + Present exactly -- if deduplication merged a finding across personas, attribute it to the persona with the highest confidence and reduce the other persona's count. **Residual** = count of `residual_risks` from this persona's raw output (not the promoted subset in the Residual Concerns section).
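The routing and demotion rules in SKILL.md 3.6 and the schema's `autofix_classes` reduce to a small pure function. A minimal sketch, assuming findings are dicts shaped like the findings-schema; the name `route_finding` is illustrative, and the chaining of the two demotion rules (an `auto` finding with no `suggested_fix` falls through `batch_confirm` to `present`) is this sketch's reading, since the patch states the two rules separately without saying whether they compose:

```python
def route_finding(finding: dict) -> str:
    """Route a finding by autofix_class, applying the 3.6 demotion rules.

    Assumed composition: auto without suggested_fix demotes to
    batch_confirm, which itself requires suggested_fix, so the
    finding ends up at present.
    """
    cls = finding.get("autofix_class", "present")
    has_fix = bool(finding.get("suggested_fix"))
    if cls == "auto" and not has_fix:
        cls = "batch_confirm"  # demotion rule 1
    if cls == "batch_confirm" and not has_fix:
        cls = "present"        # demotion rule 2
    return cls
```

Read this way, the function never returns a class whose `suggested_fix` requirement is unmet, which is what keeps the `auto` pass safe to apply unattended.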
diff --git a/plugins/compound-engineering/skills/document-review/references/subagent-template.md b/plugins/compound-engineering/skills/document-review/references/subagent-template.md
index 870a8c2..997dbd7 100644
--- a/plugins/compound-engineering/skills/document-review/references/subagent-template.md
+++ b/plugins/compound-engineering/skills/document-review/references/subagent-template.md
@@ -25,11 +25,18 @@ Rules:
 - Set `finding_type` for every finding:
   - `error`: Something the document says that is wrong -- contradictions, incorrect statements, design tensions, incoherent tradeoffs.
   - `omission`: Something the document forgot to say -- missing mechanical steps, absent list entries, undefined thresholds, forgotten cross-references.
-- Set `autofix_class` conservatively:
-  - `auto`: Deterministic fixes where the correct value is verifiable from the document itself -- terminology corrections, formatting fixes, cross-reference repairs, wrong counts, missing list entries where the correct entry exists elsewhere in the document. The fix must be unambiguous.
-  - `batch_confirm`: Obvious fix with one clear correct answer, but it touches document meaning. Examples: adding a missing implementation step that is mechanically implied by other content, updating a summary to match its own details. Use when reasonable people would agree on the fix but it goes beyond cosmetic correction.
-  - `present`: Everything else -- strategic questions, tradeoffs, design tensions where reasonable people could disagree, informational findings.
-- `suggested_fix` is optional. Only include it when the fix is obvious and correct. For `present` findings, frame as a question instead.
+- Set `autofix_class` based on determinism, not severity. A P1 finding can be `auto` if the correct fix is derivable from the document itself:
+  - `auto`: The correct fix is derivable from the document's own content without judgment about what to write. The test: is one part of the document clearly authoritative over another? If yes, reconcile toward the authority. Examples:
+    - Summary/detail mismatch: overview says "3 phases" but body describes 4 in detail -- update the summary
+    - Wrong count: "the following 3 steps" but 4 are listed -- fix the count
+    - Missing list entry where the correct entry exists elsewhere in the document
+    - Stale internal reference: "as described in Phase 3" but content moved to Phase 4 -- fix the pointer
+    - Terminology drift: document uses both "pipeline" and "workflow" for the same concept -- standardize to the more frequent term
+    - Prose/diagram contradiction where prose is more detailed and authoritative -- update the diagram description to match
+    Always include `suggested_fix` for auto findings.
+  - `batch_confirm`: One clear correct answer, but it authors new content where exact wording needs verification. The test: would reasonable people agree on WHAT to fix but potentially disagree on the exact PHRASING? Examples: adding a missing implementation step that is mechanically implied by other content, defining a threshold that is implied but never stated explicitly. Always include `suggested_fix` for batch_confirm findings.
+  - `present`: Requires judgment -- strategic questions, tradeoffs, design tensions where reasonable people could disagree, findings where the right action is unclear.
+- `suggested_fix` is required for `auto` and `batch_confirm` findings (see above). For `present` findings, `suggested_fix` is optional -- include it only when the fix is obvious, and frame as a question when the right action is unclear.
 - If you find no issues, return an empty findings array. Still populate residual_risks and deferred_questions if applicable.
 - Use your suppress conditions. Do not flag issues that belong to other personas.
 
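The synthesis rules this patch touches -- fingerprint on `normalize(section) + normalize(title)`, merge keeping the highest severity and confidence, and coverage attribution to the highest-confidence persona -- can be sketched as follows. All names and the dict shape are illustrative, not the skill's actual implementation; in particular the body of `normalize()` is a guess, since the patch truncates its definition:

```python
import re

SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}  # P0 outranks P3

def normalize(text: str) -> str:
    # One plausible normalization: lowercase, strip punctuation,
    # collapse whitespace. The skill's own normalize() may differ.
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", text.lower())).strip()

def fingerprint(finding: dict) -> str:
    return normalize(finding["section"]) + "|" + normalize(finding["title"])

def merge(a: dict, b: dict) -> dict:
    # Keep highest severity and confidence, union evidence, note all
    # agreeing reviewers. The opposing-actions check (do not merge
    # contradictory recommendations) is assumed to run before this.
    winner = a if a["confidence"] >= b["confidence"] else b
    merged = dict(winner)
    merged["severity"] = min(a["severity"], b["severity"], key=SEVERITY_RANK.get)
    merged["confidence"] = max(a["confidence"], b["confidence"])
    merged["evidence"] = sorted(set(a["evidence"]) | set(b["evidence"]))
    merged["reviewers"] = sorted(set(a["reviewers"]) | set(b["reviewers"]))
    return merged

def attribute_coverage(coverage: dict, loser: str, route: str) -> None:
    # Coverage attribution: the losing persona gives up one from its
    # findings total and one from the matching route bucket, keeping
    # findings == auto + batch + present exact.
    coverage[loser]["findings"] -= 1
    coverage[loser][route] -= 1
```

The `attribute_coverage` step is what preserves the output template's invariant that Findings equals Auto + Batch + Present after deduplication.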