feat(document-review): collapse batch_confirm tier into auto (#432)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:03:05 -07:00
parent 3706a9764b
commit 0f5715d562
4 changed files with 41 additions and 79 deletions
--- a/plugins/compound-engineering/skills/document-review/references/findings-schema.json
+++ b/plugins/compound-engineering/skills/document-review/references/findings-schema.json
@@ -45,8 +45,8 @@
          },
          "autofix_class": {
            "type": "string",
-            "enum": ["auto", "batch_confirm", "present"],
-            "description": "How this issue should be handled. auto = local deterministic fix applied silently (terminology, formatting, cross-references, completeness corrections). batch_confirm = obvious fix with a clear correct answer, but touches meaning enough to warrant grouped approval. present = requires individual user judgment."
+            "enum": ["auto", "present"],
+            "description": "How this issue should be handled. auto = one clear correct fix that can be applied silently (terminology, formatting, cross-references, completeness corrections, additions mechanically implied by other content). present = requires individual user judgment."
          },
          "finding_type": {
            "type": "string",
@@ -97,9 +97,8 @@
      "P3": "Minor improvement. User's discretion."
    },
    "autofix_classes": {
-      "_principle": "Autofix class is independent of severity. A P1 finding can be auto if the fix is deterministic. The test: can the correct fix be derived from the document's own content without judgment?",
-      "auto": "Fix is derivable from the document itself -- one part is clearly authoritative over another, reconcile toward the authority. Includes: summary/detail mismatches, wrong counts, missing list entries, stale cross-references, terminology drift, prose/diagram contradictions where prose is authoritative. Must include suggested_fix.",
-      "batch_confirm": "One clear correct answer, but authors new content where exact wording needs verification. Grouped for single approval. Examples: adding a missing step mechanically implied by other content, defining an implied-but-unstated threshold. Must include suggested_fix.",
+      "_principle": "Autofix class is independent of severity. A P1 finding can be auto if the fix is obvious. The test: is there one clear correct fix, or does resolving this require judgment?",
+      "auto": "One clear correct fix -- applied silently. Includes both internal reconciliation (summary/detail mismatches, wrong counts, stale cross-references, terminology drift) and additions mechanically implied by other content (missing steps, unstated thresholds, completeness gaps where the correct content is obvious). Must include suggested_fix.",
      "present": "Requires individual user judgment -- strategic questions, design tradeoffs, or findings where reasonable people could disagree on the right action."
    },
    "finding_types": {
--- a/plugins/compound-engineering/skills/document-review/references/review-output-template.md
+++ b/plugins/compound-engineering/skills/document-review/references/review-output-template.md
@@ -15,22 +15,15 @@ Use this **exact format** when presenting synthesized review findings. Findings
 - security-lens -- plan adds public API endpoint with auth flow
 - scope-guardian -- plan has 15 requirements across 3 priority levels

-Applied 3 auto-fixes. Batched 2 fixes for approval. 4 findings to consider (2 errors, 2 omissions).
+Applied 5 auto-fixes. 4 findings to consider (2 errors, 2 omissions).

 ### Auto-fixes Applied

 - Standardized "pipeline"/"workflow" terminology to "pipeline" throughout (coherence)
 - Fixed cross-reference: Section 4 referenced "Section 3.2" which is actually "Section 3.1" (coherence)
 - Updated unit count from "6 units" to "7 units" to match listed units (coherence)
-
-### Batch Confirm
-
-These fixes have one clear correct answer but touch document meaning. Apply all?
-
-| # | Section | Fix | Reviewer |
-|---|---------|-----|----------|
-| 1 | Unit 4 | Add "update API rate-limit config" step -- implied by Unit 3's rate-limit introduction | feasibility |
-| 2 | Verification | Add auth token refresh to test scenarios -- required by Unit 2's token expiry handling | security-lens |
+- Added "update API rate-limit config" step to Unit 4 -- implied by Unit 3's rate-limit introduction (feasibility)
+- Added auth token refresh to test scenarios -- required by Unit 2's token expiry handling (security-lens)

 ### P0 -- Must Fix

@@ -76,22 +69,21 @@ These fixes have one clear correct answer but touch document meaning. Apply all?

 ### Coverage

-| Persona | Status | Findings | Auto | Batch | Present | Residual |
-|---------|--------|----------|------|-------|---------|----------|
-| coherence | completed | 3 | 2 | 0 | 1 | 0 |
-| feasibility | completed | 2 | 0 | 1 | 1 | 1 |
-| security-lens | completed | 2 | 0 | 1 | 1 | 0 |
-| scope-guardian | completed | 1 | 0 | 0 | 1 | 0 |
-| product-lens | not activated | -- | -- | -- | -- | -- |
-| design-lens | not activated | -- | -- | -- | -- | -- |
+| Persona | Status | Findings | Auto | Present | Residual |
+|---------|--------|----------|------|---------|----------|
+| coherence | completed | 4 | 3 | 1 | 0 |
+| feasibility | completed | 2 | 1 | 1 | 1 |
+| security-lens | completed | 2 | 1 | 1 | 0 |
+| scope-guardian | completed | 1 | 0 | 1 | 0 |
+| product-lens | not activated | -- | -- | -- | -- |
+| design-lens | not activated | -- | -- | -- | -- |
 ```

 ## Section Rules

- **Summary line**: Always present after the reviewer list. Format: "Applied N auto-fixes. Batched M fixes for approval. K findings to consider (X errors, Y omissions)." Omit any zero clause.
- **Auto-fixes Applied**: List fixes that were applied automatically (auto class). Omit section if none.
- **Batch Confirm**: Group `batch_confirm` findings for a single yes/no/select approval. Omit section if none.
+- **Summary line**: Always present after the reviewer list. Format: "Applied N auto-fixes. K findings to consider (X errors, Y omissions)." Omit any zero clause.
+- **Auto-fixes Applied**: List all fixes that were applied automatically (auto class). Include enough detail per fix to convey the substance -- especially for fixes that add content or touch document meaning. Omit section if none.
 - **P0-P3 sections**: Only include sections that have findings. Omit empty severity levels. Within each severity, separate into **Errors** and **Omissions** sub-headers. Omit a sub-header if that severity has none of that type.
 - **Residual Concerns**: Findings below confidence threshold that were promoted by cross-persona corroboration, plus unpromoted residual risks. Omit if none.
 - **Deferred Questions**: Questions for later workflow stages. Omit if none.
- **Coverage**: Always include. All counts are **post-synthesis**. **Findings** must equal Auto + Batch + Present exactly -- if deduplication merged a finding across personas, attribute it to the persona with the highest confidence and reduce the other persona's count. **Residual** = count of `residual_risks` from this persona's raw output (not the promoted subset in the Residual Concerns section).
+- **Coverage**: Always include. All counts are **post-synthesis**. **Findings** must equal Auto + Present exactly -- if deduplication merged a finding across personas, attribute it to the persona with the highest confidence and reduce the other persona's count. **Residual** = count of `residual_risks` from this persona's raw output (not the promoted subset in the Residual Concerns section).
--- a/plugins/compound-engineering/skills/document-review/references/subagent-template.md
+++ b/plugins/compound-engineering/skills/document-review/references/subagent-template.md
@@ -25,18 +25,14 @@ Rules:
 - Set `finding_type` for every finding:
  - `error`: Something the document says that is wrong -- contradictions, incorrect statements, design tensions, incoherent tradeoffs.
  - `omission`: Something the document forgot to say -- missing mechanical steps, absent list entries, undefined thresholds, forgotten cross-references.
- Set `autofix_class` based on determinism, not severity. A P1 finding can be `auto` if the correct fix is derivable from the document itself:
-  - `auto`: The correct fix is derivable from the document's own content without judgment about what to write. The test: is one part of the document clearly authoritative over another? If yes, reconcile toward the authority. Examples:
-    - Summary/detail mismatch: overview says "3 phases" but body describes 4 in detail -- update the summary
-    - Wrong count: "the following 3 steps" but 4 are listed -- fix the count
-    - Missing list entry where the correct entry exists elsewhere in the document
-    - Stale internal reference: "as described in Phase 3" but content moved to Phase 4 -- fix the pointer
-    - Terminology drift: document uses both "pipeline" and "workflow" for the same concept -- standardize to the more frequent term
-    - Prose/diagram contradiction where prose is more detailed and authoritative -- update the diagram description to match
+- Set `autofix_class` based on whether there is one clear correct fix, not on severity. A P1 finding can be `auto` if the fix is obvious:
+  - `auto`: One clear correct fix. Applied silently without asking. The test: is there only one reasonable way to resolve this? If yes, it is auto. Two categories:
+    - Internal reconciliation: one part of the document is authoritative over another -- reconcile toward the authority. Examples: summary/detail mismatches, wrong counts, missing list entries derivable from elsewhere, stale cross-references, terminology drift, prose/diagram contradictions where prose is authoritative.
+    - Implied additions: the correct content is mechanically obvious from the document's own context. Examples: adding a missing implementation step implied by other content, defining a threshold implied but never stated, completeness gaps where what to add is clear.
    Always include `suggested_fix` for auto findings.
-  - `batch_confirm`: One clear correct answer, but it authors new content where exact wording needs verification. The test: would reasonable people agree on WHAT to fix but potentially disagree on the exact PHRASING? Examples: adding a missing implementation step that is mechanically implied by other content, defining a threshold that is implied but never stated explicitly. Always include `suggested_fix` for batch_confirm findings.
+    NOT auto (the gap is clear but more than one reasonable fix exists): choosing an implementation approach when the document states a need without constraining how (e.g., "support offline mode" could mean service workers, local-first database, or queue-and-sync -- there is no single obvious answer), changing scope or priority where the author may have weighed tradeoffs the reviewer can't see (e.g., promoting a P2 to P1, or cutting a feature the document intentionally keeps at a lower tier).
  - `present`: Requires judgment -- strategic questions, tradeoffs, design tensions where reasonable people could disagree, findings where the right action is unclear.
- `suggested_fix` is required for `auto` and `batch_confirm` findings (see above). For `present` findings, `suggested_fix` is optional -- include it only when the fix is obvious, and frame as a question when the right action is unclear.
+- `suggested_fix` is required for `auto` findings. For `present` findings, `suggested_fix` is optional -- include it only when the fix is obvious, and frame as a question when the right action is unclear.
 - If you find no issues, return an empty findings array. Still populate residual_risks and deferred_questions if applicable.
 - Use your suppress conditions. Do not flag issues that belong to other personas.
 </output-contract>