pressure test pr feedback

2026-02-16 15:59:42 -06:00
parent 25543e66f5
commit b0755f4050
7 changed files with 451 additions and 12 deletions
--- a/plugins/compound-engineering/commands/workflows/review.md
+++ b/plugins/compound-engineering/commands/workflows/review.md
@@ -214,7 +214,53 @@ Remove duplicates, prioritize by severity and impact.

 </synthesis_tasks>

-#### Step 2: Create Todo Files Using file-todos Skill
+#### Step 2: Pressure Test Each Finding
+
+<critical_evaluation>
+
+**IMPORTANT: Treat agent findings as suggestions, not mandates.**
+
+Not all findings are equally valid. Apply engineering judgment before creating todos. The goal is to make the right call for the codebase, not rubber-stamp every suggestion.
+
+**For each finding, verify:**
+
+| Check | Question |
+|-------|----------|
+| **Code** | Does the concern actually apply to this specific code? |
+| **Tests** | Are there existing tests that already cover this case? |
+| **Usage** | How is this code used in practice? Does the concern matter? |
+| **Compatibility** | Would the suggested change break anything? |
+| **Prior Decisions** | Was this intentional? Is there a documented reason? |
+| **Cost vs Benefit** | Is the fix worth the effort and risk? |
+
+**Assess each finding:**
+
+| Assessment | Meaning |
+|------------|---------|
+| **Clear & Correct** | Valid concern, well-reasoned, applies here |
+| **Unclear** | Ambiguous or missing context |
+| **Likely Incorrect** | Agent misunderstands code, context, or requirements |
+| **YAGNI** | Over-engineering, premature abstraction, no clear benefit |
+| **Duplicate** | Already covered by another finding (merge into existing) |
+
+**IMPORTANT: ALL findings become todos.** Never drop agent feedback - merge duplicates into the todo they repeat, and include the pressure test assessment IN each todo so `/triage` can use it.
+
+Each todo will include:
+- The assessment (Clear & Correct / Unclear / Likely Incorrect / YAGNI)
+- The verification results (what was checked)
+- Technical justification (why valid, or why you think it should be skipped)
+- Recommended action for triage (Fix now / Clarify / Push back / Skip)
+
+**Provide technical justification for all assessments:**
+- Don't just label - explain WHY with specific reasoning
+- Reference codebase constraints, requirements, or trade-offs
+- Example: "This abstraction would be YAGNI - we only have one implementation and no plans for variants. Adding it now increases complexity without clear benefit."
+
+The human reviews during `/triage` and makes the final call.
+
+</critical_evaluation>
+
+#### Step 3: Create Todo Files Using file-todos Skill

 <critical_instruction> Use the file-todos skill to create todo files for ALL findings immediately. Do NOT present findings one-by-one asking for user approval. Create all todo files in parallel using the skill, then summarize results to user. </critical_instruction>

@@ -224,7 +270,7 @@ Remove duplicates, prioritize by severity and impact.

 - Create todo files directly using Write tool
 - All findings in parallel for speed
- Use standard template from `.claude/skills/file-todos/assets/todo-template.md`
+- Invoke `Skill: "compound-engineering:file-todos"` and read the template from its assets directory
 - Follow naming convention: `{issue_id}-pending-{priority}-{description}.md`

 **Option B: Sub-Agents in Parallel (Recommended for Scale)** For large PRs with 15+ findings, use sub-agents to create finding files in parallel:
@@ -266,13 +312,13 @@ Sub-agents can:

 2. Use file-todos skill for structured todo management:

-   ```bash
-   skill: file-todos
+   ```
+   Skill: "compound-engineering:file-todos"
   ```

   The skill provides:

-   - Template location: `.claude/skills/file-todos/assets/todo-template.md`
+   - Template at `./assets/todo-template.md` (relative to skill directory)
   - Naming convention: `{issue_id}-{status}-{priority}-{description}.md`
   - YAML frontmatter structure: status, priority, issue_id, tags, dependencies
   - All required sections: Problem Statement, Findings, Solutions, etc.
@@ -292,7 +338,7 @@ Sub-agents can:
   004-pending-p3-unused-parameter.md
   ```

-5. Follow template structure from file-todos skill: `.claude/skills/file-todos/assets/todo-template.md`
+5. Follow template structure from file-todos skill (read `./assets/todo-template.md` from skill directory)

 **Todo File Structure (from template):**

@@ -300,6 +346,10 @@ Each todo must include:

 - **YAML frontmatter**: status, priority, issue_id, tags, dependencies
 - **Problem Statement**: What's broken/missing, why it matters
+- **Assessment (Pressure Test)**: Verification results and engineering judgment
+  - Assessment: Clear & Correct / Unclear / Likely Incorrect / YAGNI
+  - Verified: Code, Tests, Usage, Compatibility, Prior Decisions, Cost vs Benefit
+  - Technical Justification: Why this finding is valid (or why you think it should be skipped)
 - **Findings**: Discoveries from agents with evidence/location
 - **Proposed Solutions**: 2-3 options, each with pros/cons/effort/risk
 - **Recommended Action**: (Filled during triage, leave blank initially)
@@ -333,7 +383,7 @@ Examples:

 **Tagging:** Always add `code-review` tag, plus: `security`, `performance`, `architecture`, `rails`, `quality`, etc.

-#### Step 3: Summary Report
+#### Step 4: Summary Report

 After creating all todo files, present comprehensive summary:

@@ -367,13 +417,27 @@ After creating all todo files, present comprehensive summary:

 ### Review Agents Used:

- kieran-rails-reviewer
+- kieran-python-reviewer
 - security-sentinel
 - performance-oracle
 - architecture-strategist
 - agent-native-reviewer
 - [other agents]

+### Assessment Summary (Pressure Test Results):
+
+All agent findings were pressure tested and included in todos:
+
+| Assessment | Count | Description |
+|------------|-------|-------------|
+| **Clear & Correct** | {X} | Valid concerns, recommend fixing |
+| **Unclear** | {X} | Need clarification before implementing |
+| **Likely Incorrect** | {X} | May misunderstand context - review during triage |
+| **YAGNI** | {X} | May be over-engineering - review during triage |
+| **Duplicate** | {X} | Merged into other findings |
+
+**Note:** All assessments are included in the todo files. Human judgment during `/triage` makes the final call on whether to accept, clarify, or reject each item.
+
 ### Next Steps:

 1. **Address P1 Findings**: CRITICAL - must be fixed before merge