pressure test pr feedback
Some checks failed
CI / test (push) Has been cancelled

This commit is contained in:
John Lamb
2026-02-16 15:59:42 -06:00
parent 25543e66f5
commit b0755f4050
7 changed files with 451 additions and 12 deletions

View File

@@ -11,8 +11,8 @@
"plugins": [
{
"name": "compound-engineering",
"description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 27 specialized agents, 24 commands, and 15 skills.",
"version": "2.29.0",
"description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 27 specialized agents, 25 commands, and 15 skills.",
"version": "2.30.0",
"author": {
"name": "Kieran Klaassen",
"url": "https://github.com/kieranklaassen",

View File

@@ -1,7 +1,7 @@
{
"name": "compound-engineering",
"version": "2.29.0",
"description": "AI-powered development tools. 27 agents, 24 commands, 15 skills, 1 MCP server for code review, research, design, and workflow automation.",
"version": "2.30.0",
"description": "AI-powered development tools. 27 agents, 25 commands, 15 skills, 1 MCP server for code review, research, design, and workflow automation.",
"author": {
"name": "Kieran Klaassen",
"email": "kieran@every.to",

View File

@@ -5,6 +5,24 @@ All notable changes to the compound-engineering plugin will be documented in thi
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [2.30.0] - 2026-02-03
### Added
- **`/pr-comments-to-todos` command** - Fetch PR review comments and convert them into todo files for triage
- Fetches all comments from a GitHub PR using `gh` CLI
- Converts each actionable comment into a structured todo file following the file-todos skill format
- Assigns priority levels (P1/P2/P3) based on comment severity
- Creates todo files compatible with `/triage` command for approval workflow
- Skips non-actionable comments (LGTM, resolved threads, etc.)
- Supports PR number, GitHub URL, or "current" for current branch
### Summary
- 27 agents, 25 commands, 15 skills, 1 MCP server
---
## [2.29.0] - 2026-01-26
### Changed

View File

@@ -0,0 +1,334 @@
---
name: pr-comments-to-todos
description: Fetch PR comments and convert them into todo files for triage
argument-hint: "[PR number, GitHub URL, or 'current' for current branch PR]"
---
# PR Comments to Todos
Convert GitHub PR review comments into structured todo files compatible with `/triage`.
<command_purpose>Fetch all review comments from a PR and create individual todo files in the `todos/` directory, following the file-todos skill format.</command_purpose>
## Review Target
<review_target> #$ARGUMENTS </review_target>
## Workflow
### 1. Identify PR and Fetch Comments
<task_list>
- [ ] Determine the PR to process:
- If numeric: use as PR number directly
- If GitHub URL: extract PR number from URL
- If "current" or empty: detect from current branch with `gh pr status`
- [ ] Fetch PR metadata: `gh pr view PR_NUMBER --json title,body,url,author,headRefName`
- [ ] Fetch all review comments: `gh api repos/{owner}/{repo}/pulls/{PR_NUMBER}/comments`
- [ ] Fetch review thread comments: `gh pr view PR_NUMBER --json reviews,reviewDecision`
- [ ] Group comments by file/thread for context
</task_list>
### 2. Pressure Test Each Comment
<critical_evaluation>
**IMPORTANT: Treat reviewer comments as suggestions, not orders.**
Before creating a todo, apply engineering judgment to each comment. Not all feedback is equally valid - your job is to make the right call for the codebase, not just please the reviewer.
#### Step 2a: Verify Before Accepting
For each comment, verify:
- [ ] **Check the code**: Does the concern actually apply to this code?
- [ ] **Check tests**: Are there existing tests that cover this case?
- [ ] **Check usage**: How is this code actually used? Does the concern matter in practice?
- [ ] **Check compatibility**: Would the suggested change break anything?
- [ ] **Check prior decisions**: Was this intentional? Is there a reason it's done this way?
#### Step 2b: Assess Each Comment
Assign an assessment to each comment:
| Assessment | Meaning |
|------------|---------|
| **Clear & Correct** | Valid concern, well-reasoned, applies to this code |
| **Unclear** | Ambiguous, missing context, or doesn't specify what to change |
| **Likely Incorrect** | Misunderstands the code, context, or requirements |
| **YAGNI** | Over-engineering, premature abstraction, no clear benefit |
#### Step 2c: Include Assessment in Todo
**IMPORTANT: ALL comments become todos.** Never drop feedback - include the pressure test assessment IN the todo so `/triage` can use it to decide.
For each comment, the todo will include:
- The assessment (Clear & Correct / Unclear / Likely Incorrect / YAGNI)
- The verification results (what was checked)
- Technical justification (why valid, or why you think it should be skipped)
- Recommended action for triage (Fix now / Clarify / Push back / Skip)
The human reviews during `/triage` and makes the final call.
</critical_evaluation>
### 3. Categorize All Comments
<categorization>
For ALL comments (regardless of assessment), determine:
**Severity (Priority):**
- 🔴 **P1 (Critical)**: Security issues, data loss risks, breaking changes, blocking bugs
- 🟡 **P2 (Important)**: Performance issues, architectural concerns, significant code quality
- 🔵 **P3 (Nice-to-have)**: Style suggestions, minor improvements, documentation
**Category Tags:**
- `security` - Security vulnerabilities or concerns
- `performance` - Performance issues or optimizations
- `architecture` - Design or structural concerns
- `bug` - Functional bugs or edge cases
- `quality` - Code quality, readability, maintainability
- `testing` - Test coverage or test quality
- `documentation` - Missing or unclear documentation
- `style` - Code style or formatting
- `needs-clarification` - Comment requires clarification before implementing
- `pushback-candidate` - Human should review before accepting
**Skip these (don't create todos):**
- Simple acknowledgments ("LGTM", "Looks good")
- Questions that were answered inline
- Already resolved threads
**Note:** Comments assessed as YAGNI or Likely Incorrect still become todos with that assessment included. The human decides during `/triage` whether to accept or reject.
</categorization>
### 4. Create Todo Files Using file-todos Skill
<critical_instruction>Create todo files for ALL actionable comments immediately. Use the file-todos skill structure and naming convention.</critical_instruction>
#### Determine Next Issue ID
```bash
# Find the highest existing issue ID
ls todos/ 2>/dev/null | grep -o '^[0-9]\+' | sort -n | tail -1 | awk '{printf "%03d", $1+1}'
# If no todos exist, start with 001
```
#### File Naming Convention
```
{issue_id}-pending-{priority}-{brief-description}.md
```
Examples:
```
001-pending-p1-sql-injection-vulnerability.md
002-pending-p2-missing-error-handling.md
003-pending-p3-rename-variable-for-clarity.md
```
#### Todo File Structure
For each comment, create a file with this structure:
```yaml
---
status: pending
priority: p1 # or p2, p3 based on severity
issue_id: "001"
tags: [code-review, pr-feedback, {category}]
dependencies: []
---
```
```markdown
# [Brief Title from Comment]
## Problem Statement
[Summarize the reviewer's concern - what is wrong or needs improvement]
**PR Context:**
- PR: #{PR_NUMBER} - {PR_TITLE}
- File: {file_path}:{line_number}
- Reviewer: @{reviewer_username}
## Assessment (Pressure Test)
| Criterion | Result |
|-----------|--------|
| **Assessment** | Clear & Correct / Unclear / Likely Incorrect / YAGNI |
| **Recommended Action** | Fix now / Clarify / Push back / Skip |
| **Verified Code?** | Yes/No - [what was checked] |
| **Verified Tests?** | Yes/No - [existing coverage] |
| **Verified Usage?** | Yes/No - [how code is used] |
| **Prior Decisions?** | Yes/No - [any intentional design] |
**Technical Justification:**
[If pushing back or marking YAGNI, provide specific technical reasoning. Reference codebase constraints, requirements, or trade-offs. Example: "This abstraction would be YAGNI - we only have one implementation and no plans for variants."]
## Findings
- **Original Comment:** "{exact reviewer comment}"
- **Location:** `{file_path}:{line_number}`
- **Code Context:**
```{language}
{relevant code snippet}
```
- **Why This Matters:** [Impact if not addressed, or why it doesn't matter]
## Proposed Solutions
### Option 1: [Primary approach based on reviewer suggestion]
**Approach:** [Describe the fix]
**Pros:**
- Addresses reviewer concern directly
- [Other benefits]
**Cons:**
- [Any drawbacks]
**Effort:** Small / Medium / Large
**Risk:** Low / Medium / High
---
### Option 2: [Alternative if applicable]
[Only include if there's a meaningful alternative approach]
## Recommended Action
*(To be filled during triage)*
## Technical Details
**Affected Files:**
- `{file_path}:{line_number}` - {what needs changing}
**Related Components:**
- [Components affected by this change]
## Resources
- **PR:** #{PR_NUMBER}
- **Comment Link:** {direct_link_to_comment}
- **Reviewer:** @{reviewer_username}
## Acceptance Criteria
- [ ] Reviewer concern addressed
- [ ] Tests pass
- [ ] Code reviewed and approved
- [ ] PR comment resolved
## Work Log
### {today's date} - Created from PR Review
**By:** Claude Code
**Actions:**
- Extracted comment from PR #{PR_NUMBER} review
- Created todo for triage
**Learnings:**
- Original reviewer context: {any additional context}
```
### 5. Parallel Todo Creation (For Multiple Comments)
<parallel_processing>
When processing PRs with many comments (5+), create todos in parallel for efficiency:
1. Synthesize all comments into a categorized list
2. Assign severity (P1/P2/P3) to each
3. Launch parallel Write operations for all todos
4. Each todo follows the file-todos skill template exactly
</parallel_processing>
### 6. Summary Report
After creating all todo files, present:
````markdown
## ✅ PR Comments Converted to Todos
**PR:** #{PR_NUMBER} - {PR_TITLE}
**Branch:** {branch_name}
**Total Comments Processed:** {X}
### Created Todo Files:
**🔴 P1 - Critical:**
- `{id}-pending-p1-{desc}.md` - {summary}
**🟡 P2 - Important:**
- `{id}-pending-p2-{desc}.md` - {summary}
**🔵 P3 - Nice-to-Have:**
- `{id}-pending-p3-{desc}.md` - {summary}
### Skipped (Not Actionable):
- {count} comments skipped (LGTM, questions answered, resolved threads)
### Assessment Summary:
All comments were pressure tested and included in todos:
| Assessment | Count | Description |
|------------|-------|-------------|
| **Clear & Correct** | {X} | Valid concerns, recommend fixing |
| **Unclear** | {X} | Need clarification before implementing |
| **Likely Incorrect** | {X} | May misunderstand context - review during triage |
| **YAGNI** | {X} | May be over-engineering - review during triage |
**Note:** All assessments are included in the todo files. Human judgment during `/triage` makes the final call on whether to accept, clarify, or reject each item.
### Next Steps:
1. **Triage the todos:**
```bash
/triage
```
Review each todo and approve (pending → ready) or skip
2. **Work on approved items:**
```bash
/resolve_todo_parallel
```
3. **After fixes, resolve PR comments:**
```bash
bin/resolve-pr-thread THREAD_ID
```
````
## Important Notes
<requirements>
- Ensure `todos/` directory exists before creating files
- Each todo must have unique issue_id (never reuse)
- All todos start with `status: pending` for triage
- Include `code-review` and `pr-feedback` tags on all todos
- Preserve exact reviewer quotes in Findings section
- Link back to original PR and comment in Resources
</requirements>
## Integration with /triage
The output of this command is designed to work seamlessly with `/triage`:
1. **This command** creates `todos/*-pending-*.md` files
2. **`/triage`** reviews each pending todo and:
- Approves → renames to `*-ready-*.md`
- Skips → deletes the todo file
3. **`/resolve_todo_parallel`** works on approved (ready) todos

View File

@@ -214,7 +214,53 @@ Remove duplicates, prioritize by severity and impact.
</synthesis_tasks>
#### Step 2: Create Todo Files Using file-todos Skill
#### Step 2: Pressure Test Each Finding
<critical_evaluation>
**IMPORTANT: Treat agent findings as suggestions, not mandates.**
Not all findings are equally valid. Apply engineering judgment before creating todos. The goal is to make the right call for the codebase, not rubber-stamp every suggestion.
**For each finding, verify:**
| Check | Question |
|-------|----------|
| **Code** | Does the concern actually apply to this specific code? |
| **Tests** | Are there existing tests that already cover this case? |
| **Usage** | How is this code used in practice? Does the concern matter? |
| **Compatibility** | Would the suggested change break anything? |
| **Prior Decisions** | Was this intentional? Is there a documented reason? |
| **Cost vs Benefit** | Is the fix worth the effort and risk? |
**Assess each finding:**
| Assessment | Meaning |
|------------|---------|
| **Clear & Correct** | Valid concern, well-reasoned, applies here |
| **Unclear** | Ambiguous or missing context |
| **Likely Incorrect** | Agent misunderstands code, context, or requirements |
| **YAGNI** | Over-engineering, premature abstraction, no clear benefit |
| **Duplicate** | Already covered by another finding (merge into existing) |
**IMPORTANT: ALL findings become todos.** Never drop agent feedback - include the pressure test assessment IN each todo so `/triage` can use it.
Each todo will include:
- The assessment (Clear & Correct / Unclear / Likely Incorrect / YAGNI)
- The verification results (what was checked)
- Technical justification (why valid, or why you think it should be skipped)
- Recommended action for triage (Fix now / Clarify / Push back / Skip)
**Provide technical justification for all assessments:**
- Don't just label - explain WHY with specific reasoning
- Reference codebase constraints, requirements, or trade-offs
- Example: "This abstraction would be YAGNI - we only have one implementation and no plans for variants. Adding it now increases complexity without clear benefit."
The human reviews during `/triage` and makes the final call.
</critical_evaluation>
#### Step 3: Create Todo Files Using file-todos Skill
<critical_instruction> Use the file-todos skill to create todo files for ALL findings immediately. Do NOT present findings one-by-one asking for user approval. Create all todo files in parallel using the skill, then summarize results to user. </critical_instruction>
@@ -224,7 +270,7 @@ Remove duplicates, prioritize by severity and impact.
- Create todo files directly using Write tool
- All findings in parallel for speed
- Use standard template from `.claude/skills/file-todos/assets/todo-template.md`
- Invoke `Skill: "compound-engineering:file-todos"` and read the template from its assets directory
- Follow naming convention: `{issue_id}-pending-{priority}-{description}.md`
**Option B: Sub-Agents in Parallel (Recommended for Scale)** For large PRs with 15+ findings, use sub-agents to create finding files in parallel:
@@ -266,13 +312,13 @@ Sub-agents can:
2. Use file-todos skill for structured todo management:
```bash
skill: file-todos
```
Skill: "compound-engineering:file-todos"
```
The skill provides:
- Template location: `.claude/skills/file-todos/assets/todo-template.md`
- Template at `./assets/todo-template.md` (relative to skill directory)
- Naming convention: `{issue_id}-{status}-{priority}-{description}.md`
- YAML frontmatter structure: status, priority, issue_id, tags, dependencies
- All required sections: Problem Statement, Findings, Solutions, etc.
@@ -292,7 +338,7 @@ Sub-agents can:
004-pending-p3-unused-parameter.md
```
5. Follow template structure from file-todos skill: `.claude/skills/file-todos/assets/todo-template.md`
5. Follow template structure from file-todos skill (read `./assets/todo-template.md` from skill directory)
**Todo File Structure (from template):**
@@ -300,6 +346,10 @@ Each todo must include:
- **YAML frontmatter**: status, priority, issue_id, tags, dependencies
- **Problem Statement**: What's broken/missing, why it matters
- **Assessment (Pressure Test)**: Verification results and engineering judgment
- Assessment: Clear & Correct / Unclear / YAGNI
- Verified: Code, Tests, Usage, Prior Decisions
- Technical Justification: Why this finding is valid (or why skipped)
- **Findings**: Discoveries from agents with evidence/location
- **Proposed Solutions**: 2-3 options, each with pros/cons/effort/risk
- **Recommended Action**: (Filled during triage, leave blank initially)
@@ -333,7 +383,7 @@ Examples:
**Tagging:** Always add `code-review` tag, plus: `security`, `performance`, `architecture`, `rails`, `quality`, etc.
#### Step 3: Summary Report
#### Step 4: Summary Report
After creating all todo files, present comprehensive summary:
@@ -367,13 +417,27 @@ After creating all todo files, present comprehensive summary:
### Review Agents Used:
- kieran-rails-reviewer
- kieran-python-reviewer
- security-sentinel
- performance-oracle
- architecture-strategist
- agent-native-reviewer
- [other agents]
### Assessment Summary (Pressure Test Results):
All agent findings were pressure tested and included in todos:
| Assessment | Count | Description |
|------------|-------|-------------|
| **Clear & Correct** | {X} | Valid concerns, recommend fixing |
| **Unclear** | {X} | Need clarification before implementing |
| **Likely Incorrect** | {X} | May misunderstand context - review during triage |
| **YAGNI** | {X} | May be over-engineering - review during triage |
| **Duplicate** | {X} | Merged into other findings |
**Note:** All assessments are included in the todo files. Human judgment during `/triage` makes the final call on whether to accept, clarify, or reject each item.
### Next Steps:
1. **Address P1 Findings**: CRITICAL - must be fixed before merge

View File

@@ -44,6 +44,7 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
**Required sections:**
- **Problem Statement** - What is broken, missing, or needs improvement?
- **Assessment (Pressure Test)** - For code review findings: verification results and engineering judgment
- **Findings** - Investigation results, root cause, key discoveries
- **Proposed Solutions** - Multiple options with pros/cons, effort, risk
- **Recommended Action** - Clear plan (filled during triage)
@@ -55,6 +56,12 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
- **Resources** - Links to errors, tests, PRs, documentation
- **Notes** - Additional context or decisions
**Assessment section fields (for code review findings):**
- Assessment: Clear & Correct | Unclear | Likely Incorrect | YAGNI
- Recommended Action: Fix now | Clarify | Push back | Skip
- Verified: Code, Tests, Usage, Prior Decisions (Yes/No with details)
- Technical Justification: Why this finding is valid or should be skipped
**YAML frontmatter fields:**
```yaml
---

View File

@@ -19,6 +19,22 @@ What is broken, missing, or needs improvement? Provide clear context about why t
- Email service is missing proper error handling for rate-limit scenarios
- Documentation doesn't cover the new authentication flow
## Assessment (Pressure Test)
*(For findings from code review or automated agents)*
| Criterion | Result |
|-----------|--------|
| **Assessment** | Clear & Correct / Unclear / Likely Incorrect / YAGNI |
| **Recommended Action** | Fix now / Clarify / Push back / Skip |
| **Verified Code?** | Yes/No - [what was checked] |
| **Verified Tests?** | Yes/No - [existing coverage] |
| **Verified Usage?** | Yes/No - [how code is used] |
| **Prior Decisions?** | Yes/No - [any intentional design] |
**Technical Justification:**
[If pushing back or marking YAGNI, provide specific technical reasoning. Reference codebase constraints, requirements, or trade-offs.]
## Findings
Investigation results, root cause analysis, and key discoveries.