diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index cb4e5a6..0d59a61 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -11,8 +11,8 @@
"plugins": [
{
"name": "compound-engineering",
- "description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 27 specialized agents, 24 commands, and 15 skills.",
- "version": "2.29.0",
+ "description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 27 specialized agents, 25 commands, and 15 skills.",
+ "version": "2.30.0",
"author": {
"name": "Kieran Klaassen",
"url": "https://github.com/kieranklaassen",
diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json
index 5e3b99b..840391d 100644
--- a/plugins/compound-engineering/.claude-plugin/plugin.json
+++ b/plugins/compound-engineering/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
{
"name": "compound-engineering",
- "version": "2.29.0",
- "description": "AI-powered development tools. 27 agents, 24 commands, 15 skills, 1 MCP server for code review, research, design, and workflow automation.",
+ "version": "2.30.0",
+ "description": "AI-powered development tools. 27 agents, 25 commands, 15 skills, 1 MCP server for code review, research, design, and workflow automation.",
"author": {
"name": "Kieran Klaassen",
"email": "kieran@every.to",
diff --git a/plugins/compound-engineering/CHANGELOG.md b/plugins/compound-engineering/CHANGELOG.md
index 2827afd..cb9bd01 100644
--- a/plugins/compound-engineering/CHANGELOG.md
+++ b/plugins/compound-engineering/CHANGELOG.md
@@ -5,6 +5,24 @@ All notable changes to the compound-engineering plugin will be documented in thi
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.30.0] - 2026-02-03
+
+### Added
+
+- **`/pr-comments-to-todos` command** - Fetch PR review comments and convert them into todo files for triage
+ - Fetches all comments from a GitHub PR using `gh` CLI
+ - Converts each actionable comment into a structured todo file following the file-todos skill format
+ - Assigns priority levels (P1/P2/P3) based on comment severity
+ - Creates todo files compatible with `/triage` command for approval workflow
+ - Skips non-actionable comments (LGTM, resolved threads, etc.)
+ - Supports PR number, GitHub URL, or "current" for current branch
+
+### Summary
+
+- 27 agents, 25 commands, 15 skills, 1 MCP server
+
+---
+
## [2.29.0] - 2026-01-26
### Changed
diff --git a/plugins/compound-engineering/commands/pr-comments-to-todos.md b/plugins/compound-engineering/commands/pr-comments-to-todos.md
new file mode 100644
index 0000000..cfda3d6
--- /dev/null
+++ b/plugins/compound-engineering/commands/pr-comments-to-todos.md
@@ -0,0 +1,334 @@
+---
+name: pr-comments-to-todos
+description: Fetch PR comments and convert them into todo files for triage
+argument-hint: "[PR number, GitHub URL, or 'current' for current branch PR]"
+---
+
+# PR Comments to Todos
+
+Convert GitHub PR review comments into structured todo files compatible with `/triage`.
+
+Fetch all review comments from a PR and create individual todo files in the `todos/` directory, following the file-todos skill format.
+
+## Review Target
+
+ #$ARGUMENTS
+
+## Workflow
+
+### 1. Identify PR and Fetch Comments
+
+
+
+- [ ] Determine the PR to process:
+ - If numeric: use as PR number directly
+ - If GitHub URL: extract PR number from URL
+ - If "current" or empty: detect from current branch with `gh pr status`
+- [ ] Fetch PR metadata: `gh pr view PR_NUMBER --json title,body,url,author,headRefName`
+- [ ] Fetch all review comments: `gh api repos/{owner}/{repo}/pulls/{PR_NUMBER}/comments`
+- [ ] Fetch review thread comments: `gh pr view PR_NUMBER --json reviews,reviewDecision`
+- [ ] Group comments by file/thread for context
+
+
+
+### 2. Pressure Test Each Comment
+
+
+
+**IMPORTANT: Treat reviewer comments as suggestions, not orders.**
+
+Before creating a todo, apply engineering judgment to each comment. Not all feedback is equally valid - your job is to make the right call for the codebase, not just please the reviewer.
+
+#### Step 2a: Verify Before Accepting
+
+For each comment, verify:
+- [ ] **Check the code**: Does the concern actually apply to this code?
+- [ ] **Check tests**: Are there existing tests that cover this case?
+- [ ] **Check usage**: How is this code actually used? Does the concern matter in practice?
+- [ ] **Check compatibility**: Would the suggested change break anything?
+- [ ] **Check prior decisions**: Was this intentional? Is there a reason it's done this way?
+
+#### Step 2b: Assess Each Comment
+
+Assign an assessment to each comment:
+
+| Assessment | Meaning |
+|------------|---------|
+| **Clear & Correct** | Valid concern, well-reasoned, applies to this code |
+| **Unclear** | Ambiguous, missing context, or doesn't specify what to change |
+| **Likely Incorrect** | Misunderstands the code, context, or requirements |
+| **YAGNI** | Over-engineering, premature abstraction, no clear benefit |
+
+#### Step 2c: Include Assessment in Todo
+
+**IMPORTANT: ALL comments become todos.** Never drop feedback - include the pressure test assessment IN the todo so `/triage` can use it to decide.
+
+For each comment, the todo will include:
+- The assessment (Clear & Correct / Unclear / Likely Incorrect / YAGNI)
+- The verification results (what was checked)
+- Technical justification (why valid, or why you think it should be skipped)
+- Recommended action for triage (Fix now / Clarify / Push back / Skip)
+
+The human reviews during `/triage` and makes the final call.
+
+
+
+### 3. Categorize All Comments
+
+
+
+For ALL comments (regardless of assessment), determine:
+
+**Severity (Priority):**
+- 🔴 **P1 (Critical)**: Security issues, data loss risks, breaking changes, blocking bugs
+- 🟡 **P2 (Important)**: Performance issues, architectural concerns, significant code quality
+- 🔵 **P3 (Nice-to-have)**: Style suggestions, minor improvements, documentation
+
+**Category Tags:**
+- `security` - Security vulnerabilities or concerns
+- `performance` - Performance issues or optimizations
+- `architecture` - Design or structural concerns
+- `bug` - Functional bugs or edge cases
+- `quality` - Code quality, readability, maintainability
+- `testing` - Test coverage or test quality
+- `documentation` - Missing or unclear documentation
+- `style` - Code style or formatting
+- `needs-clarification` - Comment requires clarification before implementing
+- `pushback-candidate` - Human should review before accepting
+
+**Skip these (don't create todos):**
+- Simple acknowledgments ("LGTM", "Looks good")
+- Questions that were answered inline
+- Already resolved threads
+
+**Note:** Comments assessed as YAGNI or Likely Incorrect still become todos with that assessment included. The human decides during `/triage` whether to accept or reject.
+
+
+
+### 4. Create Todo Files Using file-todos Skill
+
+Create todo files for ALL actionable comments immediately. Use the file-todos skill structure and naming convention.
+
+#### Determine Next Issue ID
+
+```bash
+# Find the highest existing issue ID
+ls todos/ 2>/dev/null | grep -o '^[0-9]\+' | sort -n | tail -1 | awk '{printf "%03d", $1+1}'
+# If no todos exist, start with 001
+```
+
+#### File Naming Convention
+
+```
+{issue_id}-pending-{priority}-{brief-description}.md
+```
+
+Examples:
+```
+001-pending-p1-sql-injection-vulnerability.md
+002-pending-p2-missing-error-handling.md
+003-pending-p3-rename-variable-for-clarity.md
+```
+
+#### Todo File Structure
+
+For each comment, create a file with this structure:
+
+```yaml
+---
+status: pending
+priority: p1 # or p2, p3 based on severity
+issue_id: "001"
+tags: [code-review, pr-feedback, {category}]
+dependencies: []
+---
+```
+
+```markdown
+# [Brief Title from Comment]
+
+## Problem Statement
+
+[Summarize the reviewer's concern - what is wrong or needs improvement]
+
+**PR Context:**
+- PR: #{PR_NUMBER} - {PR_TITLE}
+- File: {file_path}:{line_number}
+- Reviewer: @{reviewer_username}
+
+## Assessment (Pressure Test)
+
+| Criterion | Result |
+|-----------|--------|
+| **Assessment** | Clear & Correct / Unclear / Likely Incorrect / YAGNI |
+| **Recommended Action** | Fix now / Clarify / Push back / Skip |
+| **Verified Code?** | Yes/No - [what was checked] |
+| **Verified Tests?** | Yes/No - [existing coverage] |
+| **Verified Usage?** | Yes/No - [how code is used] |
+| **Prior Decisions?** | Yes/No - [any intentional design] |
+
+**Technical Justification:**
+[If pushing back or marking YAGNI, provide specific technical reasoning. Reference codebase constraints, requirements, or trade-offs. Example: "This abstraction would be YAGNI - we only have one implementation and no plans for variants."]
+
+## Findings
+
+- **Original Comment:** "{exact reviewer comment}"
+- **Location:** `{file_path}:{line_number}`
+- **Code Context:**
+ ```{language}
+ {relevant code snippet}
+ ```
+- **Why This Matters:** [Impact if not addressed, or why it doesn't matter]
+
+## Proposed Solutions
+
+### Option 1: [Primary approach based on reviewer suggestion]
+
+**Approach:** [Describe the fix]
+
+**Pros:**
+- Addresses reviewer concern directly
+- [Other benefits]
+
+**Cons:**
+- [Any drawbacks]
+
+**Effort:** Small / Medium / Large
+
+**Risk:** Low / Medium / High
+
+---
+
+### Option 2: [Alternative if applicable]
+
+[Only include if there's a meaningful alternative approach]
+
+## Recommended Action
+
+*(To be filled during triage)*
+
+## Technical Details
+
+**Affected Files:**
+- `{file_path}:{line_number}` - {what needs changing}
+
+**Related Components:**
+- [Components affected by this change]
+
+## Resources
+
+- **PR:** #{PR_NUMBER}
+- **Comment Link:** {direct_link_to_comment}
+- **Reviewer:** @{reviewer_username}
+
+## Acceptance Criteria
+
+- [ ] Reviewer concern addressed
+- [ ] Tests pass
+- [ ] Code reviewed and approved
+- [ ] PR comment resolved
+
+## Work Log
+
+### {today's date} - Created from PR Review
+
+**By:** Claude Code
+
+**Actions:**
+- Extracted comment from PR #{PR_NUMBER} review
+- Created todo for triage
+
+**Learnings:**
+- Original reviewer context: {any additional context}
+```
+
+### 5. Parallel Todo Creation (For Multiple Comments)
+
+
+
+When processing PRs with many comments (5+), create todos in parallel for efficiency:
+
+1. Synthesize all comments into a categorized list
+2. Assign severity (P1/P2/P3) to each
+3. Launch parallel Write operations for all todos
+4. Each todo follows the file-todos skill template exactly
+
+
+
+### 6. Summary Report
+
+After creating all todo files, present:
+
+````markdown
+## ✅ PR Comments Converted to Todos
+
+**PR:** #{PR_NUMBER} - {PR_TITLE}
+**Branch:** {branch_name}
+**Total Comments Processed:** {X}
+
+### Created Todo Files:
+
+**🔴 P1 - Critical:**
+- `{id}-pending-p1-{desc}.md` - {summary}
+
+**🟡 P2 - Important:**
+- `{id}-pending-p2-{desc}.md` - {summary}
+
+**🔵 P3 - Nice-to-Have:**
+- `{id}-pending-p3-{desc}.md` - {summary}
+
+### Skipped (Not Actionable):
+- {count} comments skipped (LGTM, questions answered, resolved threads)
+
+### Assessment Summary:
+
+All comments were pressure tested and included in todos:
+
+| Assessment | Count | Description |
+|------------|-------|-------------|
+| **Clear & Correct** | {X} | Valid concerns, recommend fixing |
+| **Unclear** | {X} | Need clarification before implementing |
+| **Likely Incorrect** | {X} | May misunderstand context - review during triage |
+| **YAGNI** | {X} | May be over-engineering - review during triage |
+
+**Note:** All assessments are included in the todo files. Human judgment during `/triage` makes the final call on whether to accept, clarify, or reject each item.
+
+### Next Steps:
+
+1. **Triage the todos:**
+ ```bash
+ /triage
+ ```
+ Review each todo and approve (pending → ready) or skip
+
+2. **Work on approved items:**
+ ```bash
+ /resolve_todo_parallel
+ ```
+
+3. **After fixes, resolve PR comments:**
+ ```bash
+ bin/resolve-pr-thread THREAD_ID
+ ```
+````
+
+## Important Notes
+
+
+- Ensure `todos/` directory exists before creating files
+- Each todo must have unique issue_id (never reuse)
+- All todos start with `status: pending` for triage
+- Include `code-review` and `pr-feedback` tags on all todos
+- Preserve exact reviewer quotes in Findings section
+- Link back to original PR and comment in Resources
+
+
+## Integration with /triage
+
+The output of this command is designed to work seamlessly with `/triage`:
+
+1. **This command** creates `todos/*-pending-*.md` files
+2. **`/triage`** reviews each pending todo and:
+ - Approves → renames to `*-ready-*.md`
+ - Skips → deletes the todo file
+3. **`/resolve_todo_parallel`** works on approved (ready) todos
diff --git a/plugins/compound-engineering/commands/workflows/review.md b/plugins/compound-engineering/commands/workflows/review.md
index 282c9f0..fe14815 100644
--- a/plugins/compound-engineering/commands/workflows/review.md
+++ b/plugins/compound-engineering/commands/workflows/review.md
@@ -214,7 +214,53 @@ Remove duplicates, prioritize by severity and impact.
-#### Step 2: Create Todo Files Using file-todos Skill
+#### Step 2: Pressure Test Each Finding
+
+
+
+**IMPORTANT: Treat agent findings as suggestions, not mandates.**
+
+Not all findings are equally valid. Apply engineering judgment before creating todos. The goal is to make the right call for the codebase, not rubber-stamp every suggestion.
+
+**For each finding, verify:**
+
+| Check | Question |
+|-------|----------|
+| **Code** | Does the concern actually apply to this specific code? |
+| **Tests** | Are there existing tests that already cover this case? |
+| **Usage** | How is this code used in practice? Does the concern matter? |
+| **Compatibility** | Would the suggested change break anything? |
+| **Prior Decisions** | Was this intentional? Is there a documented reason? |
+| **Cost vs Benefit** | Is the fix worth the effort and risk? |
+
+**Assess each finding:**
+
+| Assessment | Meaning |
+|------------|---------|
+| **Clear & Correct** | Valid concern, well-reasoned, applies here |
+| **Unclear** | Ambiguous or missing context |
+| **Likely Incorrect** | Agent misunderstands code, context, or requirements |
+| **YAGNI** | Over-engineering, premature abstraction, no clear benefit |
+| **Duplicate** | Already covered by another finding (merge into existing) |
+
+**IMPORTANT: ALL findings become todos.** Never drop agent feedback - include the pressure test assessment IN each todo so `/triage` can use it.
+
+Each todo will include:
+- The assessment (Clear & Correct / Unclear / Likely Incorrect / YAGNI)
+- The verification results (what was checked)
+- Technical justification (why valid, or why you think it should be skipped)
+- Recommended action for triage (Fix now / Clarify / Push back / Skip)
+
+**Provide technical justification for all assessments:**
+- Don't just label - explain WHY with specific reasoning
+- Reference codebase constraints, requirements, or trade-offs
+- Example: "This abstraction would be YAGNI - we only have one implementation and no plans for variants. Adding it now increases complexity without clear benefit."
+
+The human reviews during `/triage` and makes the final call.
+
+
+
+#### Step 3: Create Todo Files Using file-todos Skill
Use the file-todos skill to create todo files for ALL findings immediately. Do NOT present findings one-by-one asking for user approval. Create all todo files in parallel using the skill, then summarize results to user.
@@ -224,7 +270,7 @@ Remove duplicates, prioritize by severity and impact.
- Create todo files directly using Write tool
- All findings in parallel for speed
-- Use standard template from `.claude/skills/file-todos/assets/todo-template.md`
+- Invoke `Skill: "compound-engineering:file-todos"` and read the template from its assets directory
- Follow naming convention: `{issue_id}-pending-{priority}-{description}.md`
**Option B: Sub-Agents in Parallel (Recommended for Scale)** For large PRs with 15+ findings, use sub-agents to create finding files in parallel:
@@ -266,13 +312,13 @@ Sub-agents can:
2. Use file-todos skill for structured todo management:
- ```bash
- skill: file-todos
+ ```
+ Skill: "compound-engineering:file-todos"
```
The skill provides:
- - Template location: `.claude/skills/file-todos/assets/todo-template.md`
+ - Template at `./assets/todo-template.md` (relative to skill directory)
- Naming convention: `{issue_id}-{status}-{priority}-{description}.md`
- YAML frontmatter structure: status, priority, issue_id, tags, dependencies
- All required sections: Problem Statement, Findings, Solutions, etc.
@@ -292,7 +338,7 @@ Sub-agents can:
004-pending-p3-unused-parameter.md
```
-5. Follow template structure from file-todos skill: `.claude/skills/file-todos/assets/todo-template.md`
+5. Follow template structure from file-todos skill (read `./assets/todo-template.md` from skill directory)
**Todo File Structure (from template):**
@@ -300,6 +346,10 @@ Each todo must include:
- **YAML frontmatter**: status, priority, issue_id, tags, dependencies
- **Problem Statement**: What's broken/missing, why it matters
+- **Assessment (Pressure Test)**: Verification results and engineering judgment
+ - Assessment: Clear & Correct / Unclear / YAGNI
+ - Verified: Code, Tests, Usage, Prior Decisions
+ - Technical Justification: Why this finding is valid (or why skipped)
- **Findings**: Discoveries from agents with evidence/location
- **Proposed Solutions**: 2-3 options, each with pros/cons/effort/risk
- **Recommended Action**: (Filled during triage, leave blank initially)
@@ -333,7 +383,7 @@ Examples:
**Tagging:** Always add `code-review` tag, plus: `security`, `performance`, `architecture`, `rails`, `quality`, etc.
-#### Step 3: Summary Report
+#### Step 4: Summary Report
After creating all todo files, present comprehensive summary:
@@ -367,13 +417,27 @@ After creating all todo files, present comprehensive summary:
### Review Agents Used:
-- kieran-rails-reviewer
+- kieran-python-reviewer
- security-sentinel
- performance-oracle
- architecture-strategist
- agent-native-reviewer
- [other agents]
+### Assessment Summary (Pressure Test Results):
+
+All agent findings were pressure tested and included in todos:
+
+| Assessment | Count | Description |
+|------------|-------|-------------|
+| **Clear & Correct** | {X} | Valid concerns, recommend fixing |
+| **Unclear** | {X} | Need clarification before implementing |
+| **Likely Incorrect** | {X} | May misunderstand context - review during triage |
+| **YAGNI** | {X} | May be over-engineering - review during triage |
+| **Duplicate** | {X} | Merged into other findings |
+
+**Note:** All assessments are included in the todo files. Human judgment during `/triage` makes the final call on whether to accept, clarify, or reject each item.
+
### Next Steps:
1. **Address P1 Findings**: CRITICAL - must be fixed before merge
diff --git a/plugins/compound-engineering/skills/file-todos/SKILL.md b/plugins/compound-engineering/skills/file-todos/SKILL.md
index 793dfdd..2faab35 100644
--- a/plugins/compound-engineering/skills/file-todos/SKILL.md
+++ b/plugins/compound-engineering/skills/file-todos/SKILL.md
@@ -44,6 +44,7 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
**Required sections:**
- **Problem Statement** - What is broken, missing, or needs improvement?
+- **Assessment (Pressure Test)** - For code review findings: verification results and engineering judgment
- **Findings** - Investigation results, root cause, key discoveries
- **Proposed Solutions** - Multiple options with pros/cons, effort, risk
- **Recommended Action** - Clear plan (filled during triage)
@@ -55,6 +56,12 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
- **Resources** - Links to errors, tests, PRs, documentation
- **Notes** - Additional context or decisions
+**Assessment section fields (for code review findings):**
+- Assessment: Clear & Correct | Unclear | Likely Incorrect | YAGNI
+- Recommended Action: Fix now | Clarify | Push back | Skip
+- Verified: Code, Tests, Usage, Prior Decisions (Yes/No with details)
+- Technical Justification: Why this finding is valid or should be skipped
+
**YAML frontmatter fields:**
```yaml
---
diff --git a/plugins/compound-engineering/skills/file-todos/assets/todo-template.md b/plugins/compound-engineering/skills/file-todos/assets/todo-template.md
index d241f2d..83baf9e 100644
--- a/plugins/compound-engineering/skills/file-todos/assets/todo-template.md
+++ b/plugins/compound-engineering/skills/file-todos/assets/todo-template.md
@@ -19,6 +19,22 @@ What is broken, missing, or needs improvement? Provide clear context about why t
- Email service is missing proper error handling for rate-limit scenarios
- Documentation doesn't cover the new authentication flow
+## Assessment (Pressure Test)
+
+*(For findings from code review or automated agents)*
+
+| Criterion | Result |
+|-----------|--------|
+| **Assessment** | Clear & Correct / Unclear / Likely Incorrect / YAGNI |
+| **Recommended Action** | Fix now / Clarify / Push back / Skip |
+| **Verified Code?** | Yes/No - [what was checked] |
+| **Verified Tests?** | Yes/No - [existing coverage] |
+| **Verified Usage?** | Yes/No - [how code is used] |
+| **Prior Decisions?** | Yes/No - [any intentional design] |
+
+**Technical Justification:**
+[If pushing back or marking YAGNI, provide specific technical reasoning. Reference codebase constraints, requirements, or trade-offs.]
+
## Findings
Investigation results, root cause analysis, and key discoveries.