feat(cli-readiness-reviewer): add conditional review persona for CLI agent readiness (#471)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trevin Chow
2026-03-31 19:19:54 -07:00
committed by GitHub
parent 01ce065e0c
commit c56c7667df
6 changed files with 312 additions and 3 deletions

View File

@@ -0,0 +1,65 @@
---
date: 2026-03-30
topic: cli-readiness-review-persona
---
# CLI Agent-Readiness Review Persona in ce:review
## Problem Frame
The `cli-agent-readiness-reviewer` agent exists as a standalone deep-audit tool, but developers only benefit from it if they know it exists and invoke it explicitly. Most CLI code gets reviewed through `ce:review`, which has no CLI-specific lens. Agent-readiness issues (prose-only output, missing `--json`, interactive prompts without bypass, unbounded list output) ship undetected because no review persona covers them.
Adding CLI readiness as a conditional persona in ce:review makes this expertise automatic -- the developer runs their normal review and gets CLI agent-readiness findings alongside security, performance, and other concerns.
## Requirements
**Persona Selection**
- R1. ce:review's orchestrator selects the CLI readiness persona based on diff analysis (same pattern as security-reviewer, performance-reviewer, etc.) -- not always-on
- R2. Activation signals: diff touches CLI command definitions, argument parsing, CLI framework usage, or command handler implementations. The orchestrator uses judgment (not keyword matching), consistent with how all other conditional personas are activated
- R3. Non-overlapping scope with agent-native-reviewer: CLI readiness evaluates CLI command structure and agent-friendliness; agent-native evaluates UI/agent tool parity. Both may activate on the same diff if it touches both CLI and UI code -- their findings address different concerns. Overlap is possible and handled during synthesis rather than prevented mechanically
**Persona Behavior**
- R4. Once dispatched, the persona self-scopes: identifies the framework, detects changed commands from the diff, and evaluates against the 7 principles from the standalone `cli-agent-readiness-reviewer` agent (used as reference material, not dispatched directly)
- R5. The persona returns findings in ce:review's standard JSON findings schema (same as all other conditional personas). For design-level findings that span multiple files or concern missing capabilities, use the most relevant command handler file as the canonical location
- R6. Severity mapping: Blocker -> P1, Friction -> P2, Optimization -> P3. The severity ceiling is P1 -- CLI readiness issues make the CLI harder for agents to use, they do not crash or corrupt
- R7. Autofix class: all findings use autofix_class `manual` or `advisory` with owner `human`. CLI readiness findings are design decisions (JSON schema design, flag semantics, error message content) that should not be auto-applied
- R8. Framework-idiomatic recommendations: findings reference the specific framework's patterns (e.g., "add `@click.option('--json', ...)` " for Click, not generic "add a --json flag")
**Integration**
- R9. Create a new lightweight persona agent file in `agents/review/` that distills the 7 principles into a code-review-oriented persona producing structured JSON findings. Add it to `ce-review/references/persona-catalog.md` in the cross-cutting conditional section with activation description and severity guidance
- R10. The existing standalone `cli-agent-readiness-reviewer` agent stays unchanged -- it remains available for direct invocation and whole-CLI audits. The new persona references the same principles but is optimized for ce:review's dispatch pattern and output format
## Success Criteria
- A ce:review run on a PR that modifies CLI command handlers includes CLI readiness findings in the review report without the user asking
- A ce:review run on a PR that only modifies React components or Rails views does not dispatch the CLI readiness persona
- Findings use framework-specific language matching the CLI's detected framework
- All findings have severity P1, P2, or P3 (never P0) and autofix_class `manual` or `advisory`
## Scope Boundaries
- This does not modify the standalone `cli-agent-readiness-reviewer` agent
- This does not add CLI awareness to ce:brainstorm or ce:plan (deferred -- ce:review alone covers the highest-value case)
- This does not introduce autofix for CLI readiness findings
## Key Decisions
- **New persona agent file**: A lightweight agent in `agents/review/` that distills the standalone agent's 7 principles into structured JSON findings. This matches how every other conditional persona works (security-reviewer, performance-reviewer, etc. are all separate agent files). The standalone agent's narrative report format doesn't match ce:review's JSON findings schema, and prompt surgery at dispatch time would be fragile.
- **Conditional, not always-on**: Follows the existing pattern where the orchestrator selects personas based on diff content. The persona never runs on non-CLI diffs.
- **Persona self-scopes**: The persona does its own framework detection and subcommand identification after dispatch. ce:review's orchestrator only decides whether to dispatch, not what framework is in use.
- **No autofix**: All findings route to human review. CLI readiness issues require design judgment.
- **Severity ceiling is P1**: CLI readiness issues don't crash the software -- they make it harder for agents to use. The highest reasonable severity is P1 (should fix), not P0 (must fix before merge).
## Outstanding Questions
### Deferred to Planning
- [Affects R9][Needs research] How much of the standalone agent's content should the new persona include directly vs. reference? The standalone agent is 24K+ (the largest review agent) -- the persona should be much smaller, distilling the principles into code-review-oriented checks rather than reproducing the full Framework Idioms Reference.
- [Affects R4][Needs research] Should the persona evaluate all 7 principles on every dispatch, or should it prioritize principles by command type (as the standalone agent does) and cap findings to avoid flooding the review with low-signal items?
## Next Steps
-> `/ce:plan` for structured implementation planning

View File

@@ -0,0 +1,172 @@
---
title: "feat: Add CLI agent-readiness conditional persona to ce:review"
type: feat
status: active
date: 2026-03-30
origin: docs/brainstorms/2026-03-30-cli-readiness-review-persona-requirements.md
---
# Add CLI Agent-Readiness Conditional Persona to ce:review
## Overview
Create a lightweight review persona that evaluates CLI code for agent readiness during ce:review. The persona distills the standalone `cli-agent-readiness-reviewer` agent's 7 principles into a compact, diff-focused reviewer that produces structured JSON findings -- matching the pattern of every other conditional persona (security-reviewer, performance-reviewer, etc.).
## Problem Frame
The `cli-agent-readiness-reviewer` agent exists but only fires when someone knows to invoke it. CLI code that passes through ce:review gets no agent-readiness feedback. Adding a conditional persona makes this automatic. (see origin: docs/brainstorms/2026-03-30-cli-readiness-review-persona-requirements.md)
## Requirements Trace
- R1. Conditional selection by orchestrator based on diff analysis
- R2. Activation on CLI command definitions, argument parsing, CLI framework usage
- R3. Non-overlapping scope with agent-native-reviewer
- R4. Self-scoping: framework detection and command identification from diff
- R5. Standard JSON findings schema output
- R6. Severity mapping: Blocker->P1, Friction->P2, Optimization->P3 (never P0 -- CLI readiness issues don't crash or corrupt)
- R7. Autofix class: `manual` or `advisory` with owner `human`
- R8. Framework-idiomatic recommendations in suggested_fix
- R9. New persona agent file + persona catalog entry
- R10. Standalone agent unchanged
## Scope Boundaries
- Does not modify the standalone `cli-agent-readiness-reviewer` agent
- Does not add CLI awareness to ce:brainstorm or ce:plan
- Does not introduce autofix for CLI readiness findings
## Context & Research
### Relevant Code and Patterns
- Persona agent pattern: `plugins/compound-engineering/agents/review/security-reviewer.md` (3.4 KB), `performance-reviewer.md` (3.0 KB) -- exact structure to follow
- Persona catalog: `plugins/compound-engineering/skills/ce-review/references/persona-catalog.md` -- cross-cutting conditional section
- Subagent template: `plugins/compound-engineering/skills/ce-review/references/subagent-template.md` -- provides output schema, scope rules, PR context (persona does not need to include these)
- Standalone agent: `plugins/compound-engineering/agents/review/cli-agent-readiness-reviewer.md` (24.3 KB) -- source of the 7 principles to distill
- Agent-native-reviewer: `plugins/compound-engineering/agents/review/agent-native-reviewer.md` -- non-overlapping domain reference
### Institutional Learnings
- Conditional personas are 3.0-5.7 KB with a fixed structure: frontmatter, identity paragraph, hunting patterns, confidence calibration, suppress list, output format
- The subagent template injects the findings schema, scope rules, and PR context -- the persona file only needs domain-specific content
- Activation is orchestrator judgment (not keyword matching) -- the catalog describes the conceptual domain
## Key Technical Decisions
- **Distill, don't reproduce**: The 7 principles become ~8 hunting pattern bullets. No Framework Idioms Reference in the persona -- the model uses its general knowledge of detected frameworks for `suggested_fix` specificity. Keeps the persona under 5 KB. (see origin: Key Decisions -- "New persona agent file")
- **All 7 principles, weighted by command type**: Evaluate all principles on every dispatch, but include a condensed command-type priority table so the persona weights findings appropriately (e.g., structured output matters most for read/query commands, idempotency matters most for mutating commands). Cap at ~5-7 findings to avoid flooding. (Resolves deferred question from origin)
- **Severity ceiling is P1**: CLI readiness issues never reach P0. Blocker->P1, Friction->P2, Optimization->P3. (see origin: Key Decisions)
- **No autofix**: All findings use `manual` or `advisory` autofix_class with `human` owner. CLI readiness findings require design judgment. (see origin: Key Decisions)
- **Framework detection as a behavior instruction**: Rather than embedding framework-specific patterns, instruct the persona to "detect the CLI framework from imports in the diff and provide framework-idiomatic recommendations in suggested_fix." This keeps the file small while satisfying R8.
## Open Questions
### Resolved During Planning
- **How much content from the standalone agent?** Distill the 7 principles into hunting pattern bullets (~1 sentence each). Include a condensed command-type priority table. No Framework Idioms Reference, no step-by-step methodology, no examples section. Target ~4 KB.
- **All principles or prioritize?** All 7, weighted by command type. The persona detects command types from the diff and adjusts which principles get the most attention. Cap at 5-7 findings per review.
### Deferred to Implementation
- Exact wording of hunting pattern bullets -- will be refined when writing the agent file, using the standalone agent's principle descriptions as source material
## Implementation Units
- [ ] **Unit 1: Create the persona agent file**
**Goal:** Create `cli-readiness-reviewer.md` in the review agents directory, following the exact structure of existing conditional personas.
**Requirements:** R4, R5, R6, R7, R8
**Dependencies:** None
**Files:**
- Create: `plugins/compound-engineering/agents/review/cli-readiness-reviewer.md`
**Approach:**
- Follow the exact structure of `security-reviewer.md` and `performance-reviewer.md`: frontmatter, identity paragraph, hunting patterns, confidence calibration, suppress list, output format
- Frontmatter: `name: cli-readiness-reviewer`, description in the standard conditional persona format, `model: inherit`, `tools: Read, Grep, Glob, Bash`, `color: blue`
- Identity paragraph: establishes the persona's lens -- evaluating CLI code for how well it serves autonomous agents, not just human users
- "What you're hunting for" section: distill the 7 principles into ~8 bullets. Each bullet names the issue pattern and why it matters for agents. Include a condensed command-type priority note
- "Confidence calibration": high (0.80+) for issues directly visible in the diff (missing --json flag, prompt without bypass); moderate (0.60-0.79) for issues that depend on context beyond the diff (whether other commands already have structured output); low (<0.60) suppress
- "What you don't flag": agent-native parity concerns (that's agent-native-reviewer's domain), non-CLI code, framework choice itself, test files, documentation-only changes
- "Output format": standard JSON template with severity capped at P1, autofix_class restricted to `manual`/`advisory`, owner always `human`
- Include severity mapping guidance: Blocker->P1, Friction->P2, Optimization->P3
- Include framework detection instruction: "Detect the CLI framework from imports in the diff. Reference framework-idiomatic patterns in suggested_fix (e.g., Click decorators, Cobra persistent flags, clap derive macros)."
**Patterns to follow:**
- `plugins/compound-engineering/agents/review/security-reviewer.md` -- structure, sections, size
- `plugins/compound-engineering/agents/review/performance-reviewer.md` -- structure, brevity
- `plugins/compound-engineering/agents/review/cli-agent-readiness-reviewer.md` -- source of the 7 principles to distill (Principles 1-7, lines 94-252)
**Test scenarios:**
- Happy path: persona file parses valid YAML frontmatter with all required fields (name, description, model, tools, color)
- Happy path: persona content follows the 6-section structure (identity, hunting patterns, calibration, suppress, output format)
- Edge case: persona file size is within the 3-5.7 KB range of existing personas (not bloated with framework reference material)
**Verification:**
- File exists at the expected path with valid frontmatter
- File follows the exact 6-section structure of existing conditional personas
- File size is under 6 KB
- All 7 CLI readiness principles are represented in hunting patterns
- Severity guidance caps at P1
- Autofix class restricted to manual/advisory
- No Framework Idioms Reference reproduced from the standalone agent
---
- [ ] **Unit 2: Add persona to the catalog**
**Goal:** Register the new persona in the ce:review persona catalog so the orchestrator knows when to dispatch it.
**Requirements:** R1, R2, R3, R9
**Dependencies:** Unit 1
**Files:**
- Modify: `plugins/compound-engineering/skills/ce-review/references/persona-catalog.md`
- Modify: `plugins/compound-engineering/README.md`
**Approach:**
- Add a row in the cross-cutting conditional personas table
- Persona name: `cli-readiness`
- Agent reference: `compound-engineering:review:cli-readiness-reviewer`
- Activation: "CLI command definitions, argument parsing, CLI framework usage, command handler implementations"
- Use domain description style (not framework names) consistent with other conditional personas
- Place after the existing conditional personas, before the stack-specific section
- Update the persona catalog section header from "Conditional (7 personas)" to "Conditional (8 personas)"
- Update the total persona count from 16 to 17 in persona-catalog.md header and ce-review SKILL.md
- Add cli-readiness-reviewer to the Review agents table in `plugins/compound-engineering/README.md` and verify the agent count
**Patterns to follow:**
- Existing conditional persona entries in `persona-catalog.md` (security, performance, api-contract, etc.)
**Test scenarios:**
- Happy path: `bun test` passes (no frontmatter or parsing regressions)
- Happy path: catalog entry follows the same column format as other conditional personas
- Edge case: activation description uses domain language, not specific framework names
**Verification:**
- The catalog has a new row for cli-readiness in the cross-cutting conditional section
- The agent reference uses the fully-qualified namespace
- The activation description is domain-level, not keyword-level
## System-Wide Impact
- **Interaction graph:** ce:review's orchestrator reads the diff, decides to dispatch cli-readiness-reviewer alongside other conditional personas. Findings flow through the standard merge/dedup pipeline (Stage 5) into the review report
- **API surface parity:** agent-native-reviewer covers UI/agent parity; cli-readiness-reviewer covers CLI agent-friendliness. Both may activate on the same diff -- their findings are complementary and handled by ce:review's existing dedup fingerprinting
- **Unchanged invariants:** The standalone `cli-agent-readiness-reviewer` agent is untouched. Direct invocations continue to work exactly as before
## Risks & Dependencies
| Risk | Mitigation |
|------|------------|
| Persona too large if principles aren't distilled enough | Target 4 KB, use security-reviewer as size benchmark. If over 6 KB, trim framework guidance |
| Persona findings flood the review with low-signal items | Cap at 5-7 findings via confidence calibration. Optimization-level items get P3 severity (user's discretion) |
## Sources & References
- **Origin document:** [docs/brainstorms/2026-03-30-cli-readiness-review-persona-requirements.md](docs/brainstorms/2026-03-30-cli-readiness-review-persona-requirements.md)
- Related code: `plugins/compound-engineering/agents/review/security-reviewer.md`, `performance-reviewer.md`
- Related code: `plugins/compound-engineering/agents/review/cli-agent-readiness-reviewer.md` (source of 7 principles)
- Related code: `plugins/compound-engineering/skills/ce-review/references/persona-catalog.md`