Merge upstream origin/main (v2.60.0) with fork customizations preserved

Incorporates 78 upstream commits while preserving all local fork intent:
- Keep deleted: dhh-rails, kieran-rails, dspy-ruby, andrew-kane-gem-writer (FastAPI pivot)
- Merge both: ce-review (zip-agent + design-conformance wiring),
  kieran-python-reviewer (pipeline + FastAPI conventions),
  ce-brainstorm/ce-plan/ce-work (improvements + deploy wiring),
  todo-create (template refs + assessment block),
  best-practices-researcher (rename + FastAPI refs)
- Accept remote: 142 remote-only files, plugin.json, README.md
- Keep local: 71 local-only files (custom agents, skills, commands, voice)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
John Lamb
2026-03-31 12:28:53 -05:00
153 changed files with 12801 additions and 3761 deletions

View File

@@ -102,9 +102,10 @@
"_meta": {
"confidence_thresholds": {
"suppress": "Below 0.60 -- do not report. Finding is speculative noise.",
"flag": "0.60-0.69 -- include only when the persona's calibration says the issue is actionable at that confidence.",
"report": "0.70+ -- report with full confidence."
"suppress": "Below 0.60 -- do not report. Finding is speculative noise. Exception: P0 findings at 0.50+ may be reported.",
"flag": "0.60-0.69 -- include only when the issue is clearly actionable with concrete evidence.",
"confident": "0.70-0.84 -- real and important. Report with full evidence.",
"certain": "0.85-1.00 -- verifiable from the code alone. Report."
},
"severity_definitions": {
"P0": "Critical breakage, exploitable vulnerability, data loss/corruption. Must fix before merge.",
@@ -113,10 +114,10 @@
"P3": "Low-impact, narrow scope, minor improvement. User's discretion."
},
"autofix_classes": {
"safe_auto": "Local, deterministic code or test fix suitable for the in-skill fixer in autonomous mode.",
"gated_auto": "Concrete fix exists, but it changes behavior, permissions, contracts, or other sensitive areas that deserve explicit approval.",
"manual": "Actionable issue that should become residual work rather than an in-skill autofix.",
"advisory": "Informational or operational item that should be surfaced in the report only."
"safe_auto": "Local, deterministic code or test fix suitable for the in-skill fixer. Examples: extract duplicated helper, add missing nil check, fix off-by-one, add missing test, remove dead code. Do not default to advisory when a concrete safe fix exists.",
"gated_auto": "Concrete fix exists, but it changes behavior, permissions, contracts, or other sensitive areas that deserve explicit approval. Examples: add auth to unprotected endpoint, change API response shape.",
"manual": "Actionable issue that requires design decisions or cross-cutting changes. Examples: redesign data model, add pagination strategy, choose between architectural approaches.",
"advisory": "Informational or operational item that should be surfaced in the report only. Examples: design asymmetry the PR improves but does not fully resolve, residual risk notes, deployment considerations."
},
"owners": {
"review-fixer": "The in-skill fixer can own this when policy allows.",

View File

@@ -1,8 +1,8 @@
# Persona Catalog
13 reviewer personas organized in three tiers, plus CE-specific agents. The orchestrator uses this catalog to select which reviewers to spawn for each review.
21 reviewer personas organized into always-on, cross-cutting conditional, stack-specific conditional, and language/framework conditional layers, plus CE-specific agents. The orchestrator uses this catalog to select which reviewers to spawn for each review.
## Always-on (3 personas + 2 CE agents)
## Always-on (4 personas + 2 CE agents)
Spawned on every review regardless of diff content.
@@ -13,6 +13,7 @@ Spawned on every review regardless of diff content.
| `correctness` | `compound-engineering:review:correctness-reviewer` | Logic errors, edge cases, state bugs, error propagation, intent compliance |
| `testing` | `compound-engineering:review:testing-reviewer` | Coverage gaps, weak assertions, brittle tests, missing edge case tests |
| `maintainability` | `compound-engineering:review:maintainability-reviewer` | Coupling, complexity, naming, dead code, premature abstraction |
| `project-standards` | `compound-engineering:review:project-standards-reviewer` | CLAUDE.md and AGENTS.md compliance -- frontmatter, references, naming, cross-platform portability, tool selection |
**CE agents (unstructured output, synthesized separately):**
@@ -21,7 +22,7 @@ Spawned on every review regardless of diff content.
| `compound-engineering:review:agent-native-reviewer` | Verify new features are agent-accessible |
| `compound-engineering:research:learnings-researcher` | Search docs/solutions/ for past issues related to this PR's modules and patterns |
## Conditional (5 personas)
## Conditional (7 personas)
Spawned when the orchestrator identifies relevant patterns in the diff. The orchestrator reads the full diff and reasons about selection -- this is agent judgment, not keyword matching.
@@ -32,6 +33,20 @@ Spawned when the orchestrator identifies relevant patterns in the diff. The orch
| `api-contract` | `compound-engineering:review:api-contract-reviewer` | Route definitions, serializer/interface changes, event schemas, exported type signatures, API versioning |
| `data-migrations` | `compound-engineering:review:data-migrations-reviewer` | Migration files, schema changes, backfill scripts, data transformations |
| `reliability` | `compound-engineering:review:reliability-reviewer` | Error handling, retry logic, circuit breakers, timeouts, background jobs, async handlers, health checks |
| `adversarial` | `compound-engineering:review:adversarial-reviewer` | Diff has >=50 changed non-test, non-generated, non-lockfile lines, OR touches auth, payments, data mutations, external API integrations, or other high-risk domains |
| `previous-comments` | `compound-engineering:review:previous-comments-reviewer` | **PR-only.** Reviewing a PR that has existing review comments or review threads from prior review rounds. Skip entirely when no PR metadata was gathered in Stage 1. |
## Stack-Specific Conditional (5 personas)
These reviewers keep their original opinionated lens. They are additive with the cross-cutting personas above, not replacements for them.
| Persona | Agent | Select when diff touches... |
|---------|-------|---------------------------|
| `dhh-rails` | `compound-engineering:review:dhh-rails-reviewer` | Rails architecture, service objects, authentication/session choices, Hotwire-vs-SPA boundaries, or abstractions that may fight Rails conventions |
| `kieran-rails` | `compound-engineering:review:kieran-rails-reviewer` | Rails controllers, models, views, jobs, components, routes, or other application-layer Ruby code where clarity and conventions matter |
| `kieran-python` | `compound-engineering:review:kieran-python-reviewer` | Python modules, endpoints, services, scripts, or typed domain code |
| `kieran-typescript` | `compound-engineering:review:kieran-typescript-reviewer` | TypeScript components, services, hooks, utilities, or shared types |
| `julik-frontend-races` | `compound-engineering:review:julik-frontend-races-reviewer` | Stimulus/Turbo controllers, DOM event wiring, timers, async UI flows, animations, or frontend state transitions with race potential |
## Language & Framework Conditional (5 personas)
@@ -47,7 +62,7 @@ Spawned when the orchestrator identifies language or framework-specific patterns
## CE Conditional Agents (design, migration & external review)
These CE-native agents provide specialized analysis beyond what the persona agents cover.
These CE-native agents provide specialized analysis beyond what the persona agents cover. Spawn them when the diff includes database migrations, schema.rb, data backfills, design documents, or when the PR originates from specific platforms.
| Agent | Focus | Select when... |
|-------|-------|----------------|
@@ -58,8 +73,9 @@ These CE-native agents provide specialized analysis beyond what the persona agen
## Selection rules
1. **Always spawn all 3 always-on personas** plus the 2 CE always-on agents.
2. **For each conditional persona**, the orchestrator reads the diff and decides whether the persona's domain is relevant. This is a judgment call, not a keyword match.
3. **For language/framework conditional personas**, spawn when the diff contains files matching the persona's language or framework domain. Multiple language personas can be active simultaneously (e.g., both `python-quality` and `typescript-quality` if the diff touches both).
4. **For CE conditional agents**, check each agent's selection criteria. `design-conformance-reviewer`: spawn when the repo contains design docs or an active plan matching the branch. `schema-drift-detector` and `deployment-verification-agent`: spawn when the diff includes migration files (`db/migrate/*.rb`, `db/schema.rb`) or data backfill scripts. `zip-agent-validator`: spawn when the PR URL contains `git.zoominfo.com`.
5. **Announce the team** before spawning with a one-line justification per conditional reviewer selected.
1. **Always spawn all 4 always-on personas** plus the 2 CE always-on agents.
2. **For each cross-cutting conditional persona**, the orchestrator reads the diff and decides whether the persona's domain is relevant. This is a judgment call, not a keyword match.
3. **For each stack-specific conditional persona**, use file types and changed patterns as a starting point, then decide whether the diff actually introduces meaningful work for that reviewer. Do not spawn language-specific reviewers just because one config or generated file happens to match the extension.
4. **For each language/framework conditional persona**, check whether the diff touches language or framework-specific patterns that warrant deeper domain expertise. These are additive with stack-specific personas, not replacements.
5. **For CE conditional agents**, spawn when applicable: migration files (`db/migrate/*.rb`, `db/schema.rb`) or data backfill scripts trigger schema-drift-detector and deployment-verification-agent; design documents or active plans trigger design-conformance-reviewer; PR URLs containing `git.zoominfo.com` trigger zip-agent-validator.
6. **Announce the team** before spawning with a one-line justification per conditional reviewer selected.

View File

@@ -0,0 +1,94 @@
#!/usr/bin/env bash
# Resolve the review base branch and compute the merge-base for ce:review.
# Handles fork-safe remote resolution, PR metadata, and multi-fallback detection.
#
# Usage: bash references/resolve-base.sh
# Output: BASE:<sha> on success, ERROR:<message> on failure.
#
# Detects the base branch from (in priority order):
# 1. PR metadata (base ref + base repo for fork safety)
# 2. origin/HEAD symbolic ref
# 3. gh repo view defaultBranchRef
# 4. Common branch names: main, master, develop, trunk
set -euo pipefail
REVIEW_BASE_BRANCH=""
PR_BASE_REPO=""
PR_BASE_REMOTE=""
BASE_REF=""
# Step 1: Try PR metadata (handles fork workflows)
if command -v gh >/dev/null 2>&1; then
PR_META=$(gh pr view --json baseRefName,url 2>/dev/null || true)
if [ -n "$PR_META" ]; then
REVIEW_BASE_BRANCH=$(echo "$PR_META" | jq -r '.baseRefName // empty' 2>/dev/null || true)
PR_BASE_REPO=$(echo "$PR_META" | jq -r '.url // empty' 2>/dev/null | sed -n 's#https://github.com/\([^/]*/[^/]*\)/pull/.*#\1#p' || true)
fi
fi
# Step 2: Fall back to origin/HEAD
if [ -z "$REVIEW_BASE_BRANCH" ]; then
REVIEW_BASE_BRANCH=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null | sed 's#^origin/##' || true)
fi
# Step 3: Fall back to gh repo view
if [ -z "$REVIEW_BASE_BRANCH" ] && command -v gh >/dev/null 2>&1; then
REVIEW_BASE_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null || true)
fi
# Step 4: Fall back to common branch names
if [ -z "$REVIEW_BASE_BRANCH" ]; then
for candidate in main master develop trunk; do
if git rev-parse --verify "origin/$candidate" >/dev/null 2>&1 || git rev-parse --verify "$candidate" >/dev/null 2>&1; then
REVIEW_BASE_BRANCH="$candidate"
break
fi
done
fi
# Resolve the base ref from the correct remote (fork-safe)
if [ -n "$REVIEW_BASE_BRANCH" ]; then
if [ -n "$PR_BASE_REPO" ]; then
PR_BASE_REMOTE=$(git remote -v | awk "index(\$2, \"github.com:$PR_BASE_REPO\") || index(\$2, \"github.com/$PR_BASE_REPO\") {print \$1; exit}")
if [ -n "$PR_BASE_REMOTE" ]; then
git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags "$PR_BASE_REMOTE" "$REVIEW_BASE_BRANCH:refs/remotes/$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" 2>/dev/null || git fetch --no-tags "$PR_BASE_REMOTE" "$REVIEW_BASE_BRANCH" 2>/dev/null || true
BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" 2>/dev/null || true)
fi
fi
if [ -z "$BASE_REF" ]; then
# Only try origin if it exists as a remote; otherwise skip to avoid
# confusing errors in fork setups where origin points at the user's fork.
if git remote get-url origin >/dev/null 2>&1; then
git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags origin "$REVIEW_BASE_BRANCH:refs/remotes/origin/$REVIEW_BASE_BRANCH" 2>/dev/null || git fetch --no-tags origin "$REVIEW_BASE_BRANCH" 2>/dev/null || true
BASE_REF=$(git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" 2>/dev/null || true)
fi
# Fall back to a bare local ref only if remote resolution failed
if [ -z "$BASE_REF" ]; then
BASE_REF=$(git rev-parse --verify "$REVIEW_BASE_BRANCH" 2>/dev/null || true)
fi
fi
fi
# Compute merge-base
if [ -n "$BASE_REF" ]; then
BASE=$(git merge-base HEAD "$BASE_REF" 2>/dev/null) || BASE=""
if [ -z "$BASE" ] && [ "$(git rev-parse --is-shallow-repository 2>/dev/null || echo false)" = "true" ]; then
if git remote get-url origin >/dev/null 2>&1; then
git fetch --no-tags --unshallow origin 2>/dev/null || true
BASE=$(git merge-base HEAD "$BASE_REF" 2>/dev/null) || BASE=""
fi
if [ -z "$BASE" ] && [ -n "$PR_BASE_REMOTE" ] && [ "$PR_BASE_REMOTE" != "origin" ]; then
git fetch --no-tags --unshallow "$PR_BASE_REMOTE" 2>/dev/null || true
BASE=$(git merge-base HEAD "$BASE_REF" 2>/dev/null) || BASE=""
fi
fi
else
BASE=""
fi
if [ -n "$BASE" ]; then
echo "BASE:$BASE"
else
echo "ERROR:Unable to resolve review base branch locally. Fetch the base branch and rerun, or provide a PR number so the review scope can be determined from PR metadata."
fi

View File

@@ -92,6 +92,15 @@ Use this **exact format** when presenting synthesized review findings. Findings
- Residual risks: No rate limiting on export endpoint
- Testing gaps: No test for concurrent export requests
### Zip Agent Validation
- Evaluated: 8 zip-agent comments
- Validated: 2 (appear as findings #3 and #6 above)
- Collapsed: 6
- `app/services/order_service.rb:45`: "Missing error handling" -- handled by ApplicationService base class rescue
- `app/controllers/api/orders_controller.rb:18`: "Unbounded query" -- pagination enforced by ApiController concern
- _(4 more collapsed for stylistic/formatting concerns)_
---
> **Verdict:** Ready with fixes
@@ -101,16 +110,37 @@ Use this **exact format** when presenting synthesized review findings. Findings
> **Fix order:** P0 auth bypass -> P1 memory/pagination -> P2 error handling if straightforward
```
## Anti-patterns
Do NOT produce output like this. The following is wrong:
```markdown
Findings
Sev: P1
File: foo.go:42
Issue: Some problem description
Reviewer(s): adversarial
Confidence: 0.85
Route: advisory -> human
────────────────────────────────────────
Sev: P2
File: bar.go:99
Issue: Another problem
```
This fails because: no pipe-delimited tables, no severity-grouped `###` headers, uses box-drawing horizontal rules, no numbered findings, no `## Code Review Results` title, and the verdict is not in a blockquote. Always use the table format from the example above.
## Formatting Rules
- **Pipe-delimited markdown tables** -- never ASCII box-drawing characters
- **Pipe-delimited markdown tables** for findings -- never ASCII box-drawing characters or per-finding horizontal-rule separators between entries (the report-level `---` before the verdict is still required)
- **Severity-grouped sections** -- `### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`. Omit empty severity levels.
- **Always include file:line location** for code review issues
- **Reviewer column** shows which persona(s) flagged the issue. Multiple reviewers = cross-reviewer agreement.
- **Confidence column** shows the finding's confidence score
- **Route column** shows the synthesized handling decision as ``<autofix_class> -> <owner>``.
- **Header includes** scope, intent, and reviewer team with per-conditional justifications
- **Mode line** -- include `interactive`, `autofix`, or `report-only`
- **Mode line** -- include `interactive`, `autofix`, `report-only`, or `headless`
- **Applied Fixes section** -- include only when a fix phase ran in this review invocation
- **Residual Actionable Work section** -- include only when unresolved actionable findings were handed off for later work
- **Pre-existing section** -- separate table, no confidence column (these are informational)
@@ -120,6 +150,19 @@ Use this **exact format** when presenting synthesized review findings. Findings
- **Deployment Notes section** -- key checklist items from deployment-verification-agent. Omit if the agent did not run.
- **Zip Agent Validation section** -- summary of zip-agent comment evaluation: total, validated (with cross-references to findings table), collapsed (with reasons). Omit if the agent did not run.
- **Coverage section** -- suppressed count, residual risks, testing gaps, failed reviewers
- **Zip Agent Validation section** -- summary of zip-agent comment evaluation: total, validated (with cross-references to findings table), collapsed (with reasons). Omit if the agent did not run.
- **Summary uses blockquotes** for verdict, reasoning, and fix order
- **Horizontal rule** (`---`) separates findings from verdict
- **`###` headers** for each section -- never plain text headers
## Headless Mode Format
In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The headless format is defined in the `### Headless output format` section of SKILL.md. Key differences from the interactive format:
- **No pipe-delimited tables.** Findings use `[severity][autofix_class -> owner] File: <file:line> -- <title>` line format with indented Why/Evidence/Suggested fix lines.
- **Findings grouped by autofix_class** (gated-auto, manual, advisory) instead of severity. Within each group, findings are sorted by severity.
- **Verdict in header** (top of output) instead of bottom, so programmatic callers get it first.
- **`Artifact:` line** in metadata header gives callers the path to the full run artifact.
- **`[needs-verification]` marker** on findings where `requires_verification: true`.
- **Evidence lines** included per finding.
- **Completion signal:** "Review complete" as the final line.

View File

@@ -22,18 +22,45 @@ Return ONLY valid JSON matching the findings schema below. No prose, no markdown
{schema}
Confidence rubric (0.0-1.0 scale):
- 0.00-0.29: Not confident / likely false positive. Do not report.
- 0.30-0.49: Somewhat confident. Do not report -- too speculative for actionable review.
- 0.50-0.59: Moderately confident. Real but uncertain. Do not report unless P0 severity.
- 0.60-0.69: Confident enough to flag. Include only when the issue is clearly actionable.
- 0.70-0.84: Highly confident. Real and important. Report with full evidence.
- 0.85-1.00: Certain. Verifiable from the code alone. Report.
Suppress threshold: 0.60. Do not emit findings below 0.60 confidence (except P0 at 0.50+).
False-positive categories to actively suppress:
- Pre-existing issues unrelated to this diff (mark pre_existing: true for unchanged code the diff does not interact with; if the diff makes it newly relevant, it is secondary, not pre-existing)
- Pedantic style nitpicks that a linter/formatter would catch
- Code that looks wrong but is intentional (check comments, commit messages, PR description for intent)
- Issues already handled elsewhere in the codebase (check callers, guards, middleware)
- Suggestions that restate what the code already does in different words
- Generic "consider adding" advice without a concrete failure mode
Rules:
- Suppress any finding below your stated confidence floor (see your Confidence calibration section).
- Every finding MUST include at least one evidence item grounded in the actual code.
- Set pre_existing to true ONLY for issues in unchanged code that are unrelated to this diff. If the diff makes the issue newly relevant, it is NOT pre-existing.
- You are operationally read-only. You may use non-mutating inspection commands, including read-oriented `git` / `gh` commands, to gather evidence. Do not edit files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state.
- Set `autofix_class` conservatively. Use `safe_auto` only when the fix is local, deterministic, and low-risk. Use `gated_auto` when a concrete fix exists but changes behavior/contracts/permissions. Use `manual` for actionable residual work. Use `advisory` for report-only items that should not become code-fix work.
- Set `autofix_class` accurately -- not every finding is `advisory`. Use this decision guide:
- `safe_auto`: The fix is local and deterministic — the fixer can apply it mechanically without design judgment. Examples: extracting a duplicated helper, adding a missing nil/null check, fixing an off-by-one, adding a missing test for an untested code path, removing dead code.
- `gated_auto`: A concrete fix exists but it changes contracts, permissions, or crosses a module boundary in a way that deserves explicit approval. Examples: adding authentication to an unprotected endpoint, changing a public API response shape, switching from soft-delete to hard-delete.
- `manual`: Actionable work that requires design decisions or cross-cutting changes. Examples: redesigning a data model, choosing between two valid architectural approaches, adding pagination to an unbounded query.
- `advisory`: Report-only items that should not become code-fix work. Examples: noting a design asymmetry the PR improves but doesn't fully resolve, flagging a residual risk, deployment notes.
Do not default to `advisory` when uncertain -- if a concrete fix is obvious, classify it as `safe_auto` or `gated_auto`.
- Set `owner` to the default next actor for this finding: `review-fixer`, `downstream-resolver`, `human`, or `release`.
- Set `requires_verification` to true whenever the likely fix needs targeted tests, a focused re-review, or operational validation before it should be trusted.
- suggested_fix is optional. Only include it when the fix is obvious and correct. A bad suggestion is worse than none.
- If you find no issues, return an empty findings array. Still populate residual_risks and testing_gaps if applicable.
- **Intent verification:** Compare the code changes against the stated intent (and PR title/body when available). If the code does something the intent does not describe, or fails to do something the intent promises, flag it as a finding. Mismatches between stated intent and actual code are high-value findings.
</output-contract>
<pr-context>
{pr_metadata}
</pr-context>
<review-context>
Intent: {intent_summary}
@@ -52,5 +79,6 @@ Diff:
| `{diff_scope_rules}` | `references/diff-scope.md` content | Primary/secondary/pre-existing tier rules |
| `{schema}` | `references/findings-schema.json` content | The JSON schema reviewers must conform to |
| `{intent_summary}` | Stage 2 output | 2-3 line description of what the change is trying to accomplish |
| `{pr_metadata}` | Stage 1 output | PR title, body, and URL when reviewing a PR. Empty string when reviewing a branch or standalone checkout |
| `{file_list}` | Stage 1 output | List of changed files from the scope step |
| `{diff}` | Stage 1 output | The actual diff content to review |