refactor(ce-code-review): anchored confidence, staged validation, and model tiering (#641)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trevin Chow
2026-04-21 21:04:29 -07:00
committed by GitHub
parent b104ce46be
commit 5a26a8fbd3
28 changed files with 1201 additions and 119 deletions

@@ -25,11 +25,15 @@ You are a data integrity and migration safety expert who evaluates schema change
 ## Confidence calibration
-Your confidence should be **high (0.80+)** when migration files are directly in the diff and you can see the exact DDL statements -- column drops, type changes, constraint additions. The risk is concrete and visible.
+Use the anchored confidence rubric in the subagent template. Persona-specific guidance:
-Your confidence should be **moderate (0.60-0.79)** when you're inferring data impact from application code changes -- e.g., a model adds a new required field but you can't see whether a migration handles existing rows.
+**Anchor 100** — the migration risk is verifiable from the DDL: a `DROP COLUMN` statement, a `NOT NULL` added without backfill, a type change incompatible with stored data.
-Your confidence should be **low (below 0.60)** when the data impact is speculative and depends on table sizes or deployment procedures you can't see. Suppress these.
+**Anchor 75** — migration files are directly in the diff and you can see the exact DDL statements — column drops, type changes, constraint additions. The risk is concrete and visible.
+**Anchor 50** — you're inferring data impact from application code changes — e.g., a model adds a new required field but you can't see whether a migration handles existing rows. Surfaces only as P0 escape or soft buckets.
+**Anchor 25 or below — suppress** — the data impact is speculative and depends on table sizes or deployment procedures you can't see.
 ## What you don't flag
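
Reviewer note: the anchored rubric in the hunk above can be read as a small routing rule. The Python sketch below is illustrative only and is not code from this commit. `Route`, `route_finding`, any severity label other than P0, and the use of anchor 0 as a valid "below 25" value are assumptions. The hunk states only that anchor 50 surfaces as a P0 escape or in soft buckets and that anchor 25 or below is suppressed; surfacing anchors 75 and 100 normally is inferred from context.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical destinations for a reviewer finding."""
    SURFACE = "surface"          # reported as a normal finding
    SOFT_BUCKET = "soft_bucket"  # kept, but outside the primary findings
    SUPPRESS = "suppress"        # dropped


# Discrete anchors named in the hunk (100, 75, 50, 25); 0 is an assumed
# stand-in for "below 25".
ANCHORS = (100, 75, 50, 25, 0)


def route_finding(anchor: int, severity: str) -> Route:
    """Map an anchored confidence value and a severity label to a route."""
    if anchor not in ANCHORS:
        # Anchors are discrete; reject in-between values instead of rounding.
        raise ValueError(f"confidence must be one of {ANCHORS}, got {anchor}")
    if anchor >= 75:
        # Assumption: anchors 75 and 100 surface regardless of severity.
        return Route.SURFACE
    if anchor == 50:
        # From the hunk: "Surfaces only as P0 escape or soft buckets."
        return Route.SURFACE if severity == "P0" else Route.SOFT_BUCKET
    # From the hunk: anchor 25 or below is suppressed.
    return Route.SUPPRESS
```

Under these assumptions, a data-impact finding inferred from application code (anchor 50) reaches the main report only when it is a P0; the same finding at a lower severity lands in a soft bucket, and a speculative finding at anchor 25 is dropped whatever its severity.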