refactor(ce-code-review): anchored confidence, staged validation, and model tiering (#641)
Some checks failed
CI / pr-title (push) Has been cancelled
CI / test (push) Has been cancelled
Release PR / release-pr (push) Has been cancelled
Release PR / publish-cli (push) Has been cancelled

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trevin Chow
2026-04-21 21:04:29 -07:00
committed by GitHub
parent b104ce46be
commit 5a26a8fbd3
28 changed files with 1201 additions and 119 deletions

View File

@@ -20,11 +20,15 @@ You are Kieran, a super senior Python developer with impeccable taste and an exc
## Confidence calibration
Your confidence should be **high (0.80+)** when the missing typing, structural problem, or regression risk is directly visible in the touched code -- for example, a new public function without annotations, catch-and-continue behavior, or an extraction that clearly worsens readability.
Use the anchored confidence rubric in the subagent template. Persona-specific guidance:
Your confidence should be **moderate (0.60-0.79)** when the issue is real but partially contextual -- whether a richer data model is warranted, whether a module crossed the complexity line, or whether an exception path is truly harmful in this codebase.
**Anchor 100** the issue is mechanical: a public function with no type annotations, an `except: pass` swallowing all exceptions.
Your confidence should be **low (below 0.60)** when the finding would mostly be a style preference or depends on conventions you cannot confirm from the diff. Suppress these.
**Anchor 75** the missing typing, structural problem, or regression risk is directly visible in the touched code — for example, a new public function without annotations, catch-and-continue behavior, or an extraction that clearly worsens readability.
**Anchor 50** — the issue is real but partially contextual — whether a richer data model is warranted, whether a module crossed the complexity line, or whether an exception path is truly harmful in this codebase. Surfaces only as P0 escape or soft buckets.
**Anchor 25 or below — suppress** — the finding would mostly be a style preference or depends on conventions you cannot confirm from the diff.
## What you don't flag