feat: promote ce:review-beta to stable ce:review (#371)

2026-03-24 21:00:38 -07:00
parent 4e3af07962
commit 7c5ff445e3
25 changed files with 556 additions and 1137 deletions
--- a/docs/solutions/skill-design/beta-promotion-orchestration-contract.md
+++ b/docs/solutions/skill-design/beta-promotion-orchestration-contract.md
@@ -0,0 +1,44 @@
+---
+title: "Beta-to-stable promotions must update orchestration callers atomically"
+category: skill-design
+date: 2026-03-23
+module: plugins/compound-engineering/skills
+component: SKILL.md
+tags:
+  - skill-design
+  - beta-testing
+  - rollout-safety
+  - orchestration
+severity: medium
+description: "When promoting a beta skill to stable, update all orchestration callers in the same PR so they pass correct mode flags instead of inheriting defaults."
+related:
+  - docs/solutions/skill-design/beta-skills-framework.md
+---
+
+## Problem
+
+When a beta skill introduces new invocation semantics (e.g., explicit mode flags), promoting it over its stable counterpart without updating orchestration callers causes those callers to silently inherit the wrong default behavior.
+
+## Solution
+
+Treat promotion as an orchestration contract change, not a file rename.
+
+1. Replace the stable skill with the promoted content
+2. Update every workflow that invokes the skill in the same PR
+3. Hardcode the intended mode at each callsite instead of relying on the default
+4. Add or update contract tests so the orchestration assumptions are executable
+
+## Applied: ce:review-beta -> ce:review (2026-03-24)
+
+This pattern was applied when promoting `ce:review-beta` to stable. The caller contract:
+
+- `lfg` -> `/ce:review mode:autofix`
+- `slfg` parallel phase -> `/ce:review mode:report-only`
+- Contract test in `tests/review-skill-contract.test.ts` enforces these mode flags
+
+## Prevention
+
+- When a beta skill changes invocation semantics, its promotion plan must include caller updates as a first-class implementation unit
+- Promotion PRs should be atomic: promote the skill and update orchestrators in the same branch
+- Add contract coverage for the promoted callsites so future refactors cannot silently drop required mode flags
+- Do not rely on “remembering later” for orchestration mode changes; encode them in docs, plans, and tests
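
A minimal sketch of what "executable orchestration assumptions" could look like for the caller contract listed in the new doc (`lfg` uses `mode:autofix`, `slfg` uses `mode:report-only`). This is not the content of `tests/review-skill-contract.test.ts`, which this diff does not show. The names `CALLER_CONTRACT`, `reviewModes`, and `contractViolations` are hypothetical, and the check assumes each orchestrator's skill text invokes the review skill as the literal string `/ce:review mode:<name>`.

```typescript
// Hypothetical contract check; the real test file may be structured differently.
type ReviewMode = "autofix" | "report-only";

interface CallerContract {
  caller: string; // orchestrator name, e.g. "lfg"
  expectedMode: ReviewMode; // mode the caller must pass explicitly
}

// Caller contract as stated in the doc above.
const CALLER_CONTRACT: CallerContract[] = [
  { caller: "lfg", expectedMode: "autofix" },
  { caller: "slfg", expectedMode: "report-only" },
];

// Collect the mode of every /ce:review invocation in a skill's text.
// An invocation without a mode flag is recorded as null.
function reviewModes(skillText: string): (string | null)[] {
  // The lookahead keeps names such as /ce:review-beta from matching.
  const pattern = /\/ce:review(?![\w-])(?:[ \t]+mode:([a-z-]+))?/g;
  const modes: (string | null)[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(skillText)) !== null) {
    modes.push(match[1] ?? null);
  }
  return modes;
}

// Return one message per invocation that breaks the caller's contract.
function contractViolations(contract: CallerContract, skillText: string): string[] {
  const modes = reviewModes(skillText);
  if (modes.length === 0) {
    return [`${contract.caller}: no /ce:review invocation found`];
  }
  return modes
    .filter((mode) => mode !== contract.expectedMode)
    .map((mode) => {
      const found = mode === null ? "no mode flag" : `mode:${mode}`;
      return `${contract.caller}: expected mode:${contract.expectedMode}, found ${found}`;
    });
}
```

In a test, each orchestrator's skill file would be read from disk and passed through `contractViolations`, and the test would fail on any non-empty result. An invocation that drops the flag is then reported instead of silently inheriting the default.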
--- a/docs/solutions/skill-design/beta-skills-framework.md
+++ b/docs/solutions/skill-design/beta-skills-framework.md
@@ -13,7 +13,7 @@ severity: medium
 description: "Pattern for trialing new skill versions alongside stable ones using a -beta suffix. Covers naming, plan file naming, internal references, and promotion path."
 related:
  - docs/solutions/skill-design/compound-refresh-skill-improvements.md
-  - docs/solutions/skill-design/review-skill-promotion-orchestration-contract.md
+  - docs/solutions/skill-design/beta-promotion-orchestration-contract.md
 ---

 ## Problem
@@ -80,7 +80,7 @@ When the beta version is validated:
 8. Verify `lfg`/`slfg` work with the promoted skill
 9. Verify `ce:work` consumes plans from the promoted skill

-If the beta skill changed its invocation contract, promotion must also update all orchestration callers in the same PR instead of relying on the stable default behavior. See [review-skill-promotion-orchestration-contract.md](./review-skill-promotion-orchestration-contract.md) for the concrete review-skill example.
+If the beta skill changed its invocation contract, promotion must also update all orchestration callers in the same PR instead of relying on the stable default behavior. See [beta-promotion-orchestration-contract.md](./beta-promotion-orchestration-contract.md) for the concrete review-skill example.

 ## Validation

--- a/docs/solutions/skill-design/review-skill-promotion-orchestration-contract.md
+++ b/docs/solutions/skill-design/review-skill-promotion-orchestration-contract.md
@@ -1,80 +0,0 @@
----
-title: "Promoting review-beta to stable must update orchestration callers in the same change"
-category: skill-design
-date: 2026-03-23
-module: plugins/compound-engineering/skills
-component: SKILL.md
-tags:
-  - skill-design
-  - beta-testing
-  - rollout-safety
-  - orchestration
-  - review-workflow
-severity: medium
-description: "When ce:review-beta is promoted to stable, update lfg/slfg in the same PR so they pass the correct mode instead of inheriting the interactive default."
-related:
-  - docs/solutions/skill-design/beta-skills-framework.md
-  - docs/plans/2026-03-23-001-feat-ce-review-beta-pipeline-mode-beta-plan.md
----
-
-## Problem
-
-`ce:review-beta` introduces an explicit mode contract:
-
-- default `interactive`
-- `mode:autonomous`
-- `mode:report-only`
-
-That is correct for direct user invocation, but it creates a promotion hazard. If the beta skill is later promoted over stable `ce:review` without updating its orchestration callers, the surrounding workflows will silently inherit the interactive default.
-
-For the current review workflow family, that would be wrong:
-
-- `lfg` should run review in `mode:autonomous`
-- `slfg` should run review in `mode:report-only` during its parallel review/browser phase
-
-Without those caller changes, promotion would keep the skill name stable while changing its contract, which is exactly the kind of boundary drift that tends to escape manual review.
-
-## Solution
-
-Treat promotion as an orchestration contract change, not a file rename.
-
-When promoting `ce:review-beta` to stable:
-
-1. Replace stable `ce:review` with the promoted content
-2. Update every workflow that invokes `ce:review` in the same PR
-3. Hardcode the intended mode at each callsite instead of relying on the default
-4. Add or update contract tests so the orchestration assumptions are executable
-
-For the review workflow family, the expected caller contract is:
-
-- `lfg` -> `ce:review mode:autonomous`
-- `slfg` parallel phase -> `ce:review mode:report-only`
-- any mutating review step in `slfg` must happen later, sequentially, or in an isolated checkout/worktree
-
-## Why This Lives Here
-
-This is not a good `AGENTS.md` note:
-
-- it is specific to one beta-to-stable promotion
-- it is easy for a temporary repo-global reminder to become stale
-- future planning and review work is more likely to search `docs/solutions/skill-design/` than to rediscover an old ad hoc note in `AGENTS.md`
-
-The durable memory should live with the other skill-design rollout patterns.
-
-## Prevention
-
-- When a beta skill changes invocation semantics, its promotion plan must include caller updates as a first-class implementation unit
-- Promotion PRs should be atomic: promote the skill and update orchestrators in the same branch
-- Add contract coverage for the promoted callsites so future refactors cannot silently drop required mode flags
-- Do not rely on “remembering later” for orchestration mode changes; encode them in docs, plans, and tests
-
-## Lifecycle Note
-
-This note is intentionally tied to the `ce:review-beta` -> `ce:review` promotion window.
-
-Once that promotion is complete and the stable orchestrators/tests already encode the contract:
-
-- update or archive this doc if it no longer adds distinct value
-- do not leave it behind as a stale reminder for a promotion that already happened
-
-If the final stable design differs from the current expectation, revise this doc during the promotion PR so the historical note matches what actually shipped.
--- a/docs/solutions/workflow/todo-status-lifecycle.md
+++ b/docs/solutions/workflow/todo-status-lifecycle.md
@@ -11,7 +11,6 @@ tags:
 related_components:
  - plugins/compound-engineering/skills/todo-resolve/
  - plugins/compound-engineering/skills/ce-review/
-  - plugins/compound-engineering/skills/ce-review-beta/
  - plugins/compound-engineering/skills/todo-triage/
  - plugins/compound-engineering/skills/todo-create/
 problem_type: correctness-gap
@@ -21,12 +20,11 @@ problem_type: correctness-gap

 ## Problem

-The todo system defines a three-state lifecycle (`pending` -> `ready` -> `complete`) across three skills (`todo-create`, `todo-triage`, `todo-resolve`). Two review skills create todos with different status assumptions:
+The todo system defines a three-state lifecycle (`pending` -> `ready` -> `complete`) across three skills (`todo-create`, `todo-triage`, `todo-resolve`). Different sources create todos with different status assumptions:

 | Source | Status created | Reasoning |
 |--------|---------------|-----------|
-| `ce:review` | `pending` | Dumps all findings, expects separate `/todo-triage` |
-| `ce:review-beta` | `ready` | Built-in triage: confidence gating (>0.60), merge/dedup across 8 personas, owner routing. Only creates todos for `downstream-resolver` findings |
+| `ce:review` (autofix mode) | `ready` | Built-in triage: confidence gating (>0.60), merge/dedup across 8 personas, owner routing. Only creates todos for `downstream-resolver` findings |
 | `todo-create` (manual) | `pending` (default) | Template default |
 | `test-browser`, `test-xcode` | via `todo-create` | Inherit default |

@@ -46,9 +44,9 @@ Updated `todo-resolve` to partition todos by status in its Analyze step:

 This is a single-file change scoped to `todo-resolve/SKILL.md`. No schema changes, no new fields, no changes to `todo-create` or `todo-triage` -- just enforcement of the existing contract at the resolve boundary.

-## Key Insight: Review-Beta Promotion Eliminates Automated `pending`
+## Key Insight: No Automated Source Creates `pending` Todos

-Once `ce:review-beta` is promoted to stable (replacing `ce:review`), no automated source creates `pending` todos. The `pending` status becomes exclusively a human-authored state for manually created work items that need triage before action.
+No automated source creates `pending` todos. The `pending` status is exclusively a human-authored state for manually created work items that need triage before action.

 The safety model becomes:
 - **`ready`** = autofix-eligible. Triage already happened upstream (either built into the review pipeline or via explicit `/todo-triage`).
@@ -76,6 +74,6 @@ When a skill creates artifacts for downstream consumption, it should state which

 ## Cross-References

-- [review-skill-promotion-orchestration-contract.md](../skill-design/review-skill-promotion-orchestration-contract.md) -- promotion hazard: if mode flags are dropped during promotion, the wrong artifacts are produced upstream
+- [beta-promotion-orchestration-contract.md](../skill-design/beta-promotion-orchestration-contract.md) -- promotion hazard: if mode flags are dropped during promotion, the wrong artifacts are produced upstream
 - [compound-refresh-skill-improvements.md](../skill-design/compound-refresh-skill-improvements.md) -- "conservative confidence in autonomous mode" principle that motivates status enforcement
 - [claude-permissions-optimizer-classification-fix.md](../skill-design/claude-permissions-optimizer-classification-fix.md) -- "pipeline ordering is an architectural invariant" pattern; the same concept applies to the review -> triage -> resolve pipeline
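
The status partition that the `todo-status-lifecycle.md` hunks describe for `todo-resolve` can be illustrated with a short sketch. `todo-resolve` is a `SKILL.md` instruction file, so this is not its implementation. The `Todo` shape, the bucket names, and `partitionByStatus` are hypothetical, and the only behavior assumed from the doc is that `ready` todos are autofix-eligible while `pending` todos need triage first.

```typescript
// Hypothetical model of the three-state lifecycle: pending -> ready -> complete.
type TodoStatus = "pending" | "ready" | "complete";

interface Todo {
  id: string;
  status: TodoStatus;
}

interface TodoPartition {
  resolvable: Todo[]; // `ready`: triage already happened upstream
  needsTriage: Todo[]; // `pending`: must go through triage before action
  done: Todo[]; // `complete`: nothing left to do
}

// Split todos by status so that only `ready` items reach the resolve step.
function partitionByStatus(todos: Todo[]): TodoPartition {
  const partition: TodoPartition = { resolvable: [], needsTriage: [], done: [] };
  for (const todo of todos) {
    switch (todo.status) {
      case "ready":
        partition.resolvable.push(todo);
        break;
      case "pending":
        partition.needsTriage.push(todo);
        break;
      case "complete":
        partition.done.push(todo);
        break;
    }
  }
  return partition;
}
```

Keeping `pending` items in their own bucket, rather than dropping them, lets the resolve step report what still needs triage without acting on it.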