feat(ce-work-beta): add beta Codex delegation mode (#476)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trevin Chow
2026-04-09 00:29:12 -07:00
committed by GitHub
parent 044a035e77
commit 31b0686c2e
15 changed files with 1549 additions and 1889 deletions

View File

@@ -191,7 +191,6 @@ Look for signals such as:
- The user explicitly asks for TDD, test-first, or characterization-first work
- The origin document calls for test-first implementation or exploratory hardening of legacy code
- Local research shows the target area is legacy, weakly tested, or historically fragile, suggesting characterization coverage before changing behavior
- The user asks for external delegation, says "use codex", "delegate mode", or mentions token conservation -- add `Execution target: external-delegate` to implementation units that are pure code writing
When the signal is clear, carry it forward silently in the relevant implementation units.
@@ -357,7 +356,7 @@ For each unit, include:
- **Dependencies** - what must exist first
- **Files** - repo-relative file paths to create, modify, or test (never absolute paths)
- **Approach** - key decisions, data flow, component boundaries, or integration notes
- **Execution note** - optional, only when the unit benefits from a non-default execution posture such as test-first, characterization-first, or external delegation
- **Execution note** - optional, only when the unit benefits from a non-default execution posture such as test-first or characterization-first
- **Technical design** - optional pseudo-code or diagram when the unit's approach is non-obvious and prose alone would leave it ambiguous. Frame explicitly as directional guidance, not implementation specification
- **Patterns to follow** - existing code or conventions to mirror
- **Test scenarios** - enumerate the specific test cases the implementer should write, right-sized to the unit's complexity and risk. Consider each category below and include scenarios from every category that applies to this unit. A simple config change may need one scenario; a payment flow may need a dozen. The quality signal is specificity — each scenario should name the input, action, and expected outcome so the implementer doesn't have to invent coverage. For units with no behavioral change (pure config, scaffolding, styling), use `Test expectation: none -- [reason]` instead of leaving the field blank.
@@ -373,7 +372,6 @@ Use `Execution note` sparingly. Good uses include:
- `Execution note: Start with a failing integration test for the request/response contract.`
- `Execution note: Add characterization coverage before modifying this legacy parser.`
- `Execution note: Implement new domain behavior test-first.`
- `Execution note: Execution target: external-delegate`
Do not expand units into literal `RED/GREEN/REFACTOR` substeps.
@@ -512,7 +510,7 @@ deepened: YYYY-MM-DD # optional, set when the confidence check substantively st
**Approach:**
- [Key design or sequencing decision]
**Execution note:** [Optional test-first, characterization-first, external-delegate, or other execution posture signal]
**Execution note:** [Optional test-first, characterization-first, or other execution posture signal]
**Technical design:** *(optional -- pseudo-code or diagram when the unit's approach is non-obvious. Directional guidance, not implementation specification.)*