feat(testing): close the testing gap in ce:work, ce:plan, and testing-reviewer (#438)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -336,7 +336,7 @@ For each unit, include:
- **Execution note** - optional, only when the unit benefits from a non-default execution posture such as test-first, characterization-first, or external delegation
- **Technical design** - optional pseudo-code or diagram when the unit's approach is non-obvious and prose alone would leave it ambiguous. Frame explicitly as directional guidance, not implementation specification
- **Patterns to follow** - existing code or conventions to mirror
-- **Test scenarios** - enumerate the specific test cases the implementer should write, right-sized to the unit's complexity and risk. Consider each category below and include scenarios from every category that applies to this unit. A simple config change may need one scenario; a payment flow may need a dozen. The quality signal is specificity — each scenario should name the input, action, and expected outcome so the implementer doesn't have to invent coverage.
+- **Test scenarios** - enumerate the specific test cases the implementer should write, right-sized to the unit's complexity and risk. Consider each category below and include scenarios from every category that applies to this unit. A simple config change may need one scenario; a payment flow may need a dozen. The quality signal is specificity — each scenario should name the input, action, and expected outcome so the implementer doesn't have to invent coverage. For units with no behavioral change (pure config, scaffolding, styling), use `Test expectation: none -- [reason]` instead of leaving the field blank.
- **Happy path behaviors** - core functionality with expected inputs and outputs
- **Edge cases** (when the unit has meaningful boundaries) - boundary values, empty inputs, nil/null states, concurrent access
- **Error and failure paths** (when the unit has failure modes) - invalid input, downstream service failures, timeout behavior, permission denials
@@ -496,7 +496,7 @@ deepened: YYYY-MM-DD # optional, set when the confidence check substantively st
- [Existing file, class, or pattern]
**Test scenarios:**
-<!-- Include only categories that apply to this unit. Omit categories that don't. -->
+<!-- Include only categories that apply to this unit. Omit categories that don't. For units with no behavioral change, use "Test expectation: none -- [reason]" instead of leaving this section blank. -->
- [Scenario: specific input/action -> expected outcome. Prefix with category — Happy path, Edge case, Error path, or Integration — to signal intent]
**Verification:**
@@ -622,6 +622,7 @@ Before finalizing, check:
- If test-first or characterization-first posture was explicit or strongly implied, the relevant units carry it forward with a lightweight `Execution note`
- Each feature-bearing unit has test scenarios from every applicable category (happy path, edge cases, error paths, integration) — right-sized to the unit's complexity, not padded or skimped
- Test scenarios name specific inputs, actions, and expected outcomes without becoming test code
+- Feature-bearing units with blank or missing test scenarios are flagged as incomplete — feature-bearing units must have actual test scenarios, not just an annotation. The `Test expectation: none -- [reason]` annotation is only valid for non-feature-bearing units (pure config, scaffolding, styling)
- Deferred items are explicit and not hidden as fake certainty
- If a High-Level Technical Design section is included, it uses the right medium for the work, carries the non-prescriptive framing, and does not contain implementation code (no imports, exact signatures, or framework-specific syntax)
- Per-unit technical design fields, if present, are concise and directional rather than copy-paste-ready
@@ -748,6 +749,7 @@ If the plan already has a `deepened:` date:
- Units are too large, too vague, or broken into micro-steps
- Approach notes are thin or do not name the pattern to follow
- Test scenarios are vague (don't name inputs and expected outcomes), skip applicable categories (e.g., no error paths for a unit with failure modes, no integration scenarios for a unit crossing layers), or are disproportionate to the unit's complexity
+- Feature-bearing units have blank or missing test scenarios (feature-bearing units require actual test scenarios; the `Test expectation: none` annotation is only valid for non-feature-bearing units)
- Verification outcomes are vague or not expressed as observable results
**System-Wide Impact**
@@ -151,6 +151,7 @@ Determine how to proceed based on what was provided in `<input_document>`.
- Add, update, or remove tests to match implementation changes (see Test Discovery below)
- Run System-Wide Test Check (see below)
- Run tests after changes
+- Assess testing coverage: did this task change behavior? If yes, were tests written or updated? If no tests were added, is the justification deliberate (e.g., pure config, no behavioral change)?
- Mark task as completed
- Evaluate for incremental commit (see below)
```
@@ -295,7 +296,7 @@ Determine how to proceed based on what was provided in `<input_document>`.
3. **Final Validation**
- All tasks marked completed
-- All tests pass
+- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
@@ -503,7 +504,7 @@ Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
-- [ ] Tests pass (run project's test command)
+- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
@@ -150,6 +150,7 @@ Determine how to proceed based on what was provided in `<input_document>`.
- Add, update, or remove tests to match implementation changes (see Test Discovery below)
- Run System-Wide Test Check (see below)
- Run tests after changes
+- Assess testing coverage: did this task change behavior? If yes, were tests written or updated? If no tests were added, is the justification deliberate (e.g., pure config, no behavioral change)?
- Mark task as completed
- Evaluate for incremental commit (see below)
```
@@ -286,7 +287,7 @@ Determine how to proceed based on what was provided in `<input_document>`.
3. **Final Validation**
- All tasks marked completed
-- All tests pass
+- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
@@ -430,7 +431,7 @@ Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
-- [ ] Tests pass (run project's test command)
+- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)