release: v2.35.1 — add system-wide test check to /workflows:work

- Add System-Wide Test Check to work command task execution loop (5 questions: callbacks, real chain coverage, orphaned state, API parity, error alignment)
- Add integration test guidance to Test Continuously section
- Add System-Wide Impact sections to plan templates (MORE + A LOT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kieran Klaassen
2026-02-18 21:50:51 -08:00
parent d53ef1e837
commit 174cd4cff4
4 changed files with 56 additions and 1 deletions

@@ -1,6 +1,6 @@
{
"name": "compound-engineering",
"version": "2.35.0",
"version": "2.35.1",
"description": "AI-powered development tools. 29 agents, 22 commands, 19 skills, 1 MCP server for code review, research, design, and workflow automation.",
"author": {
"name": "Kieran Klaassen",

@@ -5,6 +5,15 @@ All notable changes to the compound-engineering plugin will be documented in thi
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [2.35.1] - 2026-02-18
### Changed
- **`/workflows:work` system-wide test check** — Added "System-Wide Test Check" to the task execution loop. Before marking a task done, forces five questions: what callbacks/middleware fire when this runs? Do tests exercise the real chain or just mocked isolation? Can failure leave orphaned state? What other interfaces need the same change? Do error strategies align across layers? Includes skip criteria for leaf-node changes. Also added integration test guidance to the "Test Continuously" section.
- **`/workflows:plan` system-wide impact templates** — Added "System-Wide Impact" section to MORE and A LOT plan templates (interaction graph, error propagation, state lifecycle, API surface parity, integration test scenarios) as lightweight prompts to flag risks during planning.
---
## [2.35.0] - 2026-02-17
### Fixed

@@ -255,6 +255,14 @@ date: YYYY-MM-DD
- Performance implications
- Security considerations
## System-Wide Impact
- **Interaction graph**: [What callbacks/middleware/observers fire when this runs?]
- **Error propagation**: [How do errors flow across layers? Do retry strategies align?]
- **State lifecycle risks**: [Can partial failure leave orphaned/inconsistent state?]
- **API surface parity**: [What other interfaces expose similar functionality and need the same change?]
- **Integration test scenarios**: [Cross-layer scenarios that unit tests won't catch]
## Acceptance Criteria
- [ ] Detailed requirement 1
@@ -344,6 +352,28 @@ date: YYYY-MM-DD
[Other solutions evaluated and why rejected]
## System-Wide Impact
### Interaction Graph
[Map the chain reaction: what callbacks, middleware, observers, and event handlers fire when this code runs? Trace at least two levels deep. Document: "Action X triggers Y, which calls Z, which persists W."]
### Error & Failure Propagation
[Trace errors from lowest layer up. List specific error classes and where they're handled. Identify retry conflicts, unhandled error types, and silent failure swallowing.]
### State Lifecycle Risks
[Walk through each step that persists state. Can partial failure orphan rows, duplicate records, or leave caches stale? Document cleanup mechanisms or their absence.]
### API Surface Parity
[List all interfaces (classes, DSLs, endpoints) that expose equivalent functionality. Note which need updating and which share the code path.]
### Integration Test Scenarios
[3-5 cross-layer test scenarios that unit tests with mocks would never catch. Include expected behavior for each.]
## Acceptance Criteria
### Functional Requirements

@@ -92,12 +92,27 @@ This command takes a work document (plan, specification, or todo file) and execu
- Look for similar patterns in codebase
- Implement following existing conventions
- Write tests for new functionality
- Run System-Wide Test Check (see below)
- Run tests after changes
- Mark task as completed in TodoWrite
- Mark off the corresponding checkbox in the plan file ([ ] → [x])
- Evaluate for incremental commit (see below)
```
**System-Wide Test Check** — Before marking a task done, pause and ask:
| Question | What to do |
|----------|------------|
| **What fires when this runs?** Callbacks, middleware, observers, event handlers — trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, `after_*` hooks. |
| **Do my tests exercise the real chain?** If every dependency is mocked, the test proves your logic works *in isolation* — it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
| **Can failure leave orphaned state?** If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
| **What other interfaces expose this?** Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now — not as a follow-up. |
| **Do error strategies align across layers?** Retry middleware + application fallback + framework error handling — do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |
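As a rough illustration of the "real chain" question in the table above, here is a minimal Python sketch. Every name in it (`AuditLog`, `Account`, `deposit`, the two test functions) is hypothetical and not part of this plugin or of any framework. It only contrasts what a fully mocked test can show with what a test using real objects can show.

```python
from unittest.mock import Mock


class AuditLog:
    """Hypothetical collaborator that records side effects."""

    def __init__(self):
        self.entries = []

    def record(self, message):
        self.entries.append(message)


class Account:
    """Hypothetical model whose save() fires an after-save callback."""

    def __init__(self, audit_log):
        self.audit_log = audit_log
        self.balance = 0

    def save(self):
        # Persisting triggers the callback, similar to an `after_*` hook.
        self._after_save()

    def _after_save(self):
        self.audit_log.record(f"balance={self.balance}")


def deposit(account, amount):
    """Code under test: change state, then persist it."""
    account.balance += amount
    account.save()


def mocked_test():
    # The account is a mock, so this only proves that save() was called.
    # The callback never runs and the audit log is never touched.
    account = Mock(balance=0)
    deposit(account, 50)
    account.save.assert_called_once()
    return account.balance


def integration_test():
    # Real objects: the callback actually fires and its effect is observable.
    log = AuditLog()
    account = Account(log)
    deposit(account, 50)
    return log.entries
```

Under these assumptions, `mocked_test()` passes even if `_after_save` is broken or removed, while `integration_test()` fails as soon as the callback stops writing the expected audit entry.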
**When to skip:** Leaf-node changes with no callbacks, no state persistence, no parallel interfaces. If the change is purely additive (new helper method, new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."
**When this matters most:** Any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.
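The orphaned-state question can be sketched the same way. The Python example below is an assumption-laden illustration, not code from this repository: `ServiceDown`, `FlakyGateway`, `place_order`, and the dict used as a stand-in database are all hypothetical. It shows state being persisted before a risky call, cleanup on failure, and a retry that is idempotent.

```python
class ServiceDown(Exception):
    """Hypothetical error raised by the external service."""


class FlakyGateway:
    """Hypothetical service: fails the first `failures` calls, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.charged = []

    def charge(self, order_id):
        if self.failures > 0:
            self.failures -= 1
            raise ServiceDown(order_id)
        self.charged.append(order_id)


def place_order(db, gateway, order_id):
    """Persist a row, call the risky service, clean up on failure."""
    if db.get(order_id) == "paid":
        return  # idempotent: a retry after success does nothing
    db[order_id] = "pending"  # state is created before the risky call
    try:
        gateway.charge(order_id)
    except ServiceDown:
        del db[order_id]  # cleanup so the failure leaves no orphaned row
        raise
    db[order_id] = "paid"


def run_failure_then_retry():
    """Exercise the failure path, then retry twice with the same objects."""
    db, gateway = {}, FlakyGateway(failures=1)
    try:
        place_order(db, gateway, "order-1")
    except ServiceDown:
        pass
    after_failure = dict(db)  # snapshot taken right after the failed attempt
    place_order(db, gateway, "order-1")  # retry succeeds
    place_order(db, gateway, "order-1")  # second retry is a no-op
    return after_failure, db, gateway.charged
```

In this sketch the snapshot taken after the failed attempt is empty, and the gateway records exactly one charge despite two retries. Those are the two properties the check asks you to verify: failure cleans up, and retry does not duplicate.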
**IMPORTANT**: Always update the original plan document by checking off completed items. Use the Edit tool to change `- [ ]` to `- [x]` for each task you finish. This keeps the plan a living record of progress and ensures no completed task is left unchecked.
2. **Incremental Commits**
@@ -143,6 +158,7 @@ This command takes a work document (plan, specification, or todo file) and execu
- Don't wait until the end to test
- Fix failures immediately
- Add new tests for new functionality
- **Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together.** If your change touches callbacks, middleware, or error handling — you need both.
5. **Figma Design Sync** (if applicable)