feat(ce-plan): reduce token usage by extracting conditional references (#489)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -93,6 +93,10 @@ When adding or modifying skills, verify compliance with the skill spec:

This resolves relative to the SKILL.md and substitutes content before the model sees it. If a file is over ~150 lines, prefer a backtick path even if it is always needed.
- [ ] For files the agent needs to *execute* (scripts, shell templates), always use backtick paths -- `@` would inline the script as text content instead of keeping it as an executable file

### Conditional and Late-Sequence Extraction

Skill content loaded at trigger time is carried in every subsequent message — every tool call, agent dispatch, and response. This carrying cost compounds across the session. For skills that orchestrate many tool or agent calls, extract blocks to `references/` when they are conditional (only execute under specific conditions) or late-sequence (only needed after many prior calls) and represent a meaningful share of the skill (~20%+). The more tool/agent calls a skill makes, the more aggressively to extract. Replace extracted blocks with a 1-3 line stub stating the condition and a backtick path reference (e.g., "Read `references/deepening-workflow.md`"). Never use `@` for extracted blocks — it inlines content at load time, defeating the extraction.
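
An extracted conditional block collapses to a stub shaped like the following (the condition wording and file path here are illustrative, not prescribed):

```text
##### Deepening Execution

When deepening is warranted, read `references/deepening-workflow.md`
and execute its steps, then return here for the next phase.
```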

### Writing Style

- [ ] Use imperative/infinitive form (verb-first instructions)
@@ -587,35 +587,7 @@ For larger `Deep` plans, extend the core template only when useful with sections

#### 4.4 Visual Communication in Plan Documents

Section 3.4 covers diagrams about the *solution being planned* (pseudo-code, mermaid sequences, state diagrams). The existing Section 4.3 mermaid rule encourages those solution-design diagrams within Technical Design and per-unit fields. This guidance covers a different concern: visual aids that help readers *navigate and comprehend the plan document itself* -- dependency graphs, interaction diagrams, and comparison tables that make plan structure scannable.

Visual aids are conditional on content patterns, not on plan depth classification -- a Lightweight plan about a complex multi-unit workflow may warrant a dependency graph; a Deep plan about a straightforward feature may not.

**When to include:**

| Plan describes... | Visual aid | Placement |
|---|---|---|
| 4+ implementation units with non-linear dependencies (parallelism, diamonds, fan-in/fan-out) | Mermaid dependency graph | Before or after the Implementation Units heading |
| System-Wide Impact naming 3+ interacting surfaces or cross-layer effects | Mermaid interaction or component diagram | Within the System-Wide Impact section |
| Problem/Overview involving 3+ behavioral modes, states, or variants | Markdown comparison table | Within Overview or Problem Frame |
| Key Technical Decisions with 3+ interacting decisions, or Alternative Approaches with 3+ alternatives | Markdown comparison table | Within the relevant section |

**When to skip:**
- The plan has 3 or fewer units in a straight dependency chain -- the Dependencies field on each unit is sufficient
- Prose already communicates the relationships clearly
- The visual would duplicate what the High-Level Technical Design section already shows
- The visual describes code-level detail (specific method names, SQL columns, API field lists)

**Format selection:**
- **Mermaid** (default) for dependency graphs and interaction diagrams -- 5-15 nodes, no in-box annotations, standard flowchart shapes. Use `TB` (top-to-bottom) direction so diagrams stay narrow in both rendered and source form. Source should be readable as fallback in diff views and terminals.
- **ASCII/box-drawing diagrams** for annotated flows that need rich in-box content -- file path layouts, decision logic branches, multi-column spatial arrangements. More expressive than mermaid when the diagram's value comes from annotations within nodes. Follow 80-column max for code blocks, use vertical stacking.
- **Markdown tables** for mode/variant comparisons and decision/approach comparisons.
- Keep diagrams proportionate to the plan. A 6-unit linear chain gets a simple 6-node graph. A complex dependency graph with fan-out and fan-in may need 10-15 nodes -- that is fine if every node earns its place.
- Place inline at the point of relevance, not in a separate section.
- Plan-structure level only -- unit dependencies, component interactions, mode comparisons, impact surfaces. Not implementation architecture, data schemas, or code structure (those belong in Section 3.4).
- Prose is authoritative: when a visual aid and its surrounding prose disagree, the prose governs.
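
For illustration only, a dependency graph following these rules -- `TB` direction, no in-box annotations, a diamond with fan-in -- might look like the following (the unit names are hypothetical):

```mermaid
flowchart TB
    U1[Unit 1: schema change] --> U2[Unit 2: service layer]
    U1 --> U3[Unit 3: background job]
    U2 --> U4[Unit 4: API endpoint]
    U3 --> U4
    U4 --> U5[Unit 5: integration tests]
```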

After generating a visual aid, verify it accurately represents the plan sections it illustrates -- correct dependency edges, no missing surfaces, no merged units.

When the plan contains 4+ implementation units with non-linear dependencies, 3+ interacting surfaces in System-Wide Impact, 3+ behavioral modes/variants in Overview or Problem Frame, or 3+ interacting decisions in Key Technical Decisions or alternatives in Alternative Approaches, read `references/visual-communication.md` for diagram and table guidance. This covers plan-structure visuals (dependency graphs, interaction diagrams, comparison tables) — not solution-design diagrams, which are covered in Section 3.4.

### Phase 5: Final Review, Write File, and Handoff
@@ -701,323 +673,12 @@ Build a risk profile. Treat these as high-risk signals:

If the plan already appears sufficiently grounded and the thin-grounding override does not apply, report "Confidence check passed — no sections need strengthening" and skip to Phase 5.3.8 (Document Review). Document-review always runs regardless of whether deepening was needed — the two tools catch different classes of issues.

##### 5.3.3 Score Confidence Gaps
##### 5.3.3–5.3.7 Deepening Execution

Use a checklist-first, risk-weighted scoring pass.

When deepening is warranted, read `references/deepening-workflow.md` for confidence scoring checklists, section-to-agent dispatch mapping, execution mode selection, research execution, interactive finding review, and plan synthesis instructions. Execute steps 5.3.3 through 5.3.7 from that file, then return here for 5.3.8.

For each section, compute:
- **Trigger count** - number of checklist problems that apply
- **Risk bonus** - add 1 if the topic is high-risk and this section is materially relevant to that risk
- **Critical-section bonus** - add 1 for `Key Technical Decisions`, `Implementation Units`, `System-Wide Impact`, `Risks & Dependencies`, or `Open Questions` in `Standard` or `Deep` plans

##### 5.3.8–5.4 Document Review, Final Checks, and Post-Generation Options

Treat a section as a candidate if:
- it hits **2+ total points**, or
- it hits **1+ point** in a high-risk domain and the section is materially important

Choose only the top **2-5** sections by score. If deepening a lightweight plan (high-risk exception), cap at **1-2** sections.

If the plan already has a `deepened:` date:
- Prefer sections that have not yet been substantially strengthened, if their scores are comparable
- Revisit an already-deepened section only when it still scores clearly higher than alternatives

**Section Checklists:**

**Requirements Trace**
- Requirements are vague or disconnected from implementation units
- Success criteria are missing or not reflected downstream
- Units do not clearly advance the traced requirements
- Origin requirements are not clearly carried forward

**Context & Research / Sources & References**
- Relevant repo patterns are named but never used in decisions or implementation units
- Cited learnings or references do not materially shape the plan
- High-risk work lacks appropriate external or internal grounding
- Research is generic instead of tied to this repo or this plan

**Key Technical Decisions**
- A decision is stated without rationale
- Rationale does not explain tradeoffs or rejected alternatives
- The decision does not connect back to scope, requirements, or origin context
- An obvious design fork exists but the plan never addresses why one path won

**Open Questions**
- Product blockers are hidden as assumptions
- Planning-owned questions are incorrectly deferred to implementation
- Resolved questions have no clear basis in repo context, research, or origin decisions
- Deferred items are too vague to be useful later

**High-Level Technical Design (when present)**
- The sketch uses the wrong medium for the work
- The sketch contains implementation code rather than pseudo-code
- The non-prescriptive framing is missing or weak
- The sketch does not connect to the key technical decisions or implementation units

**High-Level Technical Design (when absent)** *(Standard or Deep plans only)*
- The work involves DSL design, API surface design, multi-component integration, complex data flow, or state-heavy lifecycle
- Key technical decisions would be easier to validate with a visual or pseudo-code representation
- The approach section of implementation units is thin and a higher-level technical design would provide context

**Implementation Units**
- Dependency order is unclear or likely wrong
- File paths or test file paths are missing where they should be explicit
- Units are too large, too vague, or broken into micro-steps
- Approach notes are thin or do not name the pattern to follow
- Test scenarios are vague (don't name inputs and expected outcomes), skip applicable categories (e.g., no error paths for a unit with failure modes, no integration scenarios for a unit crossing layers), or are disproportionate to the unit's complexity
- Feature-bearing units have blank or missing test scenarios (feature-bearing units require actual test scenarios; the `Test expectation: none` annotation is only valid for non-feature-bearing units)
- Verification outcomes are vague or not expressed as observable results

**System-Wide Impact**
- Affected interfaces, callbacks, middleware, entry points, or parity surfaces are missing
- Failure propagation is underexplored
- State lifecycle, caching, or data integrity risks are absent where relevant
- Integration coverage is weak for cross-layer work

**Risks & Dependencies / Documentation / Operational Notes**
- Risks are listed without mitigation
- Rollout, monitoring, migration, or support implications are missing when warranted
- External dependency assumptions are weak or unstated
- Security, privacy, performance, or data risks are absent where they obviously apply

Use the plan's own `Context & Research` and `Sources & References` as evidence. If those sections cite a pattern, learning, or risk that never affects decisions, implementation units, or verification, treat that as a confidence gap.

##### 5.3.4 Report and Dispatch Targeted Research

Before dispatching agents, report what sections are being strengthened and why:

```text
Strengthening [section names] — [brief reason for each, e.g., "decision rationale is thin", "cross-boundary effects aren't mapped"]
```

For each selected section, choose the smallest useful agent set. Do **not** run every agent. Use at most **1-3 agents per section** and usually no more than **8 agents total**.

Use fully-qualified agent names inside Task calls.

**Deterministic Section-to-Agent Mapping:**

**Requirements Trace / Open Questions classification**
- `compound-engineering:workflow:spec-flow-analyzer` for missing user flows, edge cases, and handoff gaps
- `compound-engineering:research:repo-research-analyst` (Scope: `architecture, patterns`) for repo-grounded patterns, conventions, and implementation reality checks

**Context & Research / Sources & References gaps**
- `compound-engineering:research:learnings-researcher` for institutional knowledge and past solved problems
- `compound-engineering:research:framework-docs-researcher` for official framework or library behavior
- `compound-engineering:research:best-practices-researcher` for current external patterns and industry guidance
- Add `compound-engineering:research:git-history-analyzer` only when historical rationale or prior art is materially missing

**Key Technical Decisions**
- `compound-engineering:review:architecture-strategist` for design integrity, boundaries, and architectural tradeoffs
- Add `compound-engineering:research:framework-docs-researcher` or `compound-engineering:research:best-practices-researcher` when the decision needs external grounding beyond repo evidence

**High-Level Technical Design**
- `compound-engineering:review:architecture-strategist` for validating that the technical design accurately represents the intended approach and identifying gaps
- `compound-engineering:research:repo-research-analyst` (Scope: `architecture, patterns`) for grounding the technical design in existing repo patterns and conventions
- Add `compound-engineering:research:best-practices-researcher` when the technical design involves a DSL, API surface, or pattern that benefits from external validation

**Implementation Units / Verification**
- `compound-engineering:research:repo-research-analyst` (Scope: `patterns`) for concrete file targets, patterns to follow, and repo-specific sequencing clues
- `compound-engineering:review:pattern-recognition-specialist` for consistency, duplication risks, and alignment with existing patterns
- Add `compound-engineering:workflow:spec-flow-analyzer` when sequencing depends on user flow or handoff completeness

**System-Wide Impact**
- `compound-engineering:review:architecture-strategist` for cross-boundary effects, interface surfaces, and architectural knock-on impact
- Add the specific specialist that matches the risk:
  - `compound-engineering:review:performance-oracle` for scalability, latency, throughput, and resource-risk analysis
  - `compound-engineering:review:security-sentinel` for auth, validation, exploit surfaces, and security boundary review
  - `compound-engineering:review:data-integrity-guardian` for migrations, persistent state safety, consistency, and data lifecycle risks

**Risks & Dependencies / Operational Notes**
- Use the specialist that matches the actual risk:
  - `compound-engineering:review:security-sentinel` for security, auth, privacy, and exploit risk
  - `compound-engineering:review:data-integrity-guardian` for persistent data safety, constraints, and transaction boundaries
  - `compound-engineering:review:data-migration-expert` for migration realism, backfills, and production data transformation risk
  - `compound-engineering:review:deployment-verification-agent` for rollout checklists, rollback planning, and launch verification
  - `compound-engineering:review:performance-oracle` for capacity, latency, and scaling concerns

**Agent Prompt Shape:**

For each selected section, pass:
- The scope prefix from the mapping above when the agent supports scoped invocation
- A short plan summary
- The exact section text
- Why the section was selected, including which checklist triggers fired
- The plan depth and risk profile
- A specific question to answer

Instruct the agent to return:
- findings that change planning quality
- stronger rationale, sequencing, verification, risk treatment, or references
- no implementation code
- no shell commands
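
Assembled from that shape, a dispatch prompt might read as follows. This is a hypothetical sketch; the section excerpt, trigger names, and question are invented:

```text
Agent: compound-engineering:review:architecture-strategist
Scope: architecture
Plan summary: <2-3 sentence summary of the plan>
Section text: <exact Key Technical Decisions text>
Why selected: 2 checklist triggers (decision stated without rationale;
  no rejected alternatives) plus critical-section bonus
Depth/risk: Standard plan, high-risk (auth boundary)
Question: Do the stated decisions hold up against the repo's existing
  middleware boundaries, and what tradeoffs are left unstated?
Return: findings that change planning quality; stronger rationale or
  references; no implementation code; no shell commands.
```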

##### 5.3.5 Choose Research Execution Mode

Use the lightest mode that will work:

- **Direct mode** - Default. Use when the selected section set is small and the parent can safely read the agent outputs inline.
- **Artifact-backed mode** - Use only when the selected research scope is large enough that inline returns would create unnecessary context pressure.

Signals that justify artifact-backed mode:
- More than 5 agents are likely to return meaningful findings
- The selected section excerpts are long enough that repeating them in multiple agent outputs would be wasteful
- The topic is high-risk and likely to attract bulky source-backed analysis

If artifact-backed mode is not clearly warranted, stay in direct mode.

Artifact-backed mode uses a per-run scratch directory under `.context/compound-engineering/ce-plan/deepen/`.
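
A minimal sketch of creating that per-run scratch directory. The timestamp-based run id is an assumption; any collision-free identifier works:

```shell
# Create a unique per-run scratch directory for artifact-backed mode.
# The run-id naming scheme here is illustrative, not prescribed.
SCRATCH_DIR=".context/compound-engineering/ce-plan/deepen/run-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$SCRATCH_DIR"
```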

##### 5.3.6 Run Targeted Research

Launch the selected agents in parallel using the execution mode chosen above. If the current platform does not support parallel dispatch, run them sequentially instead.

Prefer local repo and institutional evidence first. Use external research only when the gap cannot be closed responsibly from repo context or already-cited sources.

If a selected section can be improved by reading the origin document more carefully, do that before dispatching external agents.

**Direct mode:** Have each selected agent return its findings directly to the parent. Keep the return payload focused: strongest findings only, the evidence or sources that matter, the concrete planning improvement implied by the finding.

**Artifact-backed mode:** For each selected agent, instruct it to write one compact artifact file in the scratch directory and return only a short completion summary. Each artifact should contain: target section, why selected, 3-7 findings, source-backed rationale, the specific plan change implied by each finding. No implementation code, no shell commands.
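
One such artifact might be laid out like this. The filename and field labels are illustrative; only the listed contents are prescribed:

```text
# key-technical-decisions.md
Target section: Key Technical Decisions
Why selected: decision stated without rationale; no rejected alternatives
Findings:
1. <finding> — rationale: <source-backed evidence> — plan change: <specific edit>
2. <finding> — rationale: <source-backed evidence> — plan change: <specific edit>
```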

If an artifact is missing or clearly malformed, re-run that agent or fall back to direct-mode reasoning for that section.

If agent outputs conflict:
- Prefer repo-grounded and origin-grounded evidence over generic advice
- Prefer official framework documentation over secondary best-practice summaries when the conflict is about library behavior
- If a real tradeoff remains, record it explicitly in the plan

##### 5.3.6b Interactive Finding Review (Interactive Mode Only)

Skip this step in auto mode — proceed directly to 5.3.7.

In interactive mode, present each agent's findings to the user before integration. For each agent that returned findings:

1. **Summarize the agent and its target section** — e.g., "The architecture-strategist reviewed Key Technical Decisions and found:"
2. **Present the findings concisely** — bullet the key points, not the raw agent output. Include enough context for the user to evaluate: what the agent found, what evidence supports it, and what plan change it implies.
3. **Ask the user** using the platform's blocking question tool when available (see Interaction Method):
   - **Accept** — integrate these findings into the plan
   - **Reject** — discard these findings entirely
   - **Discuss** — the user wants to talk through the findings before deciding

If the user chooses "Discuss", engage in brief dialogue about the findings and then re-ask with only accept/reject (no discuss option on the second ask). The user makes a deliberate choice either way.

When presenting findings from multiple agents targeting the same section, present them one agent at a time so the user can make independent decisions. Do not merge findings from different agents before showing them.

After all agents have been reviewed, carry only the accepted findings forward to 5.3.7.

If the user accepted no findings, report "No findings accepted — plan unchanged." If artifact-backed mode was used, clean up the scratch directory before continuing. Then proceed directly to Phase 5.4 (skip document-review and synthesis — the plan was not modified). This interactive-mode-only skip does not apply in auto mode; auto mode always proceeds through 5.3.7 and 5.3.8.

If findings were accepted and the plan was modified, proceed through 5.3.7 and 5.3.8 as normal — document-review acts as a quality gate on the changes.
Strengthen only the selected sections. Keep the plan coherent and preserve its overall structure.
|
||||
|
||||
**In interactive mode:** Only integrate findings the user accepted in 5.3.6b. If some findings from different agents touch the same section, reconcile them coherently but do not reintroduce rejected findings.
|
||||
|
||||
Allowed changes:
|
||||
- Clarify or strengthen decision rationale
|
||||
- Tighten requirements trace or origin fidelity
|
||||
- Reorder or split implementation units when sequencing is weak
|
||||
- Add missing pattern references, file/test paths, or verification outcomes
|
||||
- Expand system-wide impact, risks, or rollout treatment where justified
|
||||
- Reclassify open questions between `Resolved During Planning` and `Deferred to Implementation` when evidence supports the change
|
||||
- Strengthen, replace, or add a High-Level Technical Design section when the work warrants it and the current representation is weak
|
||||
- Strengthen or add per-unit technical design fields where the unit's approach is non-obvious
|
||||
- Add or update `deepened: YYYY-MM-DD` in frontmatter when the plan was substantively improved
|
||||
|
||||
Do **not**:
|
||||
- Add implementation code — no imports, exact method signatures, or framework-specific syntax. Pseudo-code sketches and DSL grammars are allowed
|
||||
- Add git commands, commit choreography, or exact test command recipes
|
||||
- Add generic `Research Insights` subsections everywhere
|
||||
- Rewrite the entire plan from scratch
|
||||
- Invent new product requirements, scope changes, or success criteria without surfacing them explicitly
|
||||
|
||||
If research reveals a product-level ambiguity that should change behavior or scope:
|
||||
- Do not silently decide it here
|
||||
- Record it under `Open Questions`
|
||||
- Recommend `ce:brainstorm` if the gap is truly product-defining
|
||||
|
||||
##### 5.3.8 Document Review
|
||||
|
||||
After the confidence check (and any deepening), run the `document-review` skill on the plan file. Pass the plan path as the argument. When this step is reached, it is mandatory — do not skip it because the confidence check already ran. The two tools catch different classes of issues.
|
||||
|
||||
The confidence check and document-review are complementary:
|
||||
- The confidence check strengthens rationale, sequencing, risk treatment, and grounding
|
||||
- Document-review checks coherence, feasibility, scope alignment, and surfaces role-specific issues
|
||||
|
||||
If document-review returns findings that were auto-applied, note them briefly when presenting handoff options. If residual P0/P1 findings were surfaced, mention them so the user can decide whether to address them before proceeding.
|
||||
|
||||
When document-review returns "Review complete", proceed to Final Checks.
|
||||
|
||||
**Pipeline mode:** If invoked from an automated workflow such as LFG, SLFG, or any `disable-model-invocation` context, run `document-review` with `mode:headless` and the plan path. Headless mode applies auto-fixes silently and returns structured findings without interactive prompts. Address any P0/P1 findings before returning control to the caller.
|
||||
|
||||

##### 5.3.9 Final Checks and Cleanup

Before proceeding to post-generation options:
- Confirm the plan is stronger in specific ways, not merely longer
- Confirm the planning boundary is intact
- Confirm origin decisions were preserved when an origin document exists

If artifact-backed mode was used:
- Clean up the temporary scratch directory after the plan is safely updated
- If cleanup is not practical on the current platform, note where the artifacts were left

#### 5.4 Post-Generation Options

**Pipeline mode:** If invoked from an automated workflow such as LFG, SLFG, or any `disable-model-invocation` context, skip the interactive menu below and return control to the caller immediately. The plan file has already been written, the confidence check has already run, and document-review has already run — the caller (e.g., lfg, slfg) determines the next step.

After document-review completes, present the options using the platform's blocking question tool when available (see Interaction Method). Otherwise present numbered options in chat and wait for the user's reply before proceeding.

**Question:** "Plan ready at `docs/plans/YYYY-MM-DD-NNN-<type>-<name>-plan.md`. What would you like to do next?"

**Options:**
1. **Start `/ce:work`** - Begin implementing this plan in the current environment (recommended)
2. **Open plan in editor** - Open the plan file for review
3. **Run additional document review** - Another pass for further refinement
4. **Share to Proof** - Upload the plan for collaborative review and sharing
5. **Start `/ce:work` in another session** - Begin implementing in a separate agent session when the current platform supports it
6. **Create Issue** - Create an issue in the configured tracker

Based on selection:
- **Open plan in editor** → Open `docs/plans/<plan_filename>.md` using the current platform's file-open or editor mechanism (e.g., `open` on macOS, `xdg-open` on Linux, or the IDE's file-open API)
- **Run additional document review** → Load the `document-review` skill with the plan path for another pass
- **Share to Proof** → Upload the plan:

  ```bash
  CONTENT=$(cat docs/plans/<plan_filename>.md)
  TITLE="Plan: <plan title from frontmatter>"
  RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
  # `// empty` guards against a missing tokenUrl so failure leaves PROOF_URL blank
  PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl // empty')
  ```

  Display `View & collaborate in Proof: <PROOF_URL>` if successful, then return to the options
- **`/ce:work`** → Call `/ce:work` with the plan path
- **`/ce:work` in another session** → If the current platform supports launching a separate agent session, start `/ce:work` with the plan path there. Otherwise, explain the limitation briefly and offer to run `/ce:work` in the current session instead.
- **Create Issue** → Follow the Issue Creation section below
- **Other** → Accept free text for revisions and loop back to options

## Issue Creation

When the user selects "Create Issue", detect their project tracker from `AGENTS.md` or, if needed for compatibility, `CLAUDE.md`:

1. Look for `project_tracker: github` or `project_tracker: linear`
2. If GitHub:

   ```bash
   gh issue create --title "<type>: <title>" --body-file <plan_path>
   ```

3. If Linear:

   ```bash
   linear issue create --title "<title>" --description "$(cat <plan_path>)"
   ```

4. If no tracker is configured:
   - Ask which tracker they use using the platform's blocking question tool when available (see Interaction Method)
   - Suggest adding the tracker to `AGENTS.md` for future runs
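
The detection in step 1 can be sketched as a small helper. The `project_tracker: <name>` line format comes from step 1; the lookup order and everything else here is an illustrative assumption:

```shell
# Return the configured tracker name, checking AGENTS.md first and
# falling back to CLAUDE.md; print "none" when neither declares one.
detect_tracker() {
  for f in AGENTS.md CLAUDE.md; do
    [ -f "$f" ] || continue
    tracker=$(grep -m1 '^project_tracker:' "$f" | awk '{print $2}')
    if [ -n "$tracker" ]; then
      printf '%s\n' "$tracker"
      return 0
    fi
  done
  printf 'none\n'
}
```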

After issue creation:
- Display the issue URL
- Ask whether to proceed to `/ce:work`

When reaching this phase, read `references/plan-handoff.md` for document review instructions (5.3.8), final checks and cleanup (5.3.9), post-generation options menu (5.4), and issue creation. Do not load this file earlier. Document review is mandatory — do not skip it even if the confidence check already ran.

NEVER CODE! Research, decide, and write the plan.
@@ -0,0 +1,238 @@

# Deepening Workflow

This file contains the confidence-check execution path (5.3.3-5.3.7). Load it only when the deepening gate at 5.3.2 determines that deepening is warranted.

## 5.3.3 Score Confidence Gaps

Use a checklist-first, risk-weighted scoring pass.

For each section, compute:
- **Trigger count** - number of checklist problems that apply
- **Risk bonus** - add 1 if the topic is high-risk and this section is materially relevant to that risk
- **Critical-section bonus** - add 1 for `Key Technical Decisions`, `Implementation Units`, `System-Wide Impact`, `Risks & Dependencies`, or `Open Questions` in `Standard` or `Deep` plans

Treat a section as a candidate if:
- it hits **2+ total points**, or
- it hits **1+ point** in a high-risk domain and the section is materially important

Choose only the top **2-5** sections by score. If deepening a lightweight plan (high-risk exception), cap at **1-2** sections.
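
A worked example of this scoring, with invented trigger counts for a hypothetical Standard high-risk plan:

```text
Key Technical Decisions: 2 triggers + 1 risk + 1 critical-section = 4 → candidate
System-Wide Impact:      1 trigger  + 1 risk + 1 critical-section = 3 → candidate
Requirements Trace:      1 trigger  + 0      + 0                  = 1 → not selected
```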
|
||||
|
||||
If the plan already has a `deepened:` date:
|
||||
- Prefer sections that have not yet been substantially strengthened, if their scores are comparable
|
||||
- Revisit an already-deepened section only when it still scores clearly higher than alternatives
|
||||
|
||||
**Section Checklists:**

**Requirements Trace**

- Requirements are vague or disconnected from implementation units
- Success criteria are missing or not reflected downstream
- Units do not clearly advance the traced requirements
- Origin requirements are not clearly carried forward

**Context & Research / Sources & References**

- Relevant repo patterns are named but never used in decisions or implementation units
- Cited learnings or references do not materially shape the plan
- High-risk work lacks appropriate external or internal grounding
- Research is generic instead of tied to this repo or this plan

**Key Technical Decisions**

- A decision is stated without rationale
- Rationale does not explain tradeoffs or rejected alternatives
- The decision does not connect back to scope, requirements, or origin context
- An obvious design fork exists but the plan never addresses why one path won

**Open Questions**

- Product blockers are hidden as assumptions
- Planning-owned questions are incorrectly deferred to implementation
- Resolved questions have no clear basis in repo context, research, or origin decisions
- Deferred items are too vague to be useful later

**High-Level Technical Design (when present)**

- The sketch uses the wrong medium for the work
- The sketch contains implementation code rather than pseudo-code
- The non-prescriptive framing is missing or weak
- The sketch does not connect to the key technical decisions or implementation units

**High-Level Technical Design (when absent)** *(Standard or Deep plans only)*

- The work involves DSL design, API surface design, multi-component integration, complex data flow, or state-heavy lifecycle
- Key technical decisions would be easier to validate with a visual or pseudo-code representation
- The approach section of implementation units is thin and a higher-level technical design would provide context

**Implementation Units**

- Dependency order is unclear or likely wrong
- File paths or test file paths are missing where they should be explicit
- Units are too large, too vague, or broken into micro-steps
- Approach notes are thin or do not name the pattern to follow
- Test scenarios are vague (don't name inputs and expected outcomes), skip applicable categories (e.g., no error paths for a unit with failure modes, no integration scenarios for a unit crossing layers), or are disproportionate to the unit's complexity
- Feature-bearing units have blank or missing test scenarios (feature-bearing units require actual test scenarios; the `Test expectation: none` annotation is only valid for non-feature-bearing units)
- Verification outcomes are vague or not expressed as observable results

**System-Wide Impact**

- Affected interfaces, callbacks, middleware, entry points, or parity surfaces are missing
- Failure propagation is underexplored
- State lifecycle, caching, or data integrity risks are absent where relevant
- Integration coverage is weak for cross-layer work

**Risks & Dependencies / Documentation / Operational Notes**

- Risks are listed without mitigation
- Rollout, monitoring, migration, or support implications are missing when warranted
- External dependency assumptions are weak or unstated
- Security, privacy, performance, or data risks are absent where they obviously apply

Use the plan's own `Context & Research` and `Sources & References` as evidence. If those sections cite a pattern, learning, or risk that never affects decisions, implementation units, or verification, treat that as a confidence gap.

## 5.3.4 Report and Dispatch Targeted Research

Before dispatching agents, report what sections are being strengthened and why:

```text
Strengthening [section names] — [brief reason for each, e.g., "decision rationale is thin", "cross-boundary effects aren't mapped"]
```

For each selected section, choose the smallest useful agent set. Do **not** run every agent. Use at most **1-3 agents per section** and usually no more than **8 agents total**.

Use fully-qualified agent names inside Task calls.

**Deterministic Section-to-Agent Mapping:**

**Requirements Trace / Open Questions classification**

- `compound-engineering:workflow:spec-flow-analyzer` for missing user flows, edge cases, and handoff gaps
- `compound-engineering:research:repo-research-analyst` (Scope: `architecture, patterns`) for repo-grounded patterns, conventions, and implementation reality checks

**Context & Research / Sources & References gaps**

- `compound-engineering:research:learnings-researcher` for institutional knowledge and past solved problems
- `compound-engineering:research:framework-docs-researcher` for official framework or library behavior
- `compound-engineering:research:best-practices-researcher` for current external patterns and industry guidance
- Add `compound-engineering:research:git-history-analyzer` only when historical rationale or prior art is materially missing

**Key Technical Decisions**

- `compound-engineering:review:architecture-strategist` for design integrity, boundaries, and architectural tradeoffs
- Add `compound-engineering:research:framework-docs-researcher` or `compound-engineering:research:best-practices-researcher` when the decision needs external grounding beyond repo evidence

**High-Level Technical Design**

- `compound-engineering:review:architecture-strategist` for validating that the technical design accurately represents the intended approach and identifying gaps
- `compound-engineering:research:repo-research-analyst` (Scope: `architecture, patterns`) for grounding the technical design in existing repo patterns and conventions
- Add `compound-engineering:research:best-practices-researcher` when the technical design involves a DSL, API surface, or pattern that benefits from external validation

**Implementation Units / Verification**

- `compound-engineering:research:repo-research-analyst` (Scope: `patterns`) for concrete file targets, patterns to follow, and repo-specific sequencing clues
- `compound-engineering:review:pattern-recognition-specialist` for consistency, duplication risks, and alignment with existing patterns
- Add `compound-engineering:workflow:spec-flow-analyzer` when sequencing depends on user flow or handoff completeness

**System-Wide Impact**

- `compound-engineering:review:architecture-strategist` for cross-boundary effects, interface surfaces, and architectural knock-on impact
- Add the specific specialist that matches the risk:
  - `compound-engineering:review:performance-oracle` for scalability, latency, throughput, and resource-risk analysis
  - `compound-engineering:review:security-sentinel` for auth, validation, exploit surfaces, and security boundary review
  - `compound-engineering:review:data-integrity-guardian` for migrations, persistent state safety, consistency, and data lifecycle risks

**Risks & Dependencies / Operational Notes**

- Use the specialist that matches the actual risk:
  - `compound-engineering:review:security-sentinel` for security, auth, privacy, and exploit risk
  - `compound-engineering:review:data-integrity-guardian` for persistent data safety, constraints, and transaction boundaries
  - `compound-engineering:review:data-migration-expert` for migration realism, backfills, and production data transformation risk
  - `compound-engineering:review:deployment-verification-agent` for rollout checklists, rollback planning, and launch verification
  - `compound-engineering:review:performance-oracle` for capacity, latency, and scaling concerns

**Agent Prompt Shape:**

For each selected section, pass:

- The scope prefix from the mapping above when the agent supports scoped invocation
- A short plan summary
- The exact section text
- Why the section was selected, including which checklist triggers fired
- The plan depth and risk profile
- A specific question to answer

Instruct the agent to return:

- findings that change planning quality
- stronger rationale, sequencing, verification, risk treatment, or references
- no implementation code
- no shell commands

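Taken together, a dispatched prompt might follow a skeleton like the one below. The agent, scores, section, and wording are all illustrative, not a required format:

```text
Agent: compound-engineering:review:architecture-strategist
Scope: architecture

Plan summary: <2-3 sentence summary of the plan>
Target section: Key Technical Decisions
Section text: <exact text of the section>
Why selected: decision stated without rationale; obvious design fork never
  addressed (2 checklist triggers fired)
Plan depth / risk: Standard, high-risk (data migration)
Question: Which candidate approach better fits the repo's existing module
  boundaries, and why?

Return: findings that change planning quality; stronger rationale or
references; no implementation code; no shell commands.
```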
## 5.3.5 Choose Research Execution Mode

Use the lightest mode that will work:

- **Direct mode** - Default. Use when the selected section set is small and the parent can safely read the agent outputs inline.
- **Artifact-backed mode** - Use only when the selected research scope is large enough that inline returns would create unnecessary context pressure.

Signals that justify artifact-backed mode:

- More than 5 agents are likely to return meaningful findings
- The selected section excerpts are long enough that repeating them in multiple agent outputs would be wasteful
- The topic is high-risk and likely to attract bulky source-backed analysis

If artifact-backed mode is not clearly warranted, stay in direct mode.

Artifact-backed mode uses a per-run scratch directory under `.context/compound-engineering/ce-plan/deepen/`.

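Creating that per-run scratch directory can be sketched as below, assuming a POSIX shell. The run-id naming scheme (timestamp plus PID) is illustrative, not prescribed:

```shell
# Create a per-run scratch directory for artifact-backed mode.
# The run-id scheme here is illustrative only.
run_id="$(date +%Y%m%d-%H%M%S)-$$"
scratch_dir=".context/compound-engineering/ce-plan/deepen/${run_id}"
mkdir -p "$scratch_dir"
echo "$scratch_dir"
```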
## 5.3.6 Run Targeted Research

Launch the selected agents in parallel using the execution mode chosen above. If the current platform does not support parallel dispatch, run them sequentially instead.

Prefer local repo and institutional evidence first. Use external research only when the gap cannot be closed responsibly from repo context or already-cited sources.

If a selected section can be improved by reading the origin document more carefully, do that before dispatching external agents.

**Direct mode:** Have each selected agent return its findings directly to the parent. Keep the return payload focused: strongest findings only, the evidence or sources that matter, the concrete planning improvement implied by the finding.

**Artifact-backed mode:** For each selected agent, instruct it to write one compact artifact file in the scratch directory and return only a short completion summary. Each artifact should contain: target section, why selected, 3-7 findings, source-backed rationale, the specific plan change implied by each finding. No implementation code, no shell commands.

If an artifact is missing or clearly malformed, re-run that agent or fall back to direct-mode reasoning for that section.

If agent outputs conflict:

- Prefer repo-grounded and origin-grounded evidence over generic advice
- Prefer official framework documentation over secondary best-practice summaries when the conflict is about library behavior
- If a real tradeoff remains, record it explicitly in the plan

## 5.3.6b Interactive Finding Review (Interactive Mode Only)

Skip this step in auto mode — proceed directly to 5.3.7.

In interactive mode, present each agent's findings to the user before integration. For each agent that returned findings:

1. **Summarize the agent and its target section** — e.g., "The architecture-strategist reviewed Key Technical Decisions and found:"
2. **Present the findings concisely** — bullet the key points, not the raw agent output. Include enough context for the user to evaluate: what the agent found, what evidence supports it, and what plan change it implies.
3. **Ask the user** using the platform's blocking question tool when available (see Interaction Method):
   - **Accept** — integrate these findings into the plan
   - **Reject** — discard these findings entirely
   - **Discuss** — the user wants to talk through the findings before deciding

If the user chooses "Discuss", engage in brief dialogue about the findings and then re-ask with only accept/reject (no discuss option on the second ask). The user makes a deliberate choice either way.

When presenting findings from multiple agents targeting the same section, present them one agent at a time so the user can make independent decisions. Do not merge findings from different agents before showing them.

After all agents have been reviewed, carry only the accepted findings forward to 5.3.7.

If the user accepted no findings, report "No findings accepted — plan unchanged." If artifact-backed mode was used, clean up the scratch directory before continuing. Then proceed directly to Phase 5.4 (skip document-review and synthesis — the plan was not modified). This interactive-mode-only skip does not apply in auto mode; auto mode always proceeds through 5.3.7 and 5.3.8.

If findings were accepted and the plan was modified, proceed through 5.3.7 and 5.3.8 as normal — document-review acts as a quality gate on the changes.

## 5.3.7 Synthesize and Update the Plan

Strengthen only the selected sections. Keep the plan coherent and preserve its overall structure.

**In interactive mode:** Only integrate findings the user accepted in 5.3.6b. If findings from different agents touch the same section, reconcile them coherently, but do not reintroduce rejected findings.

Allowed changes:

- Clarify or strengthen decision rationale
- Tighten requirements trace or origin fidelity
- Reorder or split implementation units when sequencing is weak
- Add missing pattern references, file/test paths, or verification outcomes
- Expand system-wide impact, risks, or rollout treatment where justified
- Reclassify open questions between `Resolved During Planning` and `Deferred to Implementation` when evidence supports the change
- Strengthen, replace, or add a High-Level Technical Design section when the work warrants it and the current representation is weak
- Strengthen or add per-unit technical design fields where the unit's approach is non-obvious
- Add or update `deepened: YYYY-MM-DD` in frontmatter when the plan was substantively improved

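The `deepened:` frontmatter update can be sketched as below. The temp file stands in for a real plan, and the awk approach is one possible mechanism, not a prescribed one; it assumes frontmatter delimited by `---` lines:

```shell
# Set or update `deepened: YYYY-MM-DD` in YAML frontmatter.
# A temp file stands in for the real plan here.
plan="$(mktemp)"
printf -- '---\ntitle: Example plan\n---\n\nBody text\n' > "$plan"
today="$(date +%Y-%m-%d)"

if grep -q '^deepened:' "$plan"; then
  # Refresh the existing date
  sed "s/^deepened:.*/deepened: ${today}/" "$plan" > "$plan.tmp" && mv "$plan.tmp" "$plan"
else
  # Insert the field right after the opening --- delimiter
  awk -v d="$today" 'NR == 1 && $0 == "---" { print; print "deepened: " d; next } { print }' \
    "$plan" > "$plan.tmp" && mv "$plan.tmp" "$plan"
fi

grep '^deepened:' "$plan"
```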
Do **not**:

- Add implementation code — no imports, exact method signatures, or framework-specific syntax. Pseudo-code sketches and DSL grammars are allowed
- Add git commands, commit choreography, or exact test command recipes
- Add generic `Research Insights` subsections everywhere
- Rewrite the entire plan from scratch
- Invent new product requirements, scope changes, or success criteria without surfacing them explicitly

If research reveals a product-level ambiguity that should change behavior or scope:

- Do not silently decide it here
- Record it under `Open Questions`
- Recommend `ce:brainstorm` if the gap is truly product-defining

@@ -0,0 +1,87 @@

# Plan Handoff

This file contains post-plan-writing instructions: document review, post-generation options, and issue creation. Load it after the plan file has been written and the confidence check (5.3.1-5.3.7) is complete.

## 5.3.8 Document Review

After the confidence check (and any deepening), run the `document-review` skill on the plan file. Pass the plan path as the argument. When this step is reached, it is mandatory — do not skip it because the confidence check already ran. The two tools catch different classes of issues.

The confidence check and document-review are complementary:

- The confidence check strengthens rationale, sequencing, risk treatment, and grounding
- Document-review checks coherence, feasibility, and scope alignment, and surfaces role-specific issues

If document-review returns findings that were auto-applied, note them briefly when presenting handoff options. If residual P0/P1 findings were surfaced, mention them so the user can decide whether to address them before proceeding.

When document-review returns "Review complete", proceed to Final Checks.

**Pipeline mode:** If invoked from an automated workflow such as LFG, SLFG, or any `disable-model-invocation` context, run `document-review` with `mode:headless` and the plan path. Headless mode applies auto-fixes silently and returns structured findings without interactive prompts. Address any P0/P1 findings before returning control to the caller.

## 5.3.9 Final Checks and Cleanup

Before proceeding to post-generation options:

- Confirm the plan is stronger in specific ways, not merely longer
- Confirm the planning boundary is intact
- Confirm origin decisions were preserved when an origin document exists

If artifact-backed mode was used:

- Clean up the temporary scratch directory after the plan is safely updated
- If cleanup is not practical on the current platform, note where the artifacts were left

## 5.4 Post-Generation Options

**Pipeline mode:** If invoked from an automated workflow such as LFG, SLFG, or any `disable-model-invocation` context, skip the interactive menu below and return control to the caller immediately. The plan file has already been written, the confidence check has already run, and document-review has already run — the caller (e.g., lfg, slfg) determines the next step.

After document-review completes, present the options using the platform's blocking question tool when available (see Interaction Method). Otherwise present numbered options in chat and wait for the user's reply before proceeding.

**Question:** "Plan ready at `docs/plans/YYYY-MM-DD-NNN-<type>-<name>-plan.md`. What would you like to do next?"

**Options:**

1. **Start `/ce:work`** - Begin implementing this plan in the current environment (recommended)
2. **Open plan in editor** - Open the plan file for review
3. **Run additional document review** - Another pass for further refinement
4. **Share to Proof** - Upload the plan for collaborative review and sharing
5. **Start `/ce:work` in another session** - Begin implementing in a separate agent session when the current platform supports it
6. **Create Issue** - Create an issue in the configured tracker

Based on selection:

- **Open plan in editor** -> Open `docs/plans/<plan_filename>.md` using the current platform's file-open or editor mechanism (e.g., `open` on macOS, `xdg-open` on Linux, or the IDE's file-open API)
- **Run additional document review** -> Load the `document-review` skill with the plan path for another pass
- **Share to Proof** -> Upload the plan:

  ```bash
  CONTENT=$(cat docs/plans/<plan_filename>.md)
  TITLE="Plan: <plan title from frontmatter>"
  RESPONSE=$(curl -s -X POST https://www.proofeditor.ai/share/markdown \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg title "$TITLE" --arg markdown "$CONTENT" --arg by "ai:compound" '{title: $title, markdown: $markdown, by: $by}')")
  PROOF_URL=$(echo "$RESPONSE" | jq -r '.tokenUrl')
  ```

  Display `View & collaborate in Proof: <PROOF_URL>` if successful, then return to the options
- **`/ce:work`** -> Call `/ce:work` with the plan path
- **`/ce:work` in another session** -> If the current platform supports launching a separate agent session, start `/ce:work` with the plan path there. Otherwise, explain the limitation briefly and offer to run `/ce:work` in the current session instead.
- **Create Issue** -> Follow the Issue Creation section below
- **Other** -> Accept free text for revisions and loop back to options

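The editor-opening fallback chain in the first bullet can be sketched as follows. The plan path is illustrative, and the sketch only selects an opener rather than launching it:

```shell
# Pick a platform-appropriate opener for the plan file without launching it.
plan="docs/plans/example-plan.md"   # illustrative path
if command -v open >/dev/null 2>&1; then
  opener="open"       # macOS
elif command -v xdg-open >/dev/null 2>&1; then
  opener="xdg-open"   # most Linux desktops
else
  opener=""           # fall back to the IDE's file-open API
fi
echo "opener=${opener:-ide-api} plan=$plan"
```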
## Issue Creation

When the user selects "Create Issue", detect their project tracker from `AGENTS.md` or, if needed for compatibility, `CLAUDE.md`:

1. Look for `project_tracker: github` or `project_tracker: linear`
2. If GitHub:

   ```bash
   gh issue create --title "<type>: <title>" --body-file <plan_path>
   ```

3. If Linear:

   ```bash
   linear issue create --title "<title>" --description "$(cat <plan_path>)"
   ```

4. If no tracker is configured:
   - Ask which tracker they use, via the platform's blocking question tool when available (see Interaction Method)
   - Suggest adding the tracker to `AGENTS.md` for future runs

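Step 1's lookup can be sketched like this; the sample `AGENTS.md` is created in a temp directory purely for illustration:

```shell
# Detect the configured tracker from AGENTS.md, falling back to CLAUDE.md.
workdir="$(mktemp -d)"
printf 'project_tracker: github\n' > "$workdir/AGENTS.md"   # sample config

tracker=""
for f in "$workdir/AGENTS.md" "$workdir/CLAUDE.md"; do
  if [ -f "$f" ]; then
    tracker="$(sed -n 's/^project_tracker:[[:space:]]*//p' "$f" | head -n 1)"
    [ -n "$tracker" ] && break
  fi
done
echo "tracker=${tracker:-not configured}"   # prints "tracker=github"
```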
After issue creation:

- Display the issue URL
- Ask whether to proceed to `/ce:work`

@@ -0,0 +1,31 @@

# Visual Communication in Plan Documents

Section 3.4 covers diagrams about the *solution being planned* (pseudo-code, mermaid sequences, state diagrams). The existing Section 4.3 mermaid rule encourages those solution-design diagrams within Technical Design and per-unit fields. This guidance covers a different concern: visual aids that help readers *navigate and comprehend the plan document itself* -- dependency graphs, interaction diagrams, and comparison tables that make plan structure scannable.

Visual aids are conditional on content patterns, not on plan depth classification -- a Lightweight plan about a complex multi-unit workflow may warrant a dependency graph; a Deep plan about a straightforward feature may not.

**When to include:**

| Plan describes... | Visual aid | Placement |
|---|---|---|
| 4+ implementation units with non-linear dependencies (parallelism, diamonds, fan-in/fan-out) | Mermaid dependency graph | Before or after the Implementation Units heading |
| System-Wide Impact naming 3+ interacting surfaces or cross-layer effects | Mermaid interaction or component diagram | Within the System-Wide Impact section |
| Problem/Overview involving 3+ behavioral modes, states, or variants | Markdown comparison table | Within Overview or Problem Frame |
| Key Technical Decisions with 3+ interacting decisions, or Alternative Approaches with 3+ alternatives | Markdown comparison table | Within the relevant section |

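For the first row's case, a minimal dependency graph might look like the following sketch (unit names are illustrative); the diamond shape is what makes the graph worth drawing:

```mermaid
flowchart TB
    U1["Unit 1: schema change"] --> U2["Unit 2: backfill script"]
    U1 --> U3["Unit 3: read-path API"]
    U2 --> U4["Unit 4: cutover and cleanup"]
    U3 --> U4
```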
**When to skip:**

- The plan has 3 or fewer units in a straight dependency chain -- the Dependencies field on each unit is sufficient
- Prose already communicates the relationships clearly
- The visual would duplicate what the High-Level Technical Design section already shows
- The visual describes code-level detail (specific method names, SQL columns, API field lists)

**Format selection:**

- **Mermaid** (default) for dependency graphs and interaction diagrams -- 5-15 nodes, no in-box annotations, standard flowchart shapes. Use `TB` (top-to-bottom) direction so diagrams stay narrow in both rendered and source form. Source should be readable as a fallback in diff views and terminals.
- **ASCII/box-drawing diagrams** for annotated flows that need rich in-box content -- file path layouts, decision logic branches, multi-column spatial arrangements. More expressive than mermaid when the diagram's value comes from annotations within nodes. Follow an 80-column max for code blocks and use vertical stacking.
- **Markdown tables** for mode/variant comparisons and decision/approach comparisons.
- Keep diagrams proportionate to the plan. A 6-unit linear chain gets a simple 6-node graph. A complex dependency graph with fan-out and fan-in may need 10-15 nodes -- that is fine if every node earns its place.
- Place visuals inline at the point of relevance, not in a separate section.
- Stay at the plan-structure level only -- unit dependencies, component interactions, mode comparisons, impact surfaces. Not implementation architecture, data schemas, or code structure (those belong in Section 3.4).
- Prose is authoritative: when a visual aid and its surrounding prose disagree, the prose governs.

After generating a visual aid, verify it accurately represents the plan sections it illustrates -- correct dependency edges, no missing surfaces, no merged units.

@@ -118,10 +118,11 @@ describe("ce:plan testing contract", () => {
 
 describe("ce:plan review contract", () => {
   test("requires document review after confidence check", async () => {
-    const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md")
+    // Document review instructions extracted to references/plan-handoff.md
+    const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md")
 
     // Phase 5.3.8 runs document-review before final checks (5.3.9)
-    expect(content).toContain("##### 5.3.8 Document Review")
+    expect(content).toContain("## 5.3.8 Document Review")
     expect(content).toContain("`document-review` skill")
 
     // Document review must come before final checks so auto-applied edits are validated
@@ -130,16 +131,24 @@ describe("ce:plan review contract", () => {
     expect(docReviewIdx).toBeLessThan(finalChecksIdx)
   })
 
-  test("uses headless mode in pipeline context", async () => {
+  test("SKILL.md stub points to plan-handoff reference", async () => {
     const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md")
 
+    // Stub references the handoff file and marks document review as mandatory
+    expect(content).toContain("`references/plan-handoff.md`")
+    expect(content).toContain("Document review is mandatory")
+  })
+
+  test("uses headless mode in pipeline context", async () => {
+    const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md")
+
     // Pipeline mode runs document-review headlessly, not skipping it
     expect(content).toContain("document-review` with `mode:headless`")
     expect(content).not.toContain("skip document-review and return control")
   })
 
   test("handoff options recommend ce:work after review", async () => {
-    const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md")
+    const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md")
 
     // ce:work is recommended (review already happened)
     expect(content).toContain("**Start `/ce:work`** - Begin implementing this plan in the current environment (recommended)")