refactor(ce-brainstorm): make doc review opt-in in Phase 4 handoff (#633)

2026-04-21 16:28:20 -07:00
parent 44ce9dd127
commit ff0eee391e
4 changed files with 63 additions and 42 deletions
--- a/plugins/compound-engineering/AGENTS.md
+++ b/plugins/compound-engineering/AGENTS.md
@@ -124,6 +124,7 @@ Keep rationale at the highest-level location that covers it; restate behavioral
 - [ ] When a skill needs to ask the user a question, instruct use of the platform's blocking question tool and name the known equivalents (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini)
 - [ ] For Claude Code, also instruct to load `AskUserQuestion` via `ToolSearch` with `select:AskUserQuestion` first if its schema isn't already loaded — `AskUserQuestion` is a deferred tool and won't be available at session start. A pending schema load is not a valid reason to fall back to text.
 - [ ] Include a fallback: when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes where `request_user_input` is unavailable, or `ToolSearch` returns no match), present numbered options in chat and wait for the user's reply — never silently skip the question.
+- [ ] **Narrow exception for legitimate option overflow:** when a menu has 5 or more genuinely relevant options — each a distinct destination or workflow, none removable without losing real user choice — render as a numbered list in chat rather than trimming to fit the 4-option cap. This is used with restraint, not as a convenience escape from the blocking tool. Default remains the blocking tool. Before invoking the exception, verify that (a) no option can be cut, (b) no two options can be merged, and (c) no option is better surfaced as contextual prose (e.g., a nudge adjacent to the menu). If any of those reductions work, prefer them over the fallback. When the exception applies, include a hint that free-form input is accepted (e.g., "Pick a number or describe what you want.") so the numbered list retains the blocking tool's open-endedness.
 > **Platform-behavior note (April 2026, may change):** The specifics above reflect current behavior — `AskUserQuestion` is deferred in Claude Code, and `request_user_input` in Codex is exposed only in Plan mode. If Anthropic changes `AskUserQuestion` to a non-deferred tool, or Codex exposes `request_user_input` in edit modes, revisit this guidance rather than carrying the workaround forward indefinitely. Verify before assuming these constraints still hold.
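Reviewer note: the checklist above amounts to a small decision rule. The sketch below is one way to read it, not code from this repository. `QuestionContext`, `chooseRenderMode`, and `renderNumberedList` are hypothetical names; only the 4-option cap, the fallback conditions, and the free-form hint text come from the checklist. Appending the hint to every numbered list (including the plain fallback case) is an assumption.

```typescript
// Hypothetical sketch of the question-rendering rule described in the checklist.
type RenderMode = "blocking-tool" | "numbered-list";

interface QuestionContext {
  // Number of genuinely relevant options after cutting, merging, and moving nudges to prose.
  optionCount: number;
  // False when the harness has no blocking tool or the tool call errored.
  blockingToolAvailable: boolean;
}

// The blocking tool's option cap, as stated in the checklist.
const BLOCKING_TOOL_OPTION_CAP = 4;

function chooseRenderMode(ctx: QuestionContext): RenderMode {
  // Fallback: no usable blocking tool means numbered options in chat.
  if (!ctx.blockingToolAvailable) return "numbered-list";
  // Narrow overflow exception: more options than the tool can show.
  if (ctx.optionCount > BLOCKING_TOOL_OPTION_CAP) return "numbered-list";
  // Default remains the blocking tool.
  return "blocking-tool";
}

function renderNumberedList(stem: string, labels: string[]): string {
  const lines = labels.map((label, i) => `${i + 1}. ${label}`);
  // Assumption: the free-form hint is appended to every numbered list.
  return [stem, ...lines, "Pick a number or describe what you want."].join("\n");
}
```

Either branch still asks the question; there is deliberately no path that skips it.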
--- a/plugins/compound-engineering/skills/ce-brainstorm/SKILL.md
+++ b/plugins/compound-engineering/skills/ce-brainstorm/SKILL.md
@@ -200,13 +200,6 @@ If relevant, call out whether the choice is:
 Write or update a requirements document only when the conversation produced durable decisions worth preserving. Read `references/requirements-capture.md` for the document template, formatting rules, visual aid guidance, and completeness checks.
 For **Lightweight** brainstorms, keep the document compact. Skip document creation when the user only needs brief alignment and no durable decisions need to be preserved.
-### Phase 3.5: Document Review
-When a requirements document was created or updated, run the `ce-doc-review` skill on it before presenting handoff options. Pass the document path as the argument.
-If document-review returns findings that were auto-applied, note them briefly when presenting handoff options. If residual P0/P1 findings were surfaced, mention them so the user can decide whether to address them before proceeding.
-When document-review returns "Review complete", proceed to Phase 4.
 ### Phase 4: Handoff
--- a/plugins/compound-engineering/skills/ce-brainstorm/references/handoff.md
+++ b/plugins/compound-engineering/skills/ce-brainstorm/references/handoff.md
@@ -1,69 +1,95 @@
 # Handoff
-This content is loaded when Phase 4 begins — after the requirements document is written and reviewed.
+This content is loaded when Phase 4 begins — after the requirements document is written.
 ---
 #### 4.1 Present Next-Step Options
-Present the options using the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini. Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question.
+The Phase 4 menu's visible option count varies by state: no requirements doc hides the review and Proof options, unresolved `Resolve Before Planning` hides `Plan implementation` and `Build it now`, a failing direct-to-work gate hides `Build it now`. Count the visible options for the current state and choose the rendering mode accordingly:
+- **4 or fewer visible:** use the platform's blocking question tool (`AskUserQuestion` in Claude Code — call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded; `request_user_input` in Codex; `ask_user` in Gemini). This is the default per the plugin AGENTS.md "Interactive Question Tool Design" section.
+- **5 or more visible:** render as a numbered list in chat. This is the narrow case-3 overflow exception documented in the same AGENTS.md section; trimming would hide legitimate choices (plan, review, Proof, build, refine, pause are all distinct destinations).
+Never silently skip the question.
 If `Resolve Before Planning` contains any items:
 - Ask the blocking questions now, one at a time, by default
 - If the user explicitly wants to proceed anyway, first convert each remaining item into an explicit decision, assumption, or `Deferred to Planning` question
 - If the user chooses to pause instead, present the handoff as paused or blocked rather than complete
-- Do not offer `Proceed to planning` or `Proceed directly to work` while `Resolve Before Planning` remains non-empty
+- Do not offer the `Plan implementation` or `Build it now` options while `Resolve Before Planning` remains non-empty
-**Question when no blocking questions remain:** "Brainstorm complete. What would you like to do next?"
-**Question when blocking questions remain and user wants to pause:** "Brainstorm paused. Planning is blocked until the remaining questions are resolved. What would you like to do next?"
-Present only the options that apply, keeping the total at 4 or fewer:
-- **Proceed to planning (Recommended)** - Move to `/ce-plan` for structured implementation planning. Shown only when `Resolve Before Planning` is empty.
-- **Proceed directly to work** - Skip planning and move to `/ce-work`; suited to lightweight, well-defined changes. Shown only when `Resolve Before Planning` is empty **and** scope is lightweight, success criteria are clear, scope boundaries are clear, and no meaningful technical or research questions remain (the "direct-to-work gate").
-- **Continue the brainstorm** - Answer more clarifying questions to tighten scope, edge cases, and preferences. Always shown.
-- **Open in Proof (web app) — review and comment to iterate with the agent** - Open the doc in Every's Proof editor, iterate with the agent via comments, or copy a link to share with others. Shown only when a requirements document exists **and** the direct-to-work gate is not satisfied (when both conditions collide, `Proceed directly to work` takes priority and Proof becomes reachable via free-form request).
-- **Done for now** - Pause; the requirements doc is saved and can be resumed later. Always shown.
-**Surface additional document review contextually, not as a menu fixture:** When the prior document-review pass surfaced residual P0/P1 findings that the user has not addressed, mention them adjacent to the menu and offer another review pass in prose (e.g., "Document review flagged 2 P1 findings you may want to address — want me to run another pass?"). Do not add it to the option list.
+In both preambles below, the "Pick a number or describe what you want." hint applies only in numbered-list mode. When using the blocking tool, omit that line and pass the remaining stem as the question.
+**Preamble when no blocking questions remain:**
+```
+Brainstorm complete.
+Requirements doc: <path/to/requirements-doc.md>  # omit line if no doc was created
+What would you like to do next? (Pick a number or describe what you want.)
+```
+**Preamble when blocking questions remain and user wants to pause:**
+```
+Brainstorm paused. Planning is blocked until the remaining questions are resolved.
+Requirements doc: <path/to/requirements-doc.md>  # omit line if no doc was created
+What would you like to do next? (Pick a number or describe what you want.)
+```
+Present only the options that apply. Renumber so visible options stay contiguous starting at 1.
+1. **Plan implementation with `ce-plan` (Recommended)** - Move to `ce-plan` for structured implementation planning. Shown only when `Resolve Before Planning` is empty.
+2. **Agent review of requirements doc with `ce-doc-review`** - Dispatch reviewer agents to check the doc for coherence, feasibility, scope, and other persona-specific issues; auto-apply safe fixes; route remaining findings interactively. Shown only when a requirements document exists.
+3. **Open in Proof — review and comment to iterate with the agent** - Open the doc in Every's Proof editor, iterate with the agent via comments, or copy a link to share with others. Shown only when a requirements document exists.
+4. **Build it now with `ce-work` (skip planning)** - Skip planning and move to `ce-work`; suited to lightweight, well-defined changes. Shown only when `Resolve Before Planning` is empty **and** scope is lightweight, success criteria are clear, scope boundaries are clear, and no meaningful technical or research questions remain (the "direct-to-work gate").
+5. **More clarifying questions to sharpen the doc** - Keep refining scope, edge cases, constraints, and preferences through further dialogue. Always shown.
+6. **Done for now** - Pause; the requirements doc is saved and can be resumed later. Always shown.
+**Post-review nudge (subsequent rounds only):** If the user has already run `ce-doc-review` this session and residual P0/P1 findings remain unaddressed, add a one-line prose nudge adjacent to the menu (e.g., "Document review flagged 2 P1 findings you may want to address — pick \"Agent review of requirements doc\" to run another pass."). Reference the option by label, not number: the menu renumbers when `Resolve Before Planning` hides `Plan implementation` and `Build it now`, so a hardcoded option number can point users at the wrong action. Do not add a separate menu option; reuse the existing agent-review option.
 #### 4.2 Handle the Selected Option
-**If user selects "Proceed to planning (Recommended)":**
-Immediately run `/ce-plan` in the current session. Pass the requirements document path when one exists; otherwise pass a concise summary of the finalized brainstorm decisions. Do not print the closing summary first.
-**If user selects "Proceed directly to work":**
-Immediately run `/ce-work` in the current session using the finalized brainstorm output as context. If a compact requirements document exists, pass its path. Do not print the closing summary first.
-**If user selects "Continue the brainstorm":** Return to Phase 1.3 (Collaborative Dialogue) and continue asking the user clarifying questions one at a time to further refine scope, edge cases, constraints, and preferences. Continue until the user is satisfied, then return to Phase 4. Do not show the closing summary yet.
-**If user selects "Open in Proof (web app) — review and comment to iterate with the agent":**
+Selections may be the literal option label (when the user types the label or a close paraphrase) or the option number. Match numbers against the currently-rendered (post-trim) list. Free-form input that doesn't match an option or describe an alternative action should be treated as clarification — ask a follow-up rather than guessing.
+**If user selects "Plan implementation with `ce-plan` (Recommended)":**
+Immediately load the `ce-plan` skill in the current session. Pass the requirements document path when one exists; otherwise pass a concise summary of the finalized brainstorm decisions. Do not print the closing summary first.
+**If user selects "Agent review of requirements doc with `ce-doc-review`":**
+Load the `ce-doc-review` skill, passing the requirements document path as the argument. When ce-doc-review returns "Review complete", return to the Phase 4 options and re-render the menu (the doc may have changed, so re-evaluate `Resolve Before Planning`, direct-to-work gate, and residual findings). If residual P0/P1 findings remain unaddressed, include the post-review nudge above the menu. Do not show the closing summary yet.
+**If user selects "Build it now with `ce-work` (skip planning)":**
+Immediately load the `ce-work` skill in the current session using the finalized brainstorm output as context. If a compact requirements document exists, pass its path. Do not print the closing summary first.
+**If user selects "More clarifying questions to sharpen the doc":** Return to Phase 1.3 (Collaborative Dialogue) and continue asking the user clarifying questions one at a time to further refine scope, edge cases, constraints, and preferences. Continue until the user is satisfied, then return to Phase 4. Do not show the closing summary yet.
+**If user selects "Open in Proof — review and comment to iterate with the agent":**
 Load the `ce-proof` skill in HITL-review mode with:
 - **source file:** `docs/brainstorms/YYYY-MM-DD-<topic>-requirements.md`
 - **doc title:** `Requirements: <topic title>`
 - **identity:** `ai:compound-engineering` / `Compound Engineering`
-- **recommended next step:** `/ce-plan` (shown in the ce-proof skill's final terminal output)
+- **recommended next step:** `ce-plan` (shown in the ce-proof skill's final terminal output)
 Follow `references/hitl-review.md` in the ce-proof skill. It uploads the doc, prompts the user for review in Proof's web UI, ingests each thread by reading it fresh and replying in-thread, applies agreed edits as tracked suggestions, and syncs the final markdown back to the source file atomically on proceed.
 When the ce-proof skill returns control:
 - `status: proceeded` with `localSynced: true` → the requirements doc on disk now reflects the review. Return to the Phase 4 options and re-render the menu (the doc may have changed substantially during review, so option eligibility can shift — re-evaluate `Resolve Before Planning`, direct-to-work gate, and residual ce-doc-review findings against the updated doc).
-- `status: proceeded` with `localSynced: false` → the reviewed version lives in Proof at `docUrl` but the local copy is stale. Offer to pull the Proof doc to `localPath` using the ce-proof skill's Pull workflow. Re-render the Phase 4 menu after the pull completes (or is declined). If the pull was declined, include a one-line note above the menu that `<localPath>` is stale vs. Proof — otherwise `Proceed to planning` / `Proceed directly to work` will silently read the pre-review copy.
+- `status: proceeded` with `localSynced: false` → the reviewed version lives in Proof at `docUrl` but the local copy is stale. Offer to pull the Proof doc to `localPath` using the ce-proof skill's Pull workflow. Re-render the Phase 4 menu after the pull completes (or is declined). If the pull was declined, include a one-line note above the menu that `<localPath>` is stale vs. Proof — otherwise `Plan implementation` / `Build it now` / `Agent review of requirements doc` will silently read the pre-review copy (ce-doc-review would analyze stale content, and planning or work would skip the user's Proof edits).
-- `status: done_for_now` → the doc on disk may be stale if the user edited in Proof before leaving. Offer to pull the Proof doc to `localPath` so the local requirements file stays in sync, then return to the Phase 4 options. If the pull was declined, include the stale-local note above the menu. `done_for_now` means the user stopped the HITL loop without syncing — it does not mean they ended the whole brainstorm; they may still want to proceed to planning or continue the brainstorm.
+- `status: done_for_now` → the doc on disk may be stale if the user edited in Proof before leaving. Offer to pull the Proof doc to `localPath` so the local requirements file stays in sync, then return to the Phase 4 options. If the pull was declined, include the stale-local note above the menu. `done_for_now` means the user stopped the HITL loop without syncing — it does not mean they ended the whole brainstorm; they may still want to plan implementation, run an agent review, or keep refining the doc.
 - `status: aborted` → fall back to the Phase 4 options without changes.
 If the initial upload fails (network error, Proof API down), retry once after a short wait. If it still fails, tell the user the upload didn't succeed and briefly explain why, then return to the Phase 4 options — don't leave them wondering why the option did nothing.
 **If the user asks to run another document review** (either from the contextual prompt when P0/P1 findings remain, or by free-form request):
 Load the `ce-doc-review` skill and apply it to the requirements document for another pass. When ce-doc-review returns "Review complete", return to the normal Phase 4 options and present only the options that still apply. Do not show the closing summary yet.
 **If user selects "Done for now":** Display the closing summary (see 4.3) and end the turn.
 #### 4.3 Closing Summary
@@ -81,7 +107,7 @@ Key decisions:
 - [Decision 1]
 - [Decision 2]
-Recommended next step: `/ce-plan`
+Recommended next step: `ce-plan`
 ```
 If the user pauses with `Resolve Before Planning` still populated, display:
@@ -95,5 +121,5 @@ Planning is blocked by:
 - [Blocking question 1]
 - [Blocking question 2]
-Resume with `/ce-brainstorm` when ready to resolve these before planning.
+Resume with `ce-brainstorm` when ready to resolve these before planning.
 ```
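Reviewer note: the new Phase 4 rules (state-dependent visibility, contiguous renumbering, and matching a typed number against the rendered list) can be sketched as below. This is an illustrative model under stated assumptions, not code from the plugin. `MenuState`, `OptionId`, `visibleOptions`, `renderMenu`, and `matchSelection` are hypothetical names, the labels are shortened forms of the menu labels in `handoff.md`, and the substring match is a deliberately crude stand-in for "close paraphrase".

```typescript
// Hypothetical model of the Phase 4 menu; names and shortened labels are illustrative.
interface MenuState {
  hasRequirementsDoc: boolean;
  resolveBeforePlanningEmpty: boolean;
  directToWorkGatePasses: boolean;
}

type OptionId = "plan" | "review" | "proof" | "build" | "refine" | "pause";

interface MenuOption {
  id: OptionId;
  label: string;
}

// Canonical order from handoff.md; labels abbreviated for the sketch.
const ALL_OPTIONS: MenuOption[] = [
  { id: "plan", label: "Plan implementation" },
  { id: "review", label: "Agent review of requirements doc" },
  { id: "proof", label: "Open in Proof" },
  { id: "build", label: "Build it now" },
  { id: "refine", label: "More clarifying questions" },
  { id: "pause", label: "Done for now" },
];

function isVisible(id: OptionId, s: MenuState): boolean {
  switch (id) {
    case "plan":
      return s.resolveBeforePlanningEmpty;
    case "review":
    case "proof":
      return s.hasRequirementsDoc;
    case "build":
      return s.resolveBeforePlanningEmpty && s.directToWorkGatePasses;
    case "refine":
    case "pause":
      return true;
  }
}

function visibleOptions(s: MenuState): MenuOption[] {
  return ALL_OPTIONS.filter((o) => isVisible(o.id, s));
}

// Renumber so visible options stay contiguous starting at 1.
function renderMenu(s: MenuState): string[] {
  return visibleOptions(s).map((o, i) => `${i + 1}. ${o.label}`);
}

// Returns null when the input matches nothing, signalling "ask a follow-up".
function matchSelection(input: string, s: MenuState): OptionId | null {
  const options = visibleOptions(s);
  const trimmed = input.trim();
  if (/^\d+$/.test(trimmed)) {
    // Numbers are matched against the currently rendered list, not the full list.
    return options[Number(trimmed) - 1]?.id ?? null;
  }
  const lowered = trimmed.toLowerCase();
  const hit = options.find((o) => lowered.includes(o.label.toLowerCase()));
  return hit ? hit.id : null;
}
```

With `Resolve Before Planning` non-empty and a doc present, option 1 becomes the agent review rather than planning, which is why the post-review nudge refers to the option by label instead of number.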
--- a/tests/pipeline-review-contract.test.ts
+++ b/tests/pipeline-review-contract.test.ts
@@ -273,22 +273,23 @@ describe("ce:plan remains neutral during ce:work-beta rollout", () => {
 })
 describe("ce-brainstorm review contract", () => {
-  test("requires document review before handoff", async () => {
+  test("exposes document review as an opt-in handoff option", async () => {
    const content = await readRepoFile("plugins/compound-engineering/skills/ce-brainstorm/SKILL.md")
+    const handoff = await readRepoFile("plugins/compound-engineering/skills/ce-brainstorm/references/handoff.md")
-    // Phase 3.5 exists and runs document-review
+    // Document review is no longer a forced Phase 3.5 step. Users opt in from the Phase 4 menu.
-    expect(content).toContain("### Phase 3.5: Document Review")
+    expect(content).not.toContain("Phase 3.5")
    expect(content).toContain("`ce-doc-review` skill")
    // Phase 3 and Phase 4 are extracted to references for token optimization
    expect(content).toContain("`references/requirements-capture.md`")
    expect(content).toContain("`references/handoff.md`")
-    // Additional review passes are surfaced contextually (not as a menu fixture) and still
-    // route through the ce-doc-review skill when requested
-    const handoff = await readRepoFile("plugins/compound-engineering/skills/ce-brainstorm/references/handoff.md")
-    expect(handoff).toContain("Surface additional document review contextually")
+    // Phase 4 menu exposes agent review as a first-class option and routes to ce-doc-review
+    expect(handoff).toContain("Agent review of requirements doc with `ce-doc-review`")
     expect(handoff).toContain("Load the `ce-doc-review` skill")
+    // Subsequent-round residual findings are surfaced as a prose nudge, not a separate menu option
+    expect(handoff).toContain("Post-review nudge")
    expect(handoff).not.toContain("**Review and refine**")
  })
 })