feat(ce-debug): environment sanity, assumption audit, more techniques (#649)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 09:08:24 -07:00
parent 6155b9de3c
commit cce95fb814
3 changed files with 265 additions and 3 deletions
--- a/plugins/compound-engineering/skills/ce-debug/SKILL.md
+++ b/plugins/compound-engineering/skills/ce-debug/SKILL.md
@@ -65,7 +65,18 @@ Confirm the bug exists and understand its behavior. Run the test, trigger the er
 - **Does not reproduce after 2-3 attempts:** Read `references/investigation-techniques.md` for intermittent-bug techniques.
 - **Cannot reproduce at all in this environment:** Document what was tried and what conditions appear to be missing.

-#### 1.2 Trace the code path
+#### 1.2 Verify environment sanity
+
+Before deep code tracing, confirm the environment is what you think it is:
+
+- Correct branch checked out; no unintended uncommitted changes
+- Dependencies installed and up to date (`bun install`, `npm install`, `bundle install`, etc.) — stale `node_modules`/`vendor` is a frequent false lead
+- Expected interpreter or runtime version (check `.tool-versions`, `.nvmrc`, `Gemfile`, etc. against what's actually active)
+- Required env vars present and non-empty
+- No stale build artifacts (`dist/`, `.next/`, compiled binaries from an earlier branch)
+- Dependent local services (database, cache, queue) running at expected versions *when the bug plausibly involves them*
+
+#### 1.3 Trace the code path

 Read the relevant source files. Follow the execution path from entry point to where the error manifests. Trace backward through the call chain:

@@ -92,6 +103,8 @@ As you trace:

 Read `references/anti-patterns.md` before forming hypotheses.

+**Assumption audit (before hypothesis formation):** List the concrete "this must be true" beliefs your understanding depends on — the framework behaves as expected here, this function returns what its name implies, the config loads before this runs, the caller passes a non-null value, the database is in the state the test implies. For each, mark *verified* (you read the code, checked state, or ran it) or *assumed*. Assumptions are the most common source of stuck debugging. Many "wrong hypotheses" are actually correct hypotheses tested against a wrong assumption.
+
 **Form hypotheses** ranked by likelihood. For each, state:
 - What is wrong and where (file:line)
 - The causal chain: how the trigger leads to the observed symptom, step by step
@@ -144,6 +157,8 @@ If 2-3 hypotheses are exhausted without confirmation, diagnose why:
 | Works locally, fails in CI/prod | Environment problem | Focus on env differences, config, dependencies, timing |
 | Fix works but prediction was wrong | Symptom fix, not root cause | The real cause is still active — keep investigating |

+**Parallel investigation option:** When hypotheses are evidence-bottlenecked across clearly independent subsystems, dispatch read-only sub-agents in parallel, each with an explicit hypothesis and structured evidence-return format. No code edits by sub-agents, and skip this when hypotheses depend on each other's outcomes. If the platform does not support parallel sub-agent dispatch, run the same hypothesis probes sequentially in ranked-likelihood order instead — the parallelism is a latency optimization, not a correctness requirement.
+
 Present the diagnosis to the user before proceeding.

 ---
@@ -165,8 +180,7 @@ If the user chose "Diagnosis only" at the end of Phase 2, skip this phase and go

 **3 failed fix attempts = smart escalation.** Diagnose using the same table from Phase 2. If fixes keep failing, the root cause identification was likely wrong. Return to Phase 2.

-**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in other files):
-Check whether the same gap exists at those locations. Skip when the root cause is a one-off error.
+**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in 3+ other files, OR the bug would have been catastrophic if it reached production): Read `references/defense-in-depth.md` for the four-layer model (entry validation, invariant check, environment guard, diagnostic breadcrumb) and choose which layers apply. Skip when the root cause is a one-off error with no realistic recurrence path.

 **Conditional post-mortem** (trigger: the bug was in production, OR the pattern appears in 3+ locations):
 How was this introduced? What allowed it to survive? If a systemic gap was found: "This pattern appears in N other files. Want to capture it with `/ce-compound`?"