feat(ce-debug): environment sanity, assumption audit, more techniques (#649)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -65,7 +65,18 @@ Confirm the bug exists and understand its behavior. Run the test, trigger the er
|
||||
- **Does not reproduce after 2-3 attempts:** Read `references/investigation-techniques.md` for intermittent-bug techniques.
|
||||
- **Cannot reproduce at all in this environment:** Document what was tried and what conditions appear to be missing.
|
||||
|
||||
#### 1.2 Trace the code path
|
||||
#### 1.2 Verify environment sanity
|
||||
|
||||
Before deep code tracing, confirm the environment is what you think it is:
|
||||
|
||||
- Correct branch checked out; no unintended uncommitted changes
|
||||
- Dependencies installed and up to date (`bun install`, `npm install`, `bundle install`, etc.) — stale `node_modules`/`vendor` is a frequent false lead
|
||||
- Expected interpreter or runtime version (check `.tool-versions`, `.nvmrc`, `Gemfile`, etc. against what's actually active)
|
||||
- Required env vars present and non-empty
|
||||
- No stale build artifacts (`dist/`, `.next/`, compiled binaries from an earlier branch)
|
||||
- Dependent local services (database, cache, queue) running at expected versions *when the bug plausibly involves them*
|
||||
|
||||
#### 1.3 Trace the code path
|
||||
|
||||
Read the relevant source files. Follow the execution path from entry point to where the error manifests. Trace backward through the call chain:
|
||||
|
||||
@@ -92,6 +103,8 @@ As you trace:
|
||||
|
||||
Read `references/anti-patterns.md` before forming hypotheses.
|
||||
|
||||
**Assumption audit (before hypothesis formation):** List the concrete "this must be true" beliefs your understanding depends on — the framework behaves as expected here, this function returns what its name implies, the config loads before this runs, the caller passes a non-null value, the database is in the state the test implies. For each, mark *verified* (you read the code, checked state, or ran it) or *assumed*. Assumptions are the most common source of stuck debugging. Many "wrong hypotheses" are actually correct hypotheses tested against a wrong assumption.
|
||||
|
||||
**Form hypotheses** ranked by likelihood. For each, state:
|
||||
- What is wrong and where (file:line)
|
||||
- The causal chain: how the trigger leads to the observed symptom, step by step
|
||||
@@ -144,6 +157,8 @@ If 2-3 hypotheses are exhausted without confirmation, diagnose why:
|
||||
| Works locally, fails in CI/prod | Environment problem | Focus on env differences, config, dependencies, timing |
|
||||
| Fix works but prediction was wrong | Symptom fix, not root cause | The real cause is still active — keep investigating |
|
||||
|
||||
**Parallel investigation option:** When hypotheses are evidence-bottlenecked across clearly independent subsystems, dispatch read-only sub-agents in parallel, each with an explicit hypothesis and structured evidence-return format. No code edits by sub-agents, and skip this when hypotheses depend on each other's outcomes. If the platform does not support parallel sub-agent dispatch, run the same hypothesis probes sequentially in ranked-likelihood order instead — the parallelism is a latency optimization, not a correctness requirement.
|
||||
|
||||
Present the diagnosis to the user before proceeding.
|
||||
|
||||
---
|
||||
@@ -165,8 +180,7 @@ If the user chose "Diagnosis only" at the end of Phase 2, skip this phase and go
|
||||
|
||||
**3 failed fix attempts = smart escalation.** Diagnose using the same table from Phase 2. If fixes keep failing, the root cause identification was likely wrong. Return to Phase 2.
|
||||
|
||||
**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in other files):
|
||||
Check whether the same gap exists at those locations. Skip when the root cause is a one-off error.
|
||||
**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in 3+ other files, OR the bug would have been catastrophic if it reached production): Read `references/defense-in-depth.md` for the four-layer model (entry validation, invariant check, environment guard, diagnostic breadcrumb) and choose which layers apply. Skip when the root cause is a one-off error with no realistic recurrence path.
|
||||
|
||||
**Conditional post-mortem** (trigger: the bug was in production, OR the pattern appears in 3+ locations):
|
||||
How was this introduced? What allowed it to survive? If a systemic gap was found: "This pattern appears in N other files. Want to capture it with `/ce-compound`?"
|
||||
|
||||
Reference in New Issue
Block a user