feat(ce-debug): environment sanity, assumption audit, more techniques (#649)
Some checks failed
CI / pr-title (push) Has been cancelled
CI / test (push) Has been cancelled
Release PR / release-pr (push) Has been cancelled
Release PR / publish-cli (push) Has been cancelled

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trevin Chow
2026-04-22 09:08:24 -07:00
committed by GitHub
parent 6155b9de3c
commit cce95fb814
3 changed files with 265 additions and 3 deletions

View File

@@ -65,7 +65,18 @@ Confirm the bug exists and understand its behavior. Run the test, trigger the er
- **Does not reproduce after 2-3 attempts:** Read `references/investigation-techniques.md` for intermittent-bug techniques.
- **Cannot reproduce at all in this environment:** Document what was tried and what conditions appear to be missing.
#### 1.2 Trace the code path
#### 1.2 Verify environment sanity
Before deep code tracing, confirm the environment is what you think it is:
- Correct branch checked out; no unintended uncommitted changes
- Dependencies installed and up to date (`bun install`, `npm install`, `bundle install`, etc.) — stale `node_modules`/`vendor` is a frequent false lead
- Expected interpreter or runtime version (check `.tool-versions`, `.nvmrc`, `Gemfile`, etc. against what's actually active)
- Required env vars present and non-empty
- No stale build artifacts (`dist/`, `.next/`, compiled binaries from an earlier branch)
- Dependent local services (database, cache, queue) running at expected versions *when the bug plausibly involves them*
#### 1.3 Trace the code path
Read the relevant source files. Follow the execution path from entry point to where the error manifests. Trace backward through the call chain:
@@ -92,6 +103,8 @@ As you trace:
Read `references/anti-patterns.md` before forming hypotheses.
**Assumption audit (before hypothesis formation):** List the concrete "this must be true" beliefs your understanding depends on — the framework behaves as expected here, this function returns what its name implies, the config loads before this runs, the caller passes a non-null value, the database is in the state the test implies. For each, mark *verified* (you read the code, checked state, or ran it) or *assumed*. Assumptions are the most common source of stuck debugging. Many "wrong hypotheses" are actually correct hypotheses tested against a wrong assumption.
**Form hypotheses** ranked by likelihood. For each, state:
- What is wrong and where (file:line)
- The causal chain: how the trigger leads to the observed symptom, step by step
@@ -144,6 +157,8 @@ If 2-3 hypotheses are exhausted without confirmation, diagnose why:
| Works locally, fails in CI/prod | Environment problem | Focus on env differences, config, dependencies, timing |
| Fix works but prediction was wrong | Symptom fix, not root cause | The real cause is still active — keep investigating |
**Parallel investigation option:** When hypotheses are evidence-bottlenecked across clearly independent subsystems, dispatch read-only sub-agents in parallel, each with an explicit hypothesis and structured evidence-return format. No code edits by sub-agents, and skip this when hypotheses depend on each other's outcomes. If the platform does not support parallel sub-agent dispatch, run the same hypothesis probes sequentially in ranked-likelihood order instead — the parallelism is a latency optimization, not a correctness requirement.
Present the diagnosis to the user before proceeding.
---
@@ -165,8 +180,7 @@ If the user chose "Diagnosis only" at the end of Phase 2, skip this phase and go
**3 failed fix attempts = smart escalation.** Diagnose using the same table from Phase 2. If fixes keep failing, the root cause identification was likely wrong. Return to Phase 2.
**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in other files):
Check whether the same gap exists at those locations. Skip when the root cause is a one-off error.
**Conditional defense-in-depth** (trigger: grep for the root-cause pattern found it in 3+ other files, OR the bug would have been catastrophic if it reached production): Read `references/defense-in-depth.md` for the four-layer model (entry validation, invariant check, environment guard, diagnostic breadcrumb) and choose which layers apply. Skip when the root cause is a one-off error with no realistic recurrence path.
**Conditional post-mortem** (trigger: the bug was in production, OR the pattern appears in 3+ locations):
How was this introduced? What allowed it to survive? If a systemic gap was found: "This pattern appears in N other files. Want to capture it with `/ce-compound`?"