feat: optimize ce:compound speed and effectiveness (#370)

This commit is contained in:
Trevin Chow
2026-03-24 20:12:19 -07:00
committed by GitHub
parent 2612ed6b3d
commit 4e3af07962
2 changed files with 82 additions and 39 deletions

View File

@@ -53,33 +53,33 @@ If the feature type is clear, narrow the search to relevant category directories
| Integration | `docs/solutions/integration-issues/` |
| General/unclear | `docs/solutions/` (all) |
### Step 3: Grep Pre-Filter (Critical for Efficiency)
### Step 3: Content-Search Pre-Filter (Critical for Efficiency)
**Use Grep to find candidate files BEFORE reading any content.** Run multiple Grep calls in parallel:
**Use the native content-search tool (e.g., Grep in Claude Code) to find candidate files BEFORE reading any content.** Run multiple searches in parallel, case-insensitive, returning only matching file paths:
```bash
```
# Search for keyword matches in frontmatter fields (run in PARALLEL, case-insensitive)
Grep: pattern="title:.*email" path=docs/solutions/ output_mode=files_with_matches -i=true
Grep: pattern="tags:.*(email|mail|smtp)" path=docs/solutions/ output_mode=files_with_matches -i=true
Grep: pattern="module:.*(Brief|Email)" path=docs/solutions/ output_mode=files_with_matches -i=true
Grep: pattern="component:.*background_job" path=docs/solutions/ output_mode=files_with_matches -i=true
content-search: pattern="title:.*email" path=docs/solutions/ files_only=true case_insensitive=true
content-search: pattern="tags:.*(email|mail|smtp)" path=docs/solutions/ files_only=true case_insensitive=true
content-search: pattern="module:.*(Brief|Email)" path=docs/solutions/ files_only=true case_insensitive=true
content-search: pattern="component:.*background_job" path=docs/solutions/ files_only=true case_insensitive=true
```
**Pattern construction tips:**
- Use `|` for synonyms: `tags:.*(payment|billing|stripe|subscription)`
- Include `title:` - often the most descriptive field
- Use `-i=true` for case-insensitive matching
- Search case-insensitively
- Include related terms the user might not have mentioned
**Why this works:** Grep scans file contents without reading into context. Only matching filenames are returned, dramatically reducing the set of files to examine.
**Why this works:** Content search scans file contents without reading into context. Only matching filenames are returned, dramatically reducing the set of files to examine.
**Combine results** from all Grep calls to get candidate files (typically 5-20 files instead of 200).
**Combine results** from all searches to get candidate files (typically 5-20 files instead of 200).
**If Grep returns >25 candidates:** Re-run with more specific patterns or combine with category narrowing.
**If search returns >25 candidates:** Re-run with more specific patterns or combine with category narrowing.
**If Grep returns <3 candidates:** Do a broader content search (not just frontmatter fields) as fallback:
```bash
Grep: pattern="email" path=docs/solutions/ output_mode=files_with_matches -i=true
**If search returns <3 candidates:** Do a broader content search (not just frontmatter fields) as fallback:
```
content-search: pattern="email" path=docs/solutions/ files_only=true case_insensitive=true
```
### Step 3b: Always Check Critical Patterns
@@ -228,26 +228,26 @@ Structure your findings as:
## Efficiency Guidelines
**DO:**
- Use Grep to pre-filter files BEFORE reading any content (critical for 100+ files)
- Run multiple Grep calls in PARALLEL for different keywords
- Include `title:` in Grep patterns - often the most descriptive field
- Use the native content-search tool to pre-filter files BEFORE reading any content (critical for 100+ files)
- Run multiple content searches in PARALLEL for different keywords
- Include `title:` in search patterns - often the most descriptive field
- Use OR patterns for synonyms: `tags:.*(payment|billing|stripe)`
- Use `-i=true` for case-insensitive matching
- Use category directories to narrow scope when feature type is clear
- Do a broader content Grep as fallback if <3 candidates found
- Do a broader content search as fallback if <3 candidates found
- Re-narrow with more specific patterns if >25 candidates found
- Always read the critical patterns file (Step 3b)
- Only read frontmatter of Grep-matched candidates (not all files)
- Only read frontmatter of search-matched candidates (not all files)
- Filter aggressively - only fully read truly relevant files
- Prioritize high-severity and critical patterns
- Extract actionable insights, not just summaries
- Note when no relevant learnings exist (this is valuable information too)
**DON'T:**
- Read frontmatter of ALL files (use Grep to pre-filter first)
- Run Grep calls sequentially when they can be parallel
- Read frontmatter of ALL files (use content-search to pre-filter first)
- Run searches sequentially when they can be parallel
- Use only exact keyword matches (include synonyms)
- Skip the `title:` field in Grep patterns
- Skip the `title:` field in search patterns
- Proceed with >25 candidates without narrowing first
- Read every file in full (wasteful)
- Return raw document contents (distill instead)

View File

@@ -68,15 +68,54 @@ Launch these subagents IN PARALLEL. Each returns text data to the orchestrator.
- Extracts conversation history
- Identifies problem type, component, symptoms
- Incorporates auto memory excerpts (if provided by the orchestrator) as supplementary evidence when identifying problem type, component, and symptoms
- Validates against schema
- Returns: YAML frontmatter skeleton
- Validates all enum fields against the schema values below
- Maps problem_type to the `docs/solutions/` category directory
- Suggests a filename using the pattern `[sanitized-problem-slug]-[date].md`
- Returns: YAML frontmatter skeleton (must include `category:` field mapped from problem_type), category directory path, and suggested filename
**Schema enum values (validate against these exactly):**
- **problem_type**: build_error, test_failure, runtime_error, performance_issue, database_issue, security_issue, ui_bug, integration_issue, logic_error, developer_experience, workflow_issue, best_practice, documentation_gap
- **component**: rails_model, rails_controller, rails_view, service_object, background_job, database, frontend_stimulus, hotwire_turbo, email_processing, brief_system, assistant, authentication, payments, development_workflow, testing_framework, documentation, tooling
- **root_cause**: missing_association, missing_include, missing_index, wrong_api, scope_issue, thread_violation, async_timing, memory_leak, config_error, logic_error, test_isolation, missing_validation, missing_permission, missing_workflow_step, inadequate_documentation, missing_tooling, incomplete_setup
- **resolution_type**: code_fix, migration, config_change, test_fix, dependency_update, environment_setup, workflow_improvement, documentation_update, tooling_addition, seed_data_update
- **severity**: critical, high, medium, low
**Category mapping (problem_type -> directory):**
| problem_type | Directory |
|---|---|
| build_error | build-errors/ |
| test_failure | test-failures/ |
| runtime_error | runtime-errors/ |
| performance_issue | performance-issues/ |
| database_issue | database-issues/ |
| security_issue | security-issues/ |
| ui_bug | ui-bugs/ |
| integration_issue | integration-issues/ |
| logic_error | logic-errors/ |
| developer_experience | developer-experience/ |
| workflow_issue | workflow-issues/ |
| best_practice | best-practices/ |
| documentation_gap | documentation-gaps/ |
#### 2. **Solution Extractor**
- Analyzes all investigation steps
- Identifies root cause
- Extracts working solution with code examples
- Incorporates auto memory excerpts (if provided by the orchestrator) as supplementary evidence -- conversation history and the verified fix take priority; if memory notes contradict the conversation, note the contradiction as cautionary context
- Returns: Solution content block
- Develops prevention strategies and best practices guidance
- Generates test cases if applicable
- Returns: Solution content block including prevention section
**Expected output sections (follow this structure):**
- **Problem**: 1-2 sentence description of the issue
- **Symptoms**: Observable symptoms (error messages, behavior)
- **What Didn't Work**: Failed investigation attempts and why they failed
- **Solution**: The actual fix with code examples (before/after when applicable)
- **Why This Works**: Root cause explanation and why the solution addresses it
- **Prevention**: Strategies to avoid recurrence, best practices, and test cases. Include concrete code examples where applicable (e.g., gem configurations, test assertions, linting rules)
#### 3. **Related Docs Finder**
- Searches `docs/solutions/` for related documentation
@@ -85,17 +124,23 @@ Launch these subagents IN PARALLEL. Each returns text data to the orchestrator.
- Flags any related learning or pattern docs that may now be stale, contradicted, or overly broad
- Returns: Links, relationships, and any refresh candidates
#### 4. **Prevention Strategist**
- Develops prevention strategies
- Creates best practices guidance
- Generates test cases if applicable
- Returns: Prevention/testing content
**Search strategy (grep-first filtering for efficiency):**
#### 5. **Category Classifier**
- Determines optimal `docs/solutions/` category
- Validates category against schema
- Suggests filename based on slug
- Returns: Final path and filename
1. Extract keywords from the problem context: module names, technical terms, error messages, component types
2. If the problem category is clear, narrow search to the matching `docs/solutions/<category>/` directory
3. Use the native content-search tool (e.g., Grep in Claude Code) to pre-filter candidate files BEFORE reading any content. Run multiple searches in parallel, case-insensitive, targeting frontmatter fields. These are template patterns -- substitute actual keywords:
- `title:.*<keyword>`
- `tags:.*(<keyword1>|<keyword2>)`
- `module:.*<module name>`
- `component:.*<component>`
4. If search returns >25 candidates, re-run with more specific patterns. If <3, broaden to full content search
5. Read only frontmatter (first 30 lines) of candidate files to score relevance
6. Fully read only strong/moderate matches
7. Return distilled links and relationships, not raw file contents
**GitHub issue search:**
Prefer the `gh` CLI for searching related issues: `gh issue list --search "<keywords>" --state all --limit 5`. If `gh` is not installed, fall back to the GitHub MCP tools (e.g., `unblocked` data_retrieval) if available. If neither is available, skip GitHub issue search and note it was skipped in the output.
</parallel_tasks>
@@ -275,11 +320,9 @@ In compact-safe mode, only suggest `ce:compound-refresh` if there is an obvious
Auto memory: 2 relevant entries used as supplementary evidence
Subagent Results:
✓ Context Analyzer: Identified performance_issue in brief_system
✓ Solution Extractor: 3 code fixes
✓ Context Analyzer: Identified performance_issue in brief_system, category: performance-issues/
✓ Solution Extractor: 3 code fixes, prevention strategies
✓ Related Docs Finder: 2 related issues
✓ Prevention Strategist: Prevention strategies, test suggestions
✓ Category Classifier: `performance-issues`
Specialized Agent Reviews (Auto-Triggered):
✓ performance-oracle: Validated query optimization approach