diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index ae52e23..b291a64 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -11,7 +11,7 @@ "plugins": [ { "name": "compound-engineering", - "description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 28 specialized agents and 41 skills.", + "description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 29 specialized agents and 47 skills.", "version": "2.40.0", "author": { "name": "Kieran Klaassen", diff --git a/README.md b/README.md index e1e1c3e..6d67b50 100644 --- a/README.md +++ b/README.md @@ -184,17 +184,20 @@ Notes: ``` Brainstorm → Plan → Work → Review → Compound → Repeat + ↑ + Ideate (optional — when you need ideas) ``` | Command | Purpose | |---------|---------| +| `/ce:ideate` | Discover high-impact project improvements through divergent ideation and adversarial filtering | | `/ce:brainstorm` | Explore requirements and approaches before planning | | `/ce:plan` | Turn feature ideas into detailed implementation plans | | `/ce:work` | Execute plans with worktrees and task tracking | | `/ce:review` | Multi-agent code review before merging | | `/ce:compound` | Document learnings to make future work easier | -The `/ce:brainstorm` skill supports collaborative dialogue to clarify requirements and compare approaches before committing to a plan. +The `/ce:ideate` skill proactively surfaces strong improvement ideas, and `/ce:brainstorm` then clarifies the selected one before committing to a plan. Each cycle compounds: brainstorms sharpen plans, plans inform future plans, reviews catch more issues, patterns get documented. 
diff --git a/docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md b/docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md new file mode 100644 index 0000000..41f2d40 --- /dev/null +++ b/docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md @@ -0,0 +1,77 @@ +--- +date: 2026-03-15 +topic: ce-ideate-skill +--- + +# ce:ideate — Open-Ended Ideation Skill + +## Problem Frame + +The ce:brainstorm skill is reactive — the user brings an idea, and the skill helps refine it through collaborative dialogue. There is no workflow for the opposite direction: having the AI proactively generate ideas by deeply understanding the project and then filtering them through critical self-evaluation. Users currently achieve this through ad-hoc prompting (e.g., "come up with 100 ideas and give me your best 10"), but that approach has no codebase grounding, no structured output, no durable artifact, and no connection to the ce:* workflow pipeline. + +## Requirements + +- R1. ce:ideate is a standalone skill, separate from ce:brainstorm, with its own SKILL.md in `plugins/compound-engineering/skills/ce-ideate/` +- R2. Accepts an optional freeform argument that serves as a focus hint — can be a concept ("DX improvements"), a path ("plugins/compound-engineering/skills/"), a constraint ("low-complexity quick wins"), or empty for fully open ideation +- R3. Performs a deep codebase scan before generating ideas, grounding ideation in the actual project state rather than abstract speculation +- R4. Preserves the user's proven prompt mechanism as the core workflow: generate many ideas first, then systematically and critically reject weak ones, then explain only the surviving ideas in detail +- R5. Self-critiques the full list, rejecting weak ideas with explicit reasoning — the adversarial filtering step is the core quality mechanism +- R6. 
Presents the top 5-7 surviving ideas with structured analysis: description, rationale, downsides, confidence score (0-100%), estimated complexity +- R7. Includes a brief rejection summary — one-line per rejected idea with the reason — so the user can see what was considered and why it was cut +- R8. Writes a durable ideation artifact to `docs/ideation/YYYY-MM-DD--ideation.md` (or `YYYY-MM-DD-open-ideation.md` when no focus area). This compounds — rejected ideas prevent re-exploring dead ends, and un-acted-on ideas remain available for future sessions. +- R9. The default volume (~30 ideas, top 5-7 presented) can be overridden by the user's argument (e.g., "give me your top 3" or "go deep, 100 ideas") +- R10. Handoff options after presenting ideas: brainstorm a selected idea (feeds into ce:brainstorm), refine the ideation (dig deeper, re-evaluate, explore new angles), share to Proof, or end the session +- R11. Always routes to ce:brainstorm when the user wants to act on an idea — ideation output is never detailed enough to skip requirements refinement +- R12. Session completion: when ending, offer to commit the ideation doc to the current branch. If the user declines, leave the file uncommitted. Do not create branches or push — just the local commit. +- R13. Resume behavior: when ce:ideate is invoked, check `docs/ideation/` for ideation docs created within the last 30 days. If a relevant one exists, offer to continue from it (add new ideas, revisit rejected ones, act on un-explored ideas) or start fresh. +- R14. Present the surviving candidates to the user before writing the durable ideation artifact, so the user can ask questions or lightly reshape the candidate set before it is archived +- R15. The ideation artifact must be written or updated before any downstream handoff, Proof sharing, or session end, even though the initial survivor presentation happens first +- R16. 
Refine routes based on intent: "add more ideas" or "explore new angles" returns to generation (Phase 2), "re-evaluate" or "raise the bar" returns to critique (Phase 3), "dig deeper on idea #N" expands that idea's analysis in place. The ideation doc is updated after each refinement when the refined state is being preserved +- R17. Uses agent intelligence to improve ideation quality, but only as support for the core prompt mechanism rather than as a replacement for it +- R18. Uses existing research agents for codebase grounding, but ideation and critique sub-agents are prompt-defined roles with distinct perspectives rather than forced reuse of existing named review agents +- R19. When sub-agents are used for ideation, each one receives the same grounding summary, the user focus hint, and the current volume target +- R20. Focus hints influence both candidate generation and final filtering; they are not only an evaluation-time bias +- R21. Ideation sub-agents return ideas in a standardized structured format so the orchestrator can merge, dedupe, and reason over them consistently +- R22. The orchestrator owns final scoring, ranking, and survivor decisions across the merged idea set; sub-agents may emit lightweight local signals, but they do not authoritatively rank their own ideas +- R23. Distinct ideation perspectives should be created through prompt framing methods that encourage creative spread without over-constraining the workflow; examples include friction, unmet need, inversion, assumption-breaking, leverage, and extreme-case prompts +- R24. The skill does not hardcode a fixed number of sub-agents for all runs; it should use the smallest useful set that preserves diversity without overwhelming the orchestrator's context window +- R25. When the user picks an idea to brainstorm, the ideation doc is updated to mark that idea as "explored" with a reference to the resulting brainstorm session date, so future revisits show which ideas have been acted on. 
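The resume check in R13 can be sketched in a few lines. This is a minimal illustration, not the skill's actual mechanism (the skill is prompt-driven). It assumes the `YYYY-MM-DD` filename prefix from R8 is a good enough proxy for the creation date, treats the 30-day window as inclusive, and uses a hypothetical helper name `recent_ideation_docs`.

```python
import re
from datetime import date, timedelta

# Assumption: ideation docs carry a YYYY-MM-DD prefix, as R8 describes.
DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-")


def recent_ideation_docs(filenames, today, window_days=30):
    """Return filenames dated within the window, newest first.

    Hypothetical helper: the date is read from the filename prefix,
    not from file metadata or git history.
    """
    cutoff = today - timedelta(days=window_days)
    dated = []
    for name in filenames:
        match = DATE_PREFIX.match(name)
        if not match:
            continue  # not an ideation doc by naming convention
        try:
            doc_date = date(int(match[1]), int(match[2]), int(match[3]))
        except ValueError:
            continue  # prefix looks like a date but is not a valid one
        if cutoff <= doc_date <= today:
            dated.append((doc_date, name))
    dated.sort(reverse=True)
    return [name for _, name in dated]
```

Given a listing of `docs/ideation/`, the first element of the result would be the most recent candidate to offer for continuation. Deciding whether it is "relevant" to the current focus hint is left to the skill's own judgment.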
+ +## Success Criteria + +- A user can invoke `/ce:ideate` with no arguments on any project and receive genuinely surprising, high-quality improvement ideas grounded in the actual codebase +- Ideas that survive the filter are meaningfully better than what the user would get from a naive "give me 10 ideas" prompt +- The workflow uses agent intelligence to widen the candidate pool without obscuring the core generate -> reject -> survivors mechanism +- The user sees and can question the surviving candidates before they are written into the durable artifact +- The ideation artifact persists and provides value when revisited weeks later +- The skill composes naturally with the existing pipeline: ideate → brainstorm → plan → work + +## Scope Boundaries + +- ce:ideate does NOT produce requirements, plans, or code — it produces ranked ideas +- ce:ideate does NOT modify ce:brainstorm's behavior — discovery of ce:ideate is handled through the skill description and catalog, not by altering other skills +- The skill does not do external research (competitive analysis, similar projects) in v1 — this could be a future enhancement but adds cost and latency without proven need +- No configurable depth modes in v1 — fixed volume with argument-based override is sufficient + +## Key Decisions + +- **Standalone skill, not a mode within ce:brainstorm**: The workflows are fundamentally different cognitive modes (proactive/divergent vs. reactive/convergent) with different phases, outputs, and success criteria. Combining them would make ce:brainstorm harder to maintain and blur its identity. +- **Durable artifact in docs/ideation/**: Discarding ideation results is anti-compounding. The file is cheap to write and provides value when revisiting un-acted-on ideas or avoiding re-exploration of rejected ones. +- **Artifact written after candidate review, not before initial presentation**: The first survivor presentation is collaborative review, not archival finalization. 
The artifact should be written only after the candidate set is good enough to preserve, but always before handoff, sharing, or session end. +- **Always route to ce:brainstorm for follow-up**: At ideation depth, ideas are one-paragraph concepts — never detailed enough to skip requirements refinement. +- **Survivors + rejection summary output format**: Full transparency on what was considered without overwhelming with detailed analysis of rejected ideas. +- **Freeform optional argument**: A concept, a path, or nothing at all — the skill interprets whatever it gets as context. No artificial distinction between "focus area" and "target path." +- **Agent intelligence as support, not replacement**: The value comes from the proven ideation-and-rejection mechanism. Parallel sub-agents help produce a richer candidate pool and stronger critique, but the orchestrator remains responsible for synthesis, scoring, and final ranking. + +## Outstanding Questions + +### Deferred to Planning + +- [Affects R3][Technical] Which research agents should always run for codebase grounding in v1 beyond `repo-research-analyst` and `learnings-researcher`, if any? +- [Affects R21][Technical] What exact structured output schema should ideation sub-agents return so the orchestrator can merge and score consistently without overfitting the format too early? +- [Affects R6][Technical] Should the structured analysis per surviving idea include "suggested next steps" or "what this would unlock" beyond the current fields (description, rationale, downsides, confidence, complexity)? +- [Affects R2][Technical] How should the skill detect volume overrides in the freeform argument vs. focus-area hints? Simple heuristic or explicit parsing? 
+ +## Next Steps + +→ `/ce:plan` for structured implementation planning diff --git a/docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md b/docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md new file mode 100644 index 0000000..9afc291 --- /dev/null +++ b/docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md @@ -0,0 +1,65 @@ +--- +date: 2026-03-16 +topic: issue-grounded-ideation +--- + +# Issue-Grounded Ideation Mode for ce:ideate + +## Problem Frame + +When a team wants to ideate on improvements, their issue tracker holds rich signal about real user pain, recurring failures, and severity patterns — but ce:ideate currently only looks at the codebase and past learnings. Teams have to manually synthesize issue patterns before ideating, or they ideate without that context and miss what their users are actually hitting. + +The goal is not "fix individual bugs" but "generate strategic improvement ideas grounded in the patterns your issue tracker reveals." 25 duplicate bugs about the same failure mode is a signal about collaboration reliability, not 25 separate problems. + +## Requirements + +- R1. When the user's argument indicates they want issue-tracker data as input (e.g., "bugs", "github issues", "open issues", "what users are reporting", "issue patterns"), ce:ideate activates an issue intelligence step alongside the existing Phase 1 scans +- R2. A new **issue intelligence agent** fetches, clusters, deduplicates, and analyzes issues, returning structured theme analysis — not a list of individual issues +- R3. The agent fetches **open issues** plus **recently closed issues** (approximately 30 days), filtering out issues closed as duplicate, won't-fix, or not-planned. Recently fixed issues are included because they show which areas had enough pain to warrant action. +- R4. 
Issue clusters drive the ideation frames in Phase 2 using a **hybrid strategy**: derive frames from clusters, pad with default frames (e.g., "assumption-breaking", "leverage/compounding") when fewer than 4 clusters exist. This ensures ideas are grounded in real pain patterns while maintaining ideation diversity. +- R5. The existing Phase 1 scans (codebase context + learnings search) still run in parallel — issue analysis is additive context, not a replacement +- R6. The issue intelligence agent detects the repository from the current directory's git remote +- R7. Start with GitHub issues via `gh` CLI. Design the agent prompt and output structure so Linear or other trackers can be added later without restructuring the ideation flow. +- R8. The issue intelligence agent is independently useful outside of ce:ideate — it can be dispatched directly by a user or other workflows to summarize issue themes, understand the current landscape, or reason over recent activity. Its output should be self-contained, not coupled to ideation-specific context. +- R9. The agent's output must communicate at the **theme level**, not the individual-issue level. Each theme should convey: what the pattern is, why it matters (user impact, severity, frequency, trend direction), and what it signals about the system. The output should help a human or agent fully understand the importance and shape of each theme without needing to read individual issues. 
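The hybrid frame strategy in R4 could look roughly like the sketch below. The requirement only says to pad with default frames when fewer than 4 clusters exist. How many defaults to add is an assumption here (pad until the threshold is reached or the defaults run out), and `build_frames` and `DEFAULT_FRAMES` are hypothetical names.

```python
# The two default frames named as examples in R4; the real list may be longer.
DEFAULT_FRAMES = ["assumption-breaking", "leverage/compounding"]


def build_frames(cluster_themes, default_frames=DEFAULT_FRAMES, min_frames=4):
    """Derive ideation frames from issue clusters, padding with defaults.

    Cluster-derived frames always come first. Defaults are appended only
    while the total is below min_frames (assumed padding rule).
    """
    frames = list(cluster_themes)
    for frame in default_frames:
        if len(frames) >= min_frames:
            break
        if frame not in frames:
            frames.append(frame)
    return frames
```

With four or more clusters the defaults are never used, which matches the intent that issue signal should dominate when it is rich enough.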
+ +## Success Criteria + +- Running `/ce:ideate bugs` on a repo with noisy/duplicate issues (like proof's 25+ LIVE_DOC_UNAVAILABLE variants) produces clustered themes, not a rehash of individual issues +- Surviving ideas are strategic improvements ("invest in collaboration reliability infrastructure") not bug fixes ("fix LIVE_DOC_UNAVAILABLE") +- The issue intelligence agent's output is structured enough that ideation sub-agents can engage with themes meaningfully +- Ideation quality is at least as good as the default mode, with the added benefit of issue grounding + +## Scope Boundaries + +- GitHub issues only in v1 (Linear is a future extension) +- No issue triage or management — this is read-only analysis for ideation input +- No changes to Phase 3 (adversarial filtering) or Phase 4 (presentation) — only Phase 1 and Phase 2 frame derivation are affected +- The issue intelligence agent is a new agent file, not a modification to an existing research agent +- The agent is designed as a standalone capability that ce:ideate composes, not an ideation-internal module +- Assumes `gh` CLI is available and authenticated in the environment +- When a repo has too few issues to cluster meaningfully (e.g., < 5 open+recent), the agent should report that and ce:ideate should fall back to default ideation with a note to the user + +## Key Decisions + +- **Pattern-first, not issue-first**: The output is improvement ideas grounded in bug patterns, not a prioritized bug list. The ideation instructions already prevent "just fix bug #534" thinking. +- **Hybrid frame strategy**: Clusters derive ideation frames, padded with defaults when thin. Pure cluster-derived frames risk too few frames; pure default frames risk ignoring the issue signal. +- **Flexible argument detection**: Use intent-based parsing ("reasonable interpretation rather than formal parsing") consistent with the existing volume hint system. No rigid keyword matching. 
+- **Open + recently closed**: Including recently fixed issues provides richer pattern data — shows which areas warranted action, not just what's currently broken. +- **Additive to Phase 1**: Issue analysis runs as a third parallel agent alongside codebase scan and learnings search. All three feed the grounding summary. +- **Titles + labels + sample bodies**: Read titles and labels for all issues (cheap), then read full bodies for 2-3 representative issues per emerging cluster. This handles both well-labeled repos (labels drive clustering, bodies confirm) and poorly-labeled repos (bodies drive clustering). Avoids reading all bodies which is expensive at scale. + +## Outstanding Questions + +### Deferred to Planning + +- [Affects R2][Technical] What structured output format should the issue intelligence agent return? Likely theme clusters with: theme name, issue count, severity distribution, representative issue titles, and a one-line synthesis. +- [Affects R3][Technical] How to detect GitHub close reasons (completed vs not-planned vs duplicate) via `gh` CLI? May need `gh issue list --state closed --json stateReason` or label-based filtering. +- [Affects R4][Technical] What's the threshold for "too few clusters"? Current thinking: pad with default frames when fewer than 4 clusters, but this may need tuning. +- [Affects R6][Technical] How to extract the GitHub repo from git remote? Standard `gh repo view --json nameWithOwner` or parse the remote URL. +- [Affects R7][Needs research] What would a Linear integration look like? Just swapping the fetch mechanism, or does Linear's project/cycle structure change the clustering approach? +- [Affects R2][Technical] Exact number of sample bodies per cluster to read (starting point: 2-3 per cluster). 
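One possible answer to the R3 question is to filter on the `stateReason` field after fetching. The sketch below works on already-parsed JSON records and does not call `gh` itself. The `state` and `closedAt` field names and the reason values `NOT_PLANNED` and `DUPLICATE` are assumptions about what the CLI returns and should be verified. Won't-fix issues marked only by a label are not handled.

```python
from datetime import datetime, timedelta, timezone

# Assumed close reasons to exclude; verify against real `gh` output.
EXCLUDED_REASONS = {"NOT_PLANNED", "DUPLICATE"}


def select_issues(issues, now, window_days=30):
    """Keep open issues plus issues closed as completed within the window."""
    cutoff = now - timedelta(days=window_days)
    kept = []
    for issue in issues:
        if issue.get("state") == "OPEN":
            kept.append(issue)
            continue
        reason = (issue.get("stateReason") or "").upper()
        if reason in EXCLUDED_REASONS:
            continue  # closed without a fix: no signal of warranted action
        closed_at = issue.get("closedAt")
        if not closed_at:
            continue
        # Normalize a trailing "Z" so fromisoformat accepts it on older Pythons.
        closed = datetime.fromisoformat(closed_at.replace("Z", "+00:00"))
        if closed >= cutoff:
            kept.append(issue)
    return kept
```

Filtering after the fetch keeps the fetch step swappable, which fits R7's goal of adding other trackers later without restructuring the flow.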
+ +## Next Steps + +→ `/ce:plan` for structured implementation planning diff --git a/docs/plans/2026-03-15-001-feat-ce-ideate-skill-plan.md b/docs/plans/2026-03-15-001-feat-ce-ideate-skill-plan.md new file mode 100644 index 0000000..59edc49 --- /dev/null +++ b/docs/plans/2026-03-15-001-feat-ce-ideate-skill-plan.md @@ -0,0 +1,387 @@ +--- +title: "feat: Add ce:ideate open-ended ideation skill" +type: feat +status: completed +date: 2026-03-15 +origin: docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md +deepened: 2026-03-16 +--- + +# feat: Add ce:ideate open-ended ideation skill + +## Overview + +Add a new `ce:ideate` skill to the compound-engineering plugin that performs open-ended, divergent-then-convergent idea generation for any project. The skill deeply scans the codebase, generates ~30 ideas, self-critiques and filters them, and presents the top 5-7 as a ranked list with structured analysis. It uses agent intelligence to improve the candidate pool without replacing the core prompt mechanism, writes a durable artifact to `docs/ideation/` after the survivors have been reviewed, and hands off selected ideas to `ce:brainstorm`. + +## Problem Frame + +The ce:* workflow pipeline has a gap at the very beginning. `ce:brainstorm` requires the user to bring an idea — it refines but doesn't generate. Users who want the AI to proactively suggest improvements must resort to ad-hoc prompting, which lacks codebase grounding, structured output, durable artifacts, and pipeline integration. (see origin: docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md) + +## Requirements Trace + +- R1. Standalone skill in `plugins/compound-engineering/skills/ce-ideate/` +- R2. Optional freeform argument as focus hint (concept, path, constraint, or empty) +- R3. Deep codebase scan via research agents before generating ideas +- R4. Preserve the proven prompt mechanism: many ideas first, then brutal filtering, then detailed survivors +- R5. 
Self-critique with explicit rejection reasoning +- R6. Present top 5-7 with structured analysis (description, rationale, downsides, confidence 0-100%, complexity) +- R7. Rejection summary (one-line per rejected idea) +- R8. Durable artifact in `docs/ideation/YYYY-MM-DD--ideation.md` +- R9. Volume overridable via argument +- R10. Handoff: brainstorm an idea, refine, share to Proof, or end session +- R11. Always route to ce:brainstorm for follow-up on selected ideas +- R12. Offer commit on session end +- R13. Resume from existing ideation docs (30-day recency window) +- R14. Present survivors before writing the durable artifact +- R15. Write artifact before handoff/share/end +- R16. Update doc in place on refine when preserving refined state +- R17. Use agent intelligence as support for the core mechanism, not a replacement +- R18. Use research agents for grounding; ideation/critique sub-agents are prompt-defined roles +- R19. Pass grounding summary, focus hint, and volume target to ideation sub-agents +- R20. Focus hints influence both generation and filtering +- R21. Use standardized structured outputs from ideation sub-agents +- R22. Orchestrator owns final scoring, ranking, and survivor decisions +- R23. Use broad prompt-framing methods to encourage creative spread without over-constraining ideation +- R24. Use the smallest useful set of sub-agents rather than a hardcoded fixed count +- R25. 
Mark ideas as "explored" when brainstormed + +## Scope Boundaries + +- No external research (competitive analysis, similar projects) in v1 (see origin) +- No configurable depth modes — fixed volume with argument-based override (see origin) +- No modifications to ce:brainstorm — discovery via skill description only (see origin) +- No deprecated `workflows:ideate` alias — the `workflows:*` prefix is deprecated +- No `references/` split — estimated skill length ~300 lines, well under the 500-line threshold + +## Context & Research + +### Relevant Code and Patterns + +- `plugins/compound-engineering/skills/ce-brainstorm/SKILL.md` — Closest sibling. Mirror: resume behavior (Phase 0.1), artifact frontmatter (date + topic), handoff options via platform question tool, document-review integration, Proof sharing +- `plugins/compound-engineering/skills/ce-plan/SKILL.md` — Agent dispatch pattern: `Task compound-engineering:research:repo-research-analyst(context)` running in parallel. Phase 0.2 upstream document detection +- `plugins/compound-engineering/skills/ce-work/SKILL.md` — Session completion: incremental commit pattern, staging specific files, conventional commit format +- `plugins/compound-engineering/skills/ce-compound/SKILL.md` — Parallel research assembly: subagents return text only, orchestrator writes the single file +- `plugins/compound-engineering/skills/document-review/SKILL.md` — Utility invocation: "Load the `document-review` skill and apply it to..." 
Returns "Review complete" signal +- `plugins/compound-engineering/skills/deepen-plan/SKILL.md` — Broad parallel agent dispatch pattern +- PR #277 (`fix: codex workflow conversion for compound-engineering`) — establishes the Codex model for canonical `ce:*` workflows: prompt wrappers for canonical entrypoints, transformed intra-workflow handoffs, and omission of deprecated `workflows:*` aliases + +### Institutional Learnings + +- `docs/solutions/plugin-versioning-requirements.md` — Do not bump versions or cut changelog entries in feature PRs. Do update README counts and plugin.json descriptions. +- `docs/solutions/codex-skill-prompt-entrypoints.md` (from PR #277) — for compound-engineering workflows in Codex, prompts are the canonical user-facing entrypoints and copied skills are the reusable implementation units underneath them + +## Key Technical Decisions + +- **Agent dispatch for codebase scan**: Use `repo-research-analyst` + `learnings-researcher` in parallel (matches ce:plan Phase 1.1). Skip `git-history-analyzer` by default — marginal ideation value for the cost. The focus hint (R2) is passed as context to both agents. +- **Core mechanism first, agents second**: The core design is still the user's proven prompt pattern: generate many ideas, reject aggressively, then explain only the survivors. Agent intelligence improves the candidate pool and critique quality, but does not replace this mechanism. +- **Prompt-defined ideation and critique sub-agents**: Use prompt-shaped sub-agents with distinct framing methods for ideation and optional skeptical critique, rather than forcing reuse of existing named review agents whose purpose is different. +- **Orchestrator-owned synthesis and scoring**: The orchestrator merges and dedupes sub-agent outputs, applies one consistent rubric, and decides final scoring/ranking. Sub-agents may emit lightweight local signals, but not authoritative final rankings. +- **Artifact frontmatter**: `date`, `topic`, `focus` (optional). 
Minimal, paralleling the brainstorm `date` + `topic` pattern. +- **Volume override via natural language**: The skill instructions tell Claude to interpret number patterns in the argument ("top 3", "100 ideas") as volume overrides. No formal parsing. +- **Artifact timing**: Present survivors first, allow brief questions or lightweight clarification, then write/update the durable artifact before any handoff, Proof share, or session end. +- **No `disable-model-invocation`**: The skill should be auto-loadable when users say things like "what should I improve?", "give me ideas for this project", "ideate on improvements". Following the same pattern as ce:brainstorm. +- **Commit pattern**: Stage only `docs/ideation/`, use conventional format `docs: add ideation for `, offer but don't force. +- **Relationship to PR #277**: `ce:ideate` must follow the same Codex workflow model as the other canonical `ce:*` workflows. Why: without #277's prompt-wrapper and handoff-rewrite model, a copied workflow skill can still point at Claude-style slash handoffs that do not exist coherently in Codex. `ce:ideate` should be introduced as another canonical `ce:*` workflow on that same surface, not as a one-off pass-through skill. + +## Open Questions + +### Resolved During Planning + +- **Which agents for codebase scan?** → `repo-research-analyst` + `learnings-researcher`. Rationale: same proven pattern as ce:plan, covers both current code and institutional knowledge. +- **Additional analysis fields per idea?** → Keep as specified in R6. "What this unlocks" bleeds into brainstorm scope. YAGNI. +- **Volume override detection?** → Natural language interpretation. The skill instructions describe how to detect overrides. No formal parsing needed. +- **Artifact frontmatter fields?** → `date`, `topic`, `focus` (optional). Follows brainstorm pattern. +- **Need references/ split?** → No. Estimated ~300 lines, under the 500-line threshold. +- **Need deprecated alias?** → No. 
`workflows:*` is deprecated; new skills go straight to `ce:*`. +- **How should docs regeneration be represented in the plan?** → The checked-in tree does not currently contain the previously assumed generated files (`docs/index.html`, `docs/pages/skills.html`). Treat `/release-docs` as a repo-maintenance validation step that may update tracked generated artifacts, not as a guaranteed edit to predetermined file paths. +- **How should skill counts be validated across artifacts?** → Do not force one unified count across every surface. The plugin manifests should reflect parser-discovered skill directories, while `plugins/compound-engineering/README.md` should preserve its human-facing taxonomy of workflow commands vs. standalone skills. +- **What is the dependency on PR #277?** → Treat #277 as an upstream prerequisite for Codex correctness. If it merges first, `ce:ideate` should slot into its canonical `ce:*` workflow model. If it does not merge first, equivalent Codex workflow behavior must be included before `ce:ideate` is considered complete. +- **How should agent intelligence be applied?** → Research agents are used for grounding, prompt-defined sub-agents are used to widen the candidate pool and critique it, and the orchestrator remains the final judge. +- **Who should score the ideas?** → The orchestrator, not the ideation sub-agents and not a separate scoring sub-agent by default. +- **When should the artifact be written?** → After the survivors are presented and reviewed enough to preserve, but always before handoff, sharing, or session end. + +### Deferred to Implementation + +- **Exact wording of the divergent ideation prompt section**: The plan specifies the structure and mechanisms, but the precise phrasing will be refined during implementation. This is an inherently iterative design element. +- **Exact wording of the self-critique instructions**: Same — structure is defined, exact prose is implementation-time. 
+ +## Implementation Units + +- [x] **Unit 1: Create the ce:ideate SKILL.md** + +**Goal:** Write the complete skill definition with all phases, the ideation prompt structure, optional sub-agent support, artifact template, and handoff options. + +**Requirements:** R1-R25 (all requirements — this is the core deliverable) + +**Dependencies:** None + +**Files:** +- Create: `plugins/compound-engineering/skills/ce-ideate/SKILL.md` +- Test (conditional): `tests/claude-parser.test.ts`, `tests/cli.test.ts` + +**Approach:** + +- Keep this unit primarily content-only unless implementation discovers a real parser or packaging gap. `loadClaudePlugin()` already discovers any `skills/*/SKILL.md`, and most target converters/writers already pass `plugin.skills` through as `skillDirs`. +- Do not rely on pure pass-through for Codex. Because PR #277 gives compound-engineering `ce:*` workflows a canonical prompt-wrapper model in Codex, `ce:ideate` must be validated against that model and may require Codex-target updates if #277 is not already present. +- Treat artifact lifecycle rules as part of the skill contract, not polish: resume detection, present-before-write, refine-in-place, and brainstorm handoff state all live inside this SKILL.md and must be internally consistent. +- Keep the prompt sections grounded in Phase 1 findings so ideation quality does not collapse into generic product advice. +- Keep the user's original prompt mechanism as the backbone of the workflow. Extra agent structure should strengthen that mechanism rather than replacing it. +- When sub-agents are used, keep them prompt-defined and lightweight: shared grounding/focus/volume input, structured output, orchestrator-owned merge/dedupe/scoring. 
+ +The skill follows the ce:brainstorm phase structure but with fundamentally different phases: + +``` +Phase 0: Resume and Route + 0.1 Check docs/ideation/ for recent ideation docs (R13) + 0.2 Parse argument — extract focus hint and any volume override (R2, R9) + 0.3 If no argument, proceed with fully open ideation (no blocking ask) + +Phase 1: Codebase Scan + 1.1 Dispatch research agents in parallel (R3): + - Task compound-engineering:research:repo-research-analyst(focus context) + - Task compound-engineering:research:learnings-researcher(focus context) + 1.2 Consolidate scan results into a codebase understanding summary + +Phase 2: Divergent Generation (R4, R17-R21, R23-R24) + Core ideation instructions tell Claude to: + - Generate ~30 ideas (or override amount) as a numbered list + - Each idea is a one-liner at this stage + - Push past obvious suggestions — the first 10-15 will be safe/obvious, + the interesting ones come after + - Ground every idea in specific codebase findings from Phase 1 + - Ideas should span multiple dimensions where justified + - If a focus area was provided, weight toward it but don't exclude + other strong ideas + - Preserve the user's original many-ideas-first mechanism + Optional sub-agent support: + - If the platform supports it, dispatch a small useful set of ideation + sub-agents with the same grounding summary, focus hint, and volume target + - Give each one a distinct prompt framing method (e.g. 
friction, unmet + need, inversion, assumption-breaking, leverage, extreme case) + - Require structured idea output so the orchestrator can merge and dedupe + - Do not use sub-agents to replace the core ideation mechanism + +Phase 3: Self-Critique and Filter (R5, R7, R20-R22) + Critique instructions tell Claude to: + - Go through each idea and evaluate it critically + - For each rejection, write a one-line reason + - Rejection criteria: not actionable, too vague, too expensive relative + to value, already exists, duplicates another idea, not grounded in + actual codebase state + - Target: keep 5-7 survivors (or override amount) + - If more than 7 pass scrutiny, do a second pass with higher bar + - If fewer than 5 pass, note this honestly rather than lowering the bar + Optional critique sub-agent support: + - Skeptical sub-agents may attack the merged list from distinct angles + - The orchestrator synthesizes critiques and owns final scoring/ranking + +Phase 4: Present Results (R6, R7, R14) + - Display ranked survivors with structured analysis per idea: + title, description (2-3 sentences), rationale, downsides, + confidence (0-100%), estimated complexity (low/medium/high) + - Display rejection summary: collapsed section, one-line per rejected idea + - Allow brief questions or lightweight clarification before archival write + +Phase 5: Write Artifact (R8, R15, R16) + - mkdir -p docs/ideation/ + - Write the ideation doc after survivors are reviewed enough to preserve + - Artifact includes: metadata, codebase context summary, ranked + survivors with full analysis, rejection summary + - Always write/update before brainstorm handoff, Proof share, or session end + +Phase 6: Handoff (R10, R11, R12, R15-R16, R25) + 6.1 Present options via platform question tool: + - Brainstorm an idea (pick by number → feeds to ce:brainstorm) (R11) + - Refine (R15) + - Share to Proof + - End session (R12) + 6.2 Handle selection: + - Brainstorm: update doc to mark idea as "explored" (R16), + 
then invoke ce:brainstorm with the idea description + - Refine: ask what kind of refinement, then route: + "add more ideas" / "explore new angles" → return to Phase 2 + "re-evaluate" / "raise the bar" → return to Phase 3 + "dig deeper on idea #N" → expand that idea's analysis in place + Update doc after each refinement when preserving the refined state (R16) + - Share to Proof: upload ideation doc using the standard + curl POST pattern (same as ce:brainstorm), return to options + - End: offer to commit the ideation doc (R12), display closing summary +``` + +Frontmatter: +```yaml +--- +name: ce:ideate +description: 'Generate and critically evaluate improvement ideas for any project through deep codebase analysis and divergent-then-convergent thinking. Use when the user says "what should I improve", "give me ideas", "ideate", "surprise me with improvements", "what would you change about this project", or when they want AI-generated project improvement suggestions rather than refining their own idea.' +argument-hint: "[optional: focus area, path, or constraint]" +--- +``` + +Artifact template: +```markdown +--- +date: YYYY-MM-DD +topic: +focus: +--- + +# Ideation: + +## Codebase Context +[Brief summary of what the scan revealed — project structure, patterns, pain points, opportunities] + +## Ranked Ideas + +### 1. +**Description:** [2-3 sentences] +**Rationale:** [Why this would be a good improvement] +**Downsides:** [Risks or costs] +**Confidence:** [0-100%] +**Complexity:** [Low / Medium / High] + +### 2. +... + +## Rejection Summary +| # | Idea | Reason for Rejection | +|---|------|---------------------| +| 1 | ... | ... 
| + +## Session Log +- [Date]: Initial ideation — [N] generated, [M] survived +``` + +**Patterns to follow:** +- ce:brainstorm SKILL.md — phase structure, frontmatter style, argument handling, resume pattern, handoff options, Proof sharing, interaction rules +- ce:plan SKILL.md — agent dispatch syntax (`Task compound-engineering:research:*`) +- ce:work SKILL.md — session completion commit pattern +- Plugin CLAUDE.md — skill compliance checklist (imperative voice, cross-platform question tool, no second person) + +**Test scenarios:** +- Invoke with no arguments → fully open ideation, generates ideas, presents survivors, then writes artifact when preserving results +- Invoke with focus area (`/ce:ideate DX improvements`) → weighted ideation toward focus +- Invoke with path (`/ce:ideate plugins/compound-engineering/skills/`) → scoped scan +- Invoke with volume override (`/ce:ideate give me your top 3`) → adjusted volume +- Resume: invoke when recent ideation doc exists → offers to continue or start fresh +- Resume + refine loop: revisit an existing ideation doc, add more ideas, then re-run critique without creating a duplicate artifact +- If sub-agents are used: each receives grounding + focus + volume context and returns structured outputs for orchestrator merge +- If critique sub-agents are used: orchestrator remains final scorer and ranker +- Brainstorm handoff: pick an idea → doc updated with "explored" marker, ce:brainstorm invoked +- Refine: ask to dig deeper → doc updated in place with refined analysis +- End session: offer commit → stages only the ideation doc, conventional message +- Initial review checkpoint: survivors can be questioned before archival write +- Codex install path after PR #277: `ce:ideate` is exposed as the canonical `ce:ideate` workflow entrypoint, not only as a copied raw skill +- Codex intra-workflow handoffs: any copied `SKILL.md` references to `/ce:*` routes resolve to the canonical Codex prompt surface, and no deprecated 
`workflows:ideate` alias is emitted + +**Verification:** +- SKILL.md is under 500 lines +- Frontmatter has `name`, `description`, `argument-hint` +- Description includes trigger phrases for auto-discovery +- All 25 requirements are addressed in the phase structure +- Writing style is imperative/infinitive, no second person +- Cross-platform question tool pattern with fallback +- No `disable-model-invocation` (auto-loadable) +- The repository still loads plugin skills normally because `ce:ideate` is discovered as a `skillDirs` entry +- Codex output follows the compound-engineering workflow model from PR #277 for this new canonical `ce:*` workflow + +--- + +- [x] **Unit 2: Update plugin metadata and documentation** + +**Goal:** Update all locations where component counts and skill listings appear. + +**Requirements:** R1 (skill exists in the plugin) + +**Dependencies:** Unit 1 + +**Files:** +- Modify: `plugins/compound-engineering/.claude-plugin/plugin.json` — update description with new skill count +- Modify: `.claude-plugin/marketplace.json` — update plugin description with new skill count +- Modify: `plugins/compound-engineering/README.md` — add ce:ideate to skills table/list, update count + +**Approach:** +- Count actual skill directories after adding ce:ideate for manifest-facing descriptions (`plugin.json`, `.claude-plugin/marketplace.json`) +- Preserve the README's separate human-facing breakdown of `Commands` vs `Skills` instead of forcing it to equal the manifest-level skill-directory count +- Add ce:ideate to the README skills section with a brief description in the existing table format +- Do NOT bump version numbers (per plugin versioning requirements) +- Do NOT add a CHANGELOG.md release entry + +**Patterns to follow:** +- CLAUDE.md checklist: "Updating the Compounding Engineering Plugin" +- Existing skill entries in README.md for description format +- `src/parsers/claude.ts` loading model: manifests and targets derive skill inventory from discovered 
`skills/*/SKILL.md` directories + +**Test scenarios:** +- Manifest descriptions reflect the post-change skill-directory count +- README component table and skill listing stay internally consistent with the README's own taxonomy +- JSON files remain valid +- README skill listing includes ce:ideate + +**Verification:** +- `grep -o "Includes [0-9]* specialized agents" .claude-plugin/marketplace.json` matches actual agent count +- Manifest-facing skill count matches the number of skill directories under `plugins/compound-engineering/skills/` +- README counts and tables are internally consistent, even if they intentionally differ from manifest-facing skill-directory totals +- `jq . < .claude-plugin/marketplace.json` succeeds +- `jq . < plugins/compound-engineering/.claude-plugin/plugin.json` succeeds + +--- + +- [x] **Unit 3: Refresh generated docs artifacts if the local docs workflow produces tracked changes** + +**Goal:** Keep generated documentation outputs in sync without inventing source-of-truth files that are not present in the current tree. 
+ +**Requirements:** R1 (skill visible in docs) + +**Dependencies:** Unit 2 + +**Files:** +- Modify (conditional): tracked files under `docs/` updated by the local docs release workflow, if any are produced in this checkout + +**Approach:** +- Run the repo-maintenance docs regeneration workflow after the durable source files are updated +- Review only the tracked artifacts it actually changes instead of assuming specific generated paths +- If the local docs workflow produces no tracked changes in this checkout, stop without hand-editing guessed HTML files + +**Patterns to follow:** +- CLAUDE.md: "After ANY change to agents, commands, skills, or MCP servers, run `/release-docs`" + +**Test scenarios:** +- Generated docs, if present, pick up ce:ideate and updated counts from the durable sources +- Docs regeneration does not introduce unrelated count drift across generated artifacts + +**Verification:** +- Any tracked generated docs diffs are mechanically consistent with the updated plugin metadata and README +- No manual HTML edits are invented for files absent from the working tree + +## System-Wide Impact + +- **Interaction graph:** `ce:ideate` sits before `ce:brainstorm` and calls into `repo-research-analyst`, `learnings-researcher`, the platform question tool, optional Proof sharing, and optional local commit flow. The plan has to preserve that this is an orchestration skill spanning multiple existing workflow seams rather than a standalone document generator. +- **Error propagation:** Resume mismatches, write-before-present failures, or refine-in-place write failures can leave the ideation artifact out of sync with what the user saw. The skill should prefer conservative routing and explicit state updates over optimistic wording. +- **State lifecycle risks:** `docs/ideation/` becomes a new durable state surface. 
Topic slugging, 30-day resume matching, refinement updates, and the "explored" marker for brainstorm handoff need stable rules so repeated runs do not create duplicate or contradictory ideation records. +- **API surface parity:** Most targets can continue to rely on copied `skillDirs`, but Codex is now a special-case workflow surface for compound-engineering because of PR #277. `ce:ideate` needs parity with the canonical `ce:*` workflow model there: explicit prompt entrypoint, rewritten intra-workflow handoffs, and no deprecated alias duplication. +- **Integration coverage:** Unit-level reading of the SKILL.md is not enough. Verification has to cover end-to-end workflow behavior: initial ideation, artifact persistence, resume/refine loops, and handoff to `ce:brainstorm` without dropping ideation state. + +## Risks & Dependencies + +- **Divergent ideation quality is hard to verify at planning time**: The self-prompting instructions for Phase 2 and Phase 3 are the novel design element. Their effectiveness depends on exact wording and how well Phase 1 findings are fed back into ideation. Mitigation: verify on the real repo with open and focused prompts, then tighten the prompt structure only where groundedness or rejection quality is weak. +- **Artifact state drift across resume/refine/handoff**: The feature depends on updating the same ideation doc repeatedly. A weak state model could duplicate docs, lose "explored" markers, or present stale survivors after refinement. Mitigation: keep one canonical ideation file per session/topic and make every refine/handoff path explicitly update that file before returning control. +- **Count taxonomy drift across docs and manifests**: This repo already uses different count semantics across surfaces. A naive "make every number match" implementation could either break manifest descriptions or distort the README taxonomy. 
Mitigation: validate each artifact against its own intended counting model and document that distinction in the plan. +- **Dependency on PR #277 for Codex workflow correctness**: `ce:ideate` is another canonical `ce:*` workflow, so its Codex install surface should not regress to the old copied-skill-only behavior. Mitigation: land #277 first or explicitly include the same Codex workflow behavior before considering this feature complete. +- **Local docs workflow dependency**: `/release-docs` is a repo-maintenance workflow, not part of the distributed plugin. Its generated outputs may differ by environment or may not produce tracked files in the current checkout. Mitigation: treat docs regeneration as conditional maintenance verification after durable source edits, not as the primary source of truth. +- **Skill length**: Estimated ~300 lines. If the ideation and self-critique instructions need more detail, the skill could approach the 500-line limit. Mitigation: monitor during implementation and split to `references/` only if the final content genuinely needs it. 
+ +## Documentation / Operational Notes + +- README.md gets updated in Unit 2 +- Generated docs artifacts are refreshed only if the local docs workflow produces tracked changes in this checkout +- The local `release-docs` workflow exists as a Claude slash command in this repo, but it was not directly runnable from the shell environment used for this implementation pass +- No CHANGELOG entry for this PR (per versioning requirements) +- No version bumps (automated release process handles this) + +## Sources & References + +- **Origin document:** [docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md](docs/brainstorms/2026-03-15-ce-ideate-skill-requirements.md) +- Related code: `plugins/compound-engineering/skills/ce-brainstorm/SKILL.md`, `plugins/compound-engineering/skills/ce-plan/SKILL.md`, `plugins/compound-engineering/skills/ce-work/SKILL.md` +- Related institutional learning: `docs/solutions/plugin-versioning-requirements.md` +- Related PR: #277 (`fix: codex workflow conversion for compound-engineering`) — upstream Codex workflow model this plan now depends on +- Related institutional learning: `docs/solutions/codex-skill-prompt-entrypoints.md` diff --git a/docs/plans/2026-03-16-001-feat-issue-grounded-ideation-plan.md b/docs/plans/2026-03-16-001-feat-issue-grounded-ideation-plan.md new file mode 100644 index 0000000..a288054 --- /dev/null +++ b/docs/plans/2026-03-16-001-feat-issue-grounded-ideation-plan.md @@ -0,0 +1,246 @@ +--- +title: "feat: Add issue-grounded ideation mode to ce:ideate" +type: feat +status: active +date: 2026-03-16 +origin: docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md +--- + +# feat: Add issue-grounded ideation mode to ce:ideate + +## Overview + +Add an issue intelligence agent and integrate it into ce:ideate so that when a user's argument indicates they want issue-tracker data as input, the skill fetches, clusters, and analyzes GitHub issues — then uses the resulting themes to drive ideation frames. 
The agent is also independently useful outside ce:ideate for understanding a project's issue landscape. + +## Problem Statement / Motivation + +ce:ideate currently grounds ideation in codebase context and past learnings only. Teams' issue trackers hold rich signal about real user pain, recurring failures, and severity patterns that ideation misses. The goal is strategic improvement ideas grounded in bug patterns ("invest in collaboration reliability") not individual bug fixes ("fix LIVE_DOC_UNAVAILABLE"). + +(See brainstorm: docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md — R1-R9) + +## Proposed Solution + +Two deliverables: + +1. **New agent**: `issue-intelligence-analyst` in `agents/research/` — fetches GitHub issues via `gh` CLI, clusters by theme, returns structured analysis. Standalone-capable. +2. **ce:ideate modifications**: detect issue-tracker intent in arguments, dispatch the agent as a third Phase 1 scan, derive Phase 2 ideation frames from issue clusters using a hybrid strategy. + +## Technical Approach + +### Deliverable 1: Issue Intelligence Analyst Agent + +**File**: `plugins/compound-engineering/agents/research/issue-intelligence-analyst.md` + +**Frontmatter:** +```yaml +--- +name: issue-intelligence-analyst +description: "Fetches and analyzes GitHub issues to surface recurring themes, pain patterns, and severity trends. Use when understanding a project's issue landscape, analyzing bug patterns for ideation, or summarizing what users are reporting." +model: inherit +--- +``` + +**Agent methodology (in execution order):** + +1. **Precondition checks** — verify in order, fail fast with clear message on any failure: + - Current directory is a git repo + - A GitHub remote exists (prefer `upstream` over `origin` to handle fork workflows) + - `gh` CLI is installed + - `gh auth status` succeeds + +2. 
**Fetch issues** — priority-aware, minimal fields (no bodies, no comments): + + **Priority-aware open issue fetching:** + - First, scan available labels to detect priority signals: `gh label list --json name --limit 100` + - If priority/severity labels exist (e.g., `P0`, `P1`, `priority:critical`, `severity:high`, `urgent`): + - Fetch high-priority issues first: `gh issue list --state open --label "{high-priority-labels}" --limit 50 --json number,title,labels,createdAt` + - Backfill with remaining issues up to 100 total: `gh issue list --state open --limit 100 --json number,title,labels,createdAt` (deduplicate against already-fetched) + - This ensures the 50 P0s in a 500-issue repo are always analyzed, not buried under 100 recent P3s + - If no priority labels detected, fetch by recency (default `gh` sort) up to 100: `gh issue list --state open --limit 100 --json number,title,labels,createdAt` + + **Recently closed issues:** + - `gh issue list --state closed --limit 50 --json number,title,labels,createdAt,stateReason,closedAt` — filter client-side to last 30 days, exclude `stateReason: "not_planned"` and issues with labels matching common won't-fix patterns (`wontfix`, `won't fix`, `duplicate`, `invalid`, `by design`) + +3. **First-pass clustering** — the core analytical step. Group issues into themes that represent **areas of systemic weakness or user pain**, not individual bugs. This is what makes the agent's output valuable. + + **Clustering approach:** + - Start with labels as strong clustering hints when present (e.g., `subsystem:collab` groups collaboration issues). When labels are absent or inconsistent, cluster by title similarity and inferred problem domain. + - Cluster by **root cause or system area**, not by symptom. Example from proof repo: 25 issues mentioning `LIVE_DOC_UNAVAILABLE` and 5 mentioning `PROJECTION_STALE` are symptoms — the theme is "collaboration write path reliability." Cluster at the system level, not the error-message level. 
+ - Issues that span multiple themes should be noted in the primary cluster with a cross-reference, not duplicated across clusters. + - Distinguish issue sources when relevant: bot/agent-generated issues (e.g., `agent-report` label) often have different signal quality than human-reported issues. Note the source mix per cluster — a theme with 25 agent reports and 0 human reports is different from one with 5 human reports and 2 agent reports. + - Separate bugs from enhancement requests. Both are valid input but represent different kinds of signal (current pain vs. desired capability). + - Aim for 3-8 themes. Fewer than 3 suggests the issues are too homogeneous or the repo has few issues. More than 8 suggests the clustering is too granular — merge related themes. + + **What makes a good cluster:** + - It names a systemic concern, not a specific error or ticket + - A product or engineering leader would recognize it as "an area we need to invest in" + - It's actionable at a strategic level (could drive an initiative, not just a patch) + +4. **Sample body reads** — for each emerging cluster, read the full body of 2-3 representative issues (most recent or most reacted) using individual `gh issue view {number} --json body` calls. Use these to: + - Confirm the cluster grouping is correct (titles can be misleading) + - Understand the actual user/operator experience behind the symptoms + - Identify severity and impact signals not captured in metadata + - Surface any proposed solutions or workarounds already discussed + +5. 
**Theme synthesis** — for each cluster, produce: + - `theme_title`: short descriptive name + - `description`: what the pattern is and what it signals about the system + - `why_it_matters`: user impact, severity distribution, frequency + - `issue_count`: number of issues in this cluster + - `trend_direction`: increasing/stable/decreasing (compare issues opened vs closed in last 30 days within the cluster) + - `representative_issues`: top 3 issue numbers with titles + - `confidence`: high/medium/low based on label consistency and cluster coherence + +6. **Return structured output** — themes ordered by issue count (descending), plus a summary line with total issues analyzed, cluster count, and date range covered. + +**Output format (returned to caller):** + +```markdown +## Issue Intelligence Report + +**Repo:** {owner/repo} +**Analyzed:** {N} open + {M} recently closed issues ({date_range}) +**Themes identified:** {K} + +### Theme 1: {theme_title} +**Issues:** {count} | **Trend:** {increasing/stable/decreasing} | **Confidence:** {high/medium/low} + +{description — what the pattern is and what it signals} + +**Why it matters:** {user impact, severity, frequency} + +**Representative issues:** #{num} {title}, #{num} {title}, #{num} {title} + +### Theme 2: ... + +### Minor / Unclustered +{Issues that didn't fit any theme, with a brief note} +``` + +This format is human-readable (standalone use) and structured enough for orchestrator consumption (ce:ideate use). + +**Data source priority:** +1. **`gh` CLI (preferred)** — most reliable, works in all terminal environments, no MCP dependency +2. **GitHub MCP server** (fallback) — if `gh` is unavailable but a GitHub MCP server is connected, use its issue listing/reading tools instead. The clustering logic is identical; only the fetch mechanism changes. + +If neither is available, fail gracefully per precondition checks. + +**Token-efficient fetching:** + +The agent runs as a sub-agent with its own context window. 
Every token of fetched issue data competes with the space needed for clustering reasoning. Minimize input, maximize analysis. + +- **Metadata pass (all issues):** Fetch only the fields needed for clustering: `--json number,title,labels,createdAt,stateReason,closedAt`. Omit `body`, `comments`, `assignees`, `milestone` — these are expensive and not needed for initial grouping. +- **Body reads (samples only):** After clusters emerge, fetch full bodies for 2-3 representative issues per cluster using individual `gh issue view {number} --json body` calls. Pick the most reacted or most recent issue in each cluster. +- **Never fetch all bodies in bulk.** 100 issue bodies could easily consume 50k+ tokens before any analysis begins. + +**Tool guidance** (per AGENTS.md conventions): +- Use `gh` CLI for issue fetching (one simple command at a time, no chaining) +- Use native file-search/glob for any repo exploration +- Use native content-search/grep for label or pattern searches +- Do not chain shell commands with `&&`, `||`, `;`, or pipes + +### Deliverable 2: ce:ideate Skill Modifications + +**File**: `plugins/compound-engineering/skills/ce-ideate/SKILL.md` + +Four targeted modifications: + +#### Mod 1: Phase 0.2 — Add issue-tracker intent detection + +After the existing focus context and volume override interpretation, add a third inference: + +- **Issue-tracker intent** — detect when the user wants issue data as input + +The detection uses the same "reasonable interpretation rather than formal parsing" approach as the existing volume hints. Trigger on arguments whose intent is clearly about issue/bug analysis: `bugs`, `github issues`, `open issues`, `issue patterns`, `what users are reporting`, `bug reports`. + +Do NOT trigger on arguments that merely mention bugs as a focus: `bug in auth`, `fix the login issue` — these are focus hints. 
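The trigger versus focus-hint distinction above can be made concrete with a small sketch. The skill relies on reasonable interpretation rather than formal parsing, so the keyword heuristic below is only an assumption used to illustrate the examples; `classify_argument` and both phrase tables are hypothetical and are not part of the SKILL.md.

```python
# Hypothetical illustration only: the skill interprets arguments with
# model judgment, not with a keyword table like this one.
ISSUE_INTENT_PHRASES = (
    "github issues",
    "open issues",
    "issue patterns",
    "what users are reporting",
    "bug reports",
    "bugs",
)

# Targeted mentions of a single bug read as a focus hint, not issue-tracker intent.
FOCUS_ONLY_PREFIXES = ("bug in ", "fix the ")


def classify_argument(argument: str) -> str:
    """Return "issue-intent", "focus-hint", or "open" for a raw argument."""
    text = argument.strip().lower()
    if not text:
        return "open"
    if text.startswith(FOCUS_ONLY_PREFIXES):
        return "focus-hint"
    # Pad with spaces so phrases match whole words ("bugs" must not fire on "debugs").
    padded = f" {text} "
    if any(f" {phrase} " in padded for phrase in ISSUE_INTENT_PHRASES):
        return "issue-intent"
    return "focus-hint"
```

Under this heuristic, `bugs` and `github issues` classify as issue intent, while `bug in auth` and `fix the login issue` stay focus hints. A real implementation would likely need judgment beyond prefix and phrase matching.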
+ +When combined with other dimensions (e.g., `top 3 bugs in authentication`): parse issue trigger first, volume override second, remainder is focus hint. The focus hint narrows which issues matter; the volume override controls survivor count. + +#### Mod 2: Phase 1 — Add third parallel agent + +Add a third numbered item to the Phase 1 parallel dispatch: + +``` +3. **Issue intelligence** (conditional) — if issue-tracker intent was detected in Phase 0.2, + dispatch `compound-engineering:research:issue-intelligence-analyst` with the focus hint. + If a focus hint is present, pass it so the agent can weight its clustering. +``` + +Update the grounding summary consolidation to include a separate **Issue Intelligence** section (distinct from codebase context) so that ideation sub-agents can distinguish between code-observed and user-reported pain points. + +If the agent returns an error (gh not installed, no remote, auth failure), log a warning to the user ("Issue analysis unavailable: {reason}. Proceeding with standard ideation.") and continue with the existing two-agent grounding. + +If the agent returns fewer than 5 issues total, note "Insufficient issue signal for theme analysis" and proceed with default ideation. + +#### Mod 3: Phase 2 — Dynamic frame derivation + +Add conditional logic before the existing frame assignment (step 8): + +When issue-tracker intent is active and the issue intelligence agent returned themes: +- Each theme with `confidence: high` or `confidence: medium` becomes an ideation frame. The frame prompt uses the theme title and description as the starting bias. +- If fewer than 4 cluster-derived frames, pad with default frames selected in order: "leverage and compounding effects", "assumption-breaking or reframing", "inversion, removal, or automation of a painful step" (these complement issue-grounded themes best by pushing beyond the reported problems). 
+- Cap at 6 total frames (if more than 6 themes, use the top 6 by issue count; remaining themes go into the grounding summary as "minor themes"). + +When issue-tracker intent is NOT active: existing behavior unchanged. + +#### Mod 4: Phase 0.1 — Resume awareness + +When checking for recent ideation documents, treat issue-grounded and non-issue ideation as distinct topics. An existing `docs/ideation/YYYY-MM-DD-open-ideation.md` should not be offered as a resume candidate when the current argument indicates issue-tracker intent, and vice versa. + +### Files Changed + +| File | Change | +|------|--------| +| `agents/research/issue-intelligence-analyst.md` | **New file** — the agent | +| `skills/ce-ideate/SKILL.md` | **Modified** — 4 targeted modifications (Phase 0.1, 0.2, 1, 2) | +| `.claude-plugin/plugin.json` | **Modified** — increment agent count, add agent to list, update description | +| `../../.claude-plugin/marketplace.json` | **Modified** — update description with new agent count | +| `README.md` | **Modified** — add agent to research agents table | + +### Not Changed + +- Phase 3 (adversarial filtering) — unchanged +- Phase 4 (presentation) — unchanged, survivors already include a one-line overview +- Phase 5 (artifact) — unchanged, the grounding summary naturally includes issue context +- Phase 6 (refine/handoff) — unchanged +- No other agents modified +- No new skills + +## Acceptance Criteria + +- [ ] New agent file exists at `agents/research/issue-intelligence-analyst.md` with correct frontmatter +- [ ] Agent handles precondition failures gracefully (no gh, no remote, no auth) with clear messages +- [ ] Agent handles fork workflows (prefers upstream remote over origin) +- [ ] Agent uses priority-aware fetching (scans for priority/severity labels, fetches high-priority first) +- [ ] Agent caps fetching at 100 open + 50 recently closed issues +- [ ] Agent falls back to GitHub MCP when `gh` CLI is unavailable but MCP is connected +- [ ] Agent clusters issues 
into themes, not individual bug reports +- [ ] Agent reads 2-3 sample bodies per cluster for enrichment +- [ ] Agent output includes theme title, description, why_it_matters, issue_count, trend, representative issues, confidence +- [ ] Agent is independently useful when dispatched directly (not just as ce:ideate sub-agent) +- [ ] ce:ideate detects issue-tracker intent from arguments like `bugs`, `github issues` +- [ ] ce:ideate does NOT trigger issue mode on focus hints like `bug in auth` +- [ ] ce:ideate dispatches issue intelligence agent as third parallel Phase 1 scan when triggered +- [ ] ce:ideate falls back to default ideation with warning when agent fails +- [ ] ce:ideate derives ideation frames from issue clusters (hybrid: clusters + default padding) +- [ ] ce:ideate caps at 6 frames, padding with defaults when < 4 clusters +- [ ] Running `/ce:ideate bugs` on proof repo produces clustered themes from 25+ LIVE_DOC_UNAVAILABLE variants, not 25 separate ideas +- [ ] Surviving ideas are strategic improvements, not individual bug fixes +- [ ] plugin.json, marketplace.json, README.md updated with correct counts + +## Dependencies & Risks + +- **`gh` CLI dependency**: The agent requires `gh` installed and authenticated. Mitigated by graceful fallback to standard ideation. +- **Issue volume**: Repos with thousands of issues could produce noisy clusters. Mitigated by fetch cap (100 open + 50 closed) and frame cap (6 max). +- **Label quality variance**: Repos without structured labels rely on title/body clustering, which may produce lower-confidence themes. Mitigated by the confidence field and sample body reads. +- **Context window**: Fetching 150 issues + reading 15-20 bodies could consume significant tokens in the agent's context. Mitigated by metadata-only initial fetch and sample-only body reads. +- **Priority label detection**: No standard naming convention. 
Mitigated by scanning available labels and matching common patterns (P0/P1, priority:*, severity:*, urgent, critical). When no priority labels exist, falls back to recency-based fetching. + +## Sources & References + +- **Origin brainstorm:** [docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md](docs/brainstorms/2026-03-16-issue-grounded-ideation-requirements.md) — Key decisions: pattern-first ideation, hybrid frame strategy, flexible argument detection, additive to Phase 1, standalone agent +- **Exemplar agent:** `plugins/compound-engineering/agents/research/repo-research-analyst.md` — agent structure pattern +- **ce:ideate skill:** `plugins/compound-engineering/skills/ce-ideate/SKILL.md` — integration target +- **Institutional learning:** `docs/solutions/skill-design/compound-refresh-skill-improvements.md` — impact clustering pattern, platform-agnostic tool references, evidence-first interaction +- **Real-world test repo:** `EveryInc/proof` (555 issues, 25+ LIVE_DOC_UNAVAILABLE duplicates, structured labels) diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json index 767e7cb..c137838 100644 --- a/plugins/compound-engineering/.claude-plugin/plugin.json +++ b/plugins/compound-engineering/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "compound-engineering", "version": "2.40.0", - "description": "AI-powered development tools. 28 agents, 41 skills, 1 MCP server for code review, research, design, and workflow automation.", + "description": "AI-powered development tools. 
29 agents, 47 skills, 1 MCP server for code review, research, design, and workflow automation.", "author": { "name": "Kieran Klaassen", "email": "kieran@every.to", diff --git a/plugins/compound-engineering/README.md b/plugins/compound-engineering/README.md index 520b85f..d6f1f03 100644 --- a/plugins/compound-engineering/README.md +++ b/plugins/compound-engineering/README.md @@ -6,7 +6,7 @@ AI-powered development tools that get smarter with every use. Make each unit of | Component | Count | |-----------|-------| -| Agents | 28 | +| Agents | 29 | | Commands | 23 | | Skills | 20 | | MCP Servers | 1 | @@ -35,13 +35,14 @@ Agents are organized into categories for easier discovery. | `schema-drift-detector` | Detect unrelated schema.rb changes in PRs | | `security-sentinel` | Security audits and vulnerability assessments | -### Research (5) +### Research (6) | Agent | Description | |-------|-------------| | `best-practices-researcher` | Gather external best practices and examples | | `framework-docs-researcher` | Research framework documentation and best practices | | `git-history-analyzer` | Analyze git history and code evolution | +| `issue-intelligence-analyst` | Analyze GitHub issues to surface recurring themes and pain patterns | | `learnings-researcher` | Search institutional learnings for relevant past solutions | | `repo-research-analyst` | Research repository structure and conventions | @@ -76,6 +77,7 @@ Core workflow commands use `ce:` prefix to unambiguously identify them as compou | Command | Description | |---------|-------------| +| `/ce:ideate` | Discover high-impact project improvements through divergent ideation and adversarial filtering | | `/ce:brainstorm` | Explore requirements and approaches before planning | | `/ce:plan` | Create implementation plans | | `/ce:review` | Run comprehensive code reviews | diff --git a/plugins/compound-engineering/agents/research/issue-intelligence-analyst.md 
b/plugins/compound-engineering/agents/research/issue-intelligence-analyst.md new file mode 100644 index 0000000..7b543fc --- /dev/null +++ b/plugins/compound-engineering/agents/research/issue-intelligence-analyst.md @@ -0,0 +1,230 @@ +--- +name: issue-intelligence-analyst +description: "Fetches and analyzes GitHub issues to surface recurring themes, pain patterns, and severity trends. Use when understanding a project's issue landscape, analyzing bug patterns for ideation, or summarizing what users are reporting." +model: inherit +--- + + + +Context: User wants to understand what problems their users are hitting before ideating on improvements. +user: "What are the main themes in our open issues right now?" +assistant: "I'll use the issue-intelligence-analyst agent to fetch and cluster your GitHub issues into actionable themes." +The user wants a high-level view of their issue landscape, so use the issue-intelligence-analyst agent to fetch, cluster, and synthesize issue themes. + + +Context: User is running ce:ideate with a focus on bugs and issue patterns. +user: "/ce:ideate bugs" +assistant: "I'll dispatch the issue-intelligence-analyst agent to analyze your GitHub issues for recurring patterns that can ground the ideation." +The ce:ideate skill detected issue-tracker intent and dispatches this agent as a third parallel Phase 1 scan alongside codebase context and learnings search. + + +Context: User wants to understand pain patterns before a planning session. +user: "Before we plan the next sprint, can you summarize what our issue tracker tells us about where we're hurting?" +assistant: "I'll use the issue-intelligence-analyst agent to analyze your open and recently closed issues for systemic themes." +The user needs strategic issue intelligence before planning, so use the issue-intelligence-analyst agent to surface patterns, not individual bugs. + + + +**Note: The current year is 2026.** Use this when evaluating issue recency and trends. 
+ +You are an expert issue intelligence analyst specializing in extracting strategic signal from noisy issue trackers. Your mission is to transform raw GitHub issues into actionable theme-level intelligence that helps teams understand where their systems are weakest and where investment would have the highest impact. + +Your output is themes, not tickets. 25 duplicate bugs about the same failure mode is a signal about systemic reliability, not 25 separate problems. A product or engineering leader reading your report should immediately understand which areas need investment and why. + +## Methodology + +### Step 1: Precondition Checks + +Verify each condition in order. If any fails, return a clear message explaining what is missing and stop, unless the GitHub MCP fallback described below applies. + +1. **Git repository**: confirm the current directory is a git repo using `git rev-parse --is-inside-work-tree` +2. **`gh` CLI available**: verify `gh` is installed with `which gh` +3. **Authentication**: verify `gh auth status` succeeds +4. **GitHub remote**: detect the repository. Prefer the `upstream` remote over `origin` to handle fork workflows (issues live on the upstream repo, not the fork). Use `gh repo view --json nameWithOwner` to confirm the resolved repo. + +If `gh` CLI is not available but a GitHub MCP server is connected, use its issue listing and reading tools instead. The analysis methodology is identical; only the fetch mechanism changes. + +If neither `gh` nor GitHub MCP is available, return: "Issue analysis unavailable: no GitHub access method found. Ensure `gh` CLI is installed and authenticated, or connect a GitHub MCP server." + +### Step 2: Fetch Issues (Token-Efficient) + +Every token of fetched data competes with the context needed for clustering and reasoning. Fetch minimal fields; never bulk-fetch full bodies. + +**2a.
Scan labels and adapt to the repo:** + +``` +gh label list --json name --limit 100 +``` + +The label list serves two purposes: +- **Priority signals:** patterns like `P0`, `P1`, `priority:critical`, `severity:high`, `urgent`, `critical` +- **Focus targeting:** if a focus hint was provided (e.g., "collaboration", "auth", "performance"), scan the label list for labels that match the focus area. Every repo's label taxonomy is different — some use `subsystem:collab`, others use `area/auth`, others have no structured labels at all. Use your judgment to identify which labels (if any) relate to the focus, then use `--label` to narrow the fetch. If no labels match the focus, fetch broadly and weight the focus area during clustering instead. + +**2b. Fetch open issues (priority-aware):** + +If priority/severity labels were detected: +- Fetch high-priority issues first (with truncated bodies for clustering): + ``` + gh issue list --state open --label "{high-priority-labels}" --limit 50 --json number,title,labels,createdAt,body --jq '[.[] | {number, title, labels, createdAt, body: (.body[:500])}]' + ``` +- Backfill with remaining issues: + ``` + gh issue list --state open --limit 100 --json number,title,labels,createdAt,body --jq '[.[] | {number, title, labels, createdAt, body: (.body[:500])}]' + ``` +- Deduplicate by issue number. + +If no priority labels detected: +``` +gh issue list --state open --limit 100 --json number,title,labels,createdAt,body --jq '[.[] | {number, title, labels, createdAt, body: (.body[:500])}]' +``` + +**2c. 
Fetch recently closed issues:** + +``` +gh issue list --state closed --limit 50 --json number,title,labels,createdAt,stateReason,closedAt,body --jq '[.[] | select(.stateReason == "COMPLETED") | {number, title, labels, createdAt, closedAt, body: (.body[:500])}]' +``` + +Then filter the output by reading it directly: +- Keep only issues closed within the last 30 days (by `closedAt` date) +- Exclude issues whose labels match common won't-fix patterns: `wontfix`, `won't fix`, `duplicate`, `invalid`, `by design` + +Perform date and label filtering by reasoning over the returned data directly. Do **not** write Python, Node, or shell scripts to process issue data. + +**How to interpret closed issues:** Closed issues are not evidence of current pain on their own — they may represent problems that were genuinely solved. Their value is as a **recurrence signal**: when a theme appears in both open AND recently closed issues, that means the problem keeps coming back despite fixes. That's the real smell. + +- A theme with 20 open issues + 10 recently closed issues → strong recurrence signal, high priority +- A theme with 0 open issues + 10 recently closed issues → problem was fixed, do not create a theme for it +- A theme with 5 open issues + 0 recently closed issues → active problem, no recurrence data + +Cluster from open issues first. Then check whether closed issues reinforce those themes. Do not let closed issues create new themes that have no open issue support. + +**Hard rules:** +- **One `gh` call per fetch** — fetch all needed issues in a single call with `--limit`. Do not paginate across multiple calls, pipe through `tail`/`head`, or split fetches. A single `gh issue list --limit 200` is fine; two calls to get issues 1-100 then 101-200 is unnecessary. +- Do not fetch `comments`, `assignees`, or `milestone` — these fields are expensive and not needed. +- Do not reformulate `gh` commands with custom `--jq` output formatting (tab-separated, CSV, etc.). 
Always return JSON arrays from `--jq` so the output is machine-readable and consistent. +- Bodies are included truncated to 500 characters via `--jq` in the initial fetch, which provides enough signal for clustering without separate body reads. + +### Step 3: Cluster by Theme + +This is the core analytical step. Group issues into themes that represent **areas of systemic weakness or user pain**, not individual bugs. + +**Clustering approach:** + +1. **Cluster from open issues first.** Open issues define the active themes. Then check whether recently closed issues reinforce those themes (recurrence signal). Do not let closed-only issues create new themes — a theme with 0 open issues is a solved problem, not an active concern. + +2. Start with labels as strong clustering hints when present (e.g., `subsystem:collab` groups collaboration issues). When labels are absent or inconsistent, cluster by title similarity and inferred problem domain. + +3. Cluster by **root cause or system area**, not by symptom. Example: 25 issues mentioning `LIVE_DOC_UNAVAILABLE` and 5 mentioning `PROJECTION_STALE` are different symptoms of the same systemic concern — "collaboration write path reliability." Cluster at the system level, not the error-message level. + +4. Issues that span multiple themes belong in the primary cluster with a cross-reference. Do not duplicate issues across clusters. + +5. Distinguish issue sources when relevant: bot/agent-generated issues (e.g., `agent-report` labels) have different signal quality than human-reported issues. Note the source mix per cluster — a theme with 25 agent reports and 0 human reports carries different weight than one with 5 human reports and 2 agent confirmations. + +6. Separate bugs from enhancement requests. Both are valid input but represent different signal types: current pain (bugs) vs. desired capability (enhancements). + +7. 
If a focus hint was provided by the caller, weight clustering toward that focus without excluding stronger unrelated themes. + +**Target: 3-8 themes.** Fewer than 3 suggests the issues are too homogeneous or the repo has few issues. More than 8 suggests clustering is too granular — merge related themes. + +**What makes a good cluster:** +- It names a systemic concern, not a specific error or ticket +- A product or engineering leader would recognize it as "an area we need to invest in" +- It is actionable at a strategic level — could drive an initiative, not just a patch + +### Step 4: Selective Full Body Reads (Only When Needed) + +The truncated bodies from Step 2 (500 chars) are usually sufficient for clustering. Only fetch full bodies when a truncated body was cut off at a critical point and the full context would materially change the cluster assignment or theme understanding. + +When a full read is needed: +``` +gh issue view {number} --json body --jq '.body' +``` + +Limit full reads to 2-3 issues total across all clusters, not per cluster. Use `--jq` to extract the field directly — do **not** pipe through `python3`, `jq`, or any other command. + +### Step 5: Synthesize Themes + +For each cluster, produce a theme entry with these fields: +- **theme_title**: short descriptive name (systemic, not symptom-level) +- **description**: what the pattern is and what it signals about the system +- **why_it_matters**: user impact, severity distribution, frequency, and what happens if unaddressed +- **issue_count**: number of issues in this cluster +- **source_mix**: breakdown of issue sources (human-reported vs. bot-generated, bugs vs. enhancements) +- **trend_direction**: increasing / stable / decreasing — based on recent issue creation rate within the cluster. 
Also note **recurrence** if closed issues in this theme show the same problems being fixed and reopening — this is the strongest signal that the underlying cause isn't resolved +- **representative_issues**: top 3 issue numbers with titles +- **confidence**: high / medium / low — based on label consistency, cluster coherence, and body confirmation + +Order themes by issue count descending. + +**Accuracy requirement:** Every number in the output must be derived from the actual data returned by `gh`, not estimated or assumed. +- Count the actual issues returned by each `gh` call — do not assume the count matches the `--limit` value. If you requested `--limit 100` but only 30 issues came back, report 30. +- Per-theme issue counts must add up to the total (with minor overlap for cross-referenced issues). If you claim 55 issues in theme 1 but only fetched 30 total, something is wrong. +- Do not fabricate statistics, ratios, or breakdowns that you did not compute from the actual returned data. If you cannot determine an exact count, say so — do not approximate with a round number. + +### Step 6: Handle Edge Cases + +- **Fewer than 5 total issues:** Return a brief note: "Insufficient issue volume for meaningful theme analysis ({N} issues found)." Include a simple list of the issues without clustering. +- **All issues are the same theme:** Report honestly as a single dominant theme. Note that the issue tracker shows a concentrated problem, not a diverse landscape. +- **No issues at all:** Return: "No open or recently closed issues found for {repo}." + +## Output Format + +Return the report in this structure: + +Every theme MUST include ALL of the following fields. Do not skip fields, merge them into prose, or move them to a separate section. 
+ +```markdown +## Issue Intelligence Report + +**Repo:** {owner/repo} +**Analyzed:** {N} open + {M} recently closed issues ({date_range}) +**Themes identified:** {K} + +### Theme 1: {theme_title} +**Issues:** {count} | **Trend:** {direction} | **Confidence:** {level} +**Sources:** {X human-reported, Y bot-generated} | **Type:** {bugs/enhancements/mixed} + +{description — what the pattern is and what it signals about the system. Include causal connections to other themes here, not in a separate section.} + +**Why it matters:** {user impact, severity, frequency, consequence of inaction} + +**Representative issues:** #{num} {title}, #{num} {title}, #{num} {title} + +--- + +### Theme 2: {theme_title} +(same fields — no exceptions) + +... + +### Minor / Unclustered +{Issues that didn't fit any theme — list each with #{num} {title}, or "None"} +``` + +**Output checklist — verify before returning:** +- [ ] Total analyzed count matches actual `gh` results (not the `--limit` value) +- [ ] Every theme has all 6 lines: title, issues/trend/confidence, sources/type, description, why it matters, representative issues +- [ ] Representative issues use real issue numbers from the fetched data +- [ ] Per-theme issue counts sum to approximately the total (minor overlap from cross-references is acceptable) +- [ ] No statistics, ratios, or counts that were not computed from the actual fetched data + +## Tool Guidance + +**Critical: no scripts, no pipes.** Every `python3`, `node`, or piped command triggers a separate permission prompt that the user must manually approve. With dozens of issues to process, this creates an unacceptable permission-spam experience. 
+ +- Use `gh` CLI for all GitHub operations — one simple command at a time, no chaining with `&&`, `||`, `;`, or pipes +- **Always use `--jq` for field extraction and filtering** from `gh` JSON output (e.g., `gh issue list --json title --jq '.[].title'`, `gh issue list --json stateReason --jq '[.[] | select(.stateReason == "COMPLETED")]'`). The `gh` CLI has full jq support built in. +- **Never write inline scripts** (`python3 -c`, `node -e`, `ruby -e`) to process, filter, sort, or transform issue data. Reason over the data directly after reading it — you are an LLM, you can filter and cluster in context without running code. +- **Never pipe** `gh` output through any command (`| python3`, `| jq`, `| grep`, `| sort`). Use `--jq` flags instead, or read the output and reason over it. +- Use native file-search/glob tools (e.g., `Glob` in Claude Code) for any repo file exploration +- Use native content-search/grep tools (e.g., `Grep` in Claude Code) for searching file contents +- Do not use shell commands for tasks that have native tool equivalents (no `find`, `cat`, `rg` through shell) + +## Integration Points + +This agent is designed to be invoked by: +- `ce:ideate` — as a third parallel Phase 1 scan when issue-tracker intent is detected +- Direct user dispatch — for standalone issue landscape analysis +- Other skills or workflows — any context where understanding issue patterns is valuable + +The output is self-contained and not coupled to any specific caller's context. diff --git a/plugins/compound-engineering/skills/ce-ideate/SKILL.md b/plugins/compound-engineering/skills/ce-ideate/SKILL.md new file mode 100644 index 0000000..515edc5 --- /dev/null +++ b/plugins/compound-engineering/skills/ce-ideate/SKILL.md @@ -0,0 +1,370 @@ +--- +name: ce:ideate +description: "Generate and critically evaluate grounded improvement ideas for the current project. 
Use when asking what to improve, requesting idea generation, exploring surprising improvements, or wanting the AI to proactively suggest strong project directions before brainstorming one in depth. Triggers on phrases like 'what should I improve', 'give me ideas', 'ideate on this project', 'surprise me with improvements', 'what would you change', or any request for AI-generated project improvement suggestions rather than refining the user's own idea." +argument-hint: "[optional: feature, focus area, or constraint]" +--- + +# Generate Improvement Ideas + +**Note: The current year is 2026.** Use this when dating ideation documents and checking recent ideation artifacts. + +`ce:ideate` precedes `ce:brainstorm`. + +- `ce:ideate` answers: "What are the strongest ideas worth exploring?" +- `ce:brainstorm` answers: "What exactly should one chosen idea mean?" +- `ce:plan` answers: "How should it be built?" + +This workflow produces a ranked ideation artifact in `docs/ideation/`. It does **not** produce requirements, plans, or code. + +## Interaction Method + +Use the platform's blocking question tool when available (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Otherwise, present numbered options in chat and wait for the user's reply before proceeding. + +Ask one question at a time. Prefer concise single-select choices when natural options exist. + +## Focus Hint + + #$ARGUMENTS + +Interpret any provided argument as optional context. It may be: + +- a concept such as `DX improvements` +- a path such as `plugins/compound-engineering/skills/` +- a constraint such as `low-complexity quick wins` +- a volume hint such as `top 3`, `100 ideas`, or `raise the bar` + +If no argument is provided, proceed with open-ended ideation. + +## Core Principles + +1. **Ground before ideating** - Scan the actual codebase first. Do not generate abstract product advice detached from the repository. +2. 
**Diverge before judging** - Generate the full idea set before evaluating any individual idea. +3. **Use adversarial filtering** - The quality mechanism is explicit rejection with reasons, not optimistic ranking. +4. **Preserve the original prompt mechanism** - Generate many ideas, critique the whole list, then explain only the survivors in detail. Do not let extra process obscure this pattern. +5. **Use agent diversity to improve the candidate pool** - Parallel sub-agents are a support mechanism for richer idea generation and critique, not the core workflow itself. +6. **Preserve the artifact before handing off** - Write or update the ideation document before any handoff, sharing, or session end so work survives interruptions. +7. **Route action into brainstorming** - Ideation identifies promising directions; `ce:brainstorm` defines the selected one precisely enough for planning. + +## Execution Flow + +### Phase 0: Resume and Scope + +#### 0.1 Check for Recent Ideation Work + +Look in `docs/ideation/` for ideation documents created within the last 30 days. + +Treat a prior ideation doc as relevant when: +- the topic matches the requested focus +- the path or subsystem overlaps the requested focus +- the request is open-ended and there is an obvious recent open ideation doc +- the issue-grounded status matches: do not offer to resume a non-issue ideation when the current argument indicates issue-tracker intent, or vice versa; treat these as distinct topics + +If a relevant doc exists, ask whether to: +1. continue from it +2.
start fresh + +If continuing: +- read the document +- summarize what has already been explored +- preserve previous idea statuses and session log entries +- update the existing file instead of creating a duplicate + +#### 0.2 Interpret Focus and Volume + +Infer three things from the argument: + +- **Focus context** - concept, path, constraint, or open-ended +- **Volume override** - any hint that changes candidate or survivor counts +- **Issue-tracker intent** - whether the user wants issue/bug data as an input source + +Issue-tracker intent triggers when the argument's primary intent is about analyzing issue patterns: `bugs`, `github issues`, `open issues`, `issue patterns`, `what users are reporting`, `bug reports`, `issue themes`. + +Do NOT trigger on arguments that merely mention bugs as a focus: `bug in auth`, `fix the login issue`, `the signup bug` — these are focus hints, not requests to analyze the issue tracker. + +When combined (e.g., `top 3 bugs in authentication`): detect issue-tracker intent first, volume override second, remainder is the focus hint. The focus narrows which issues matter; the volume override controls survivor count. + +Default volume: +- each ideation sub-agent generates about 7-8 ideas (yielding 30-40 raw ideas across agents, ~20-30 after dedupe) +- keep the top 5-7 survivors + +Honor clear overrides such as: +- `top 3` +- `100 ideas` +- `go deep` +- `raise the bar` + +Use reasonable interpretation rather than formal parsing. + +### Phase 1: Codebase Scan + +Before generating ideas, gather codebase context. + +Run agents in parallel in the **foreground** (do not use background dispatch — the results are needed before proceeding): + +1. 
**Quick context scan** — dispatch a general-purpose sub-agent with this prompt: + + > Read the project's CLAUDE.md (or AGENTS.md / README.md if CLAUDE.md is absent), then discover the top-level directory layout using the native file-search/glob tool (e.g., `Glob` with pattern `*` or `*/*` in Claude Code). Return a concise summary (under 30 lines) covering: + > - project shape (language, framework, top-level directory layout) + > - notable patterns or conventions + > - obvious pain points or gaps + > - likely leverage points for improvement + > + > Keep the scan shallow — read only top-level documentation and directory structure. Do not analyze GitHub issues, templates, or contribution guidelines. Do not do deep code search. + > + > Focus hint: {focus_hint} + +2. **Learnings search** — dispatch `compound-engineering:research:learnings-researcher` with a brief summary of the ideation focus. + +3. **Issue intelligence** (conditional) — if issue-tracker intent was detected in Phase 0.2, dispatch `compound-engineering:research:issue-intelligence-analyst` with the focus hint. If a focus hint is present, pass it so the agent can weight its clustering toward that area. Run this in parallel with agents 1 and 2. + + If the agent returns an error (gh not installed, no remote, auth failure), log a warning to the user ("Issue analysis unavailable: {reason}. Proceeding with standard ideation.") and continue with the existing two-agent grounding. + + If the agent reports fewer than 5 total issues, note "Insufficient issue signal for theme analysis" and proceed with default ideation frames in Phase 2. + +Consolidate all results into a short grounding summary. 
When issue intelligence is present, keep it as a distinct section so ideation sub-agents can distinguish between code-observed and user-reported signals: + +- **Codebase context** — project shape, notable patterns, obvious pain points, likely leverage points +- **Past learnings** — relevant institutional knowledge from docs/solutions/ +- **Issue intelligence** (when present) — theme summaries from the issue intelligence agent, preserving theme titles, descriptions, issue counts, and trend directions + +Do **not** do external research in v1. + +### Phase 2: Divergent Ideation + +Follow this mechanism exactly: + +1. Generate the full candidate list before critiquing any idea. +2. Each sub-agent targets about 7-8 ideas by default. With 4-6 agents this yields 30-40 raw ideas, which merge and dedupe to roughly 20-30 unique candidates. Adjust the per-agent target when volume overrides apply (e.g., "100 ideas" raises it, "top 3" may lower the survivor count instead). +3. Push past the safe obvious layer. Each agent's first few ideas tend to be obvious — push past them. +4. Ground every idea in the Phase 1 scan. +5. Use this prompting pattern as the backbone: + - first generate many ideas + - then challenge them systematically + - then explain only the survivors in detail +6. If the platform supports sub-agents, use them to improve diversity in the candidate pool rather than to replace the core mechanism. +7. Give each ideation sub-agent the same: + - grounding summary + - focus hint + - per-agent volume target (~7-8 ideas by default) + - instruction to generate raw candidates only, not critique +8. When using sub-agents, assign each one a different ideation frame as a **starting bias, not a constraint**. Prompt each agent to begin from its assigned perspective but follow any promising thread wherever it leads — cross-cutting ideas that span multiple frames are valuable, not out of scope. 
+ + **Frame selection depends on whether issue intelligence is active:** + + **When issue-tracker intent is active and themes were returned:** + - Each theme with `confidence: high` or `confidence: medium` becomes an ideation frame. The frame prompt uses the theme title and description as the starting bias. + - If fewer than 4 cluster-derived frames, pad with default frames in this order: "leverage and compounding effects", "assumption-breaking or reframing", "inversion, removal, or automation of a painful step". These complement issue-grounded themes by pushing beyond the reported problems. + - Cap at 6 total frames. If more than 6 themes qualify, use the top 6 by issue count; note remaining themes in the grounding summary as "minor themes" so sub-agents are still aware of them. + + **When issue-tracker intent is NOT active (default):** + - user or operator pain and friction + - unmet need or missing capability + - inversion, removal, or automation of a painful step + - assumption-breaking or reframing + - leverage and compounding effects + - extreme cases, edge cases, or power-user pressure +9. Ask each ideation sub-agent to return a standardized structure for each idea so the orchestrator can merge and reason over the outputs consistently. Prefer a compact JSON-like structure with: + - title + - summary + - why_it_matters + - evidence or grounding hooks + - optional local signals such as boldness or focus_fit +10. Merge and dedupe the sub-agent outputs into one master candidate list. +11. **Synthesize cross-cutting combinations.** After deduping, scan the merged list for ideas from different frames that together suggest something stronger than either alone. If two or more ideas naturally combine into a higher-leverage proposal, add the combined idea to the list (expect 3-5 additions at most). This synthesis step belongs to the orchestrator because it requires seeing all ideas simultaneously. +12. 
Spread ideas across multiple dimensions when justified: + - workflow/DX + - reliability + - extensibility + - missing capabilities + - docs/knowledge compounding + - quality and maintenance + - leverage on future work +13. If a focus was provided, pass it to every ideation sub-agent and weight the merged list toward it without excluding stronger adjacent ideas. + +The mechanism to preserve is: +- generate many ideas first +- critique the full combined list second +- explain only the survivors in detail + +The sub-agent pattern to preserve is: +- independent ideation with frames as starting biases first +- orchestrator merge, dedupe, and cross-cutting synthesis second +- critique only after the combined and synthesized list exists + +### Phase 3: Adversarial Filtering + +Review every generated idea critically. + +Prefer a two-layer critique: +1. Have one or more skeptical sub-agents attack the merged list from distinct angles. +2. Have the orchestrator synthesize those critiques, apply the rubric consistently, score the survivors, and decide the final ranking. + +Do not let critique agents generate replacement ideas in this phase unless explicitly refining. + +Critique agents may provide local judgments, but final scoring authority belongs to the orchestrator so the ranking stays consistent across different frames and perspectives. + +For each rejected idea, write a one-line reason. 
+ +Use rejection criteria such as: +- too vague +- not actionable +- duplicates a stronger idea +- not grounded in the current codebase +- too expensive relative to likely value +- already covered by existing workflows or docs +- interesting but better handled as a brainstorm variant, not a product improvement + +Use a consistent survivor rubric that weighs: +- groundedness in the current repo +- expected value +- novelty +- pragmatism +- leverage on future work +- implementation burden +- overlap with stronger ideas + +Target output: +- keep 5-7 survivors by default +- if too many survive, run a second stricter pass +- if fewer than 5 survive, report that honestly rather than lowering the bar + +### Phase 4: Present the Survivors + +Present the surviving ideas to the user before writing the durable artifact. + +This first presentation is a review checkpoint, not the final archived result. + +Present only the surviving ideas in structured form: + +- title +- description +- rationale +- downsides +- confidence score +- estimated complexity + +Then include a brief rejection summary so the user can see what was considered and cut. + +Keep the presentation concise. The durable artifact holds the full record. + +Allow brief follow-up questions and lightweight clarification before writing the artifact. + +Do not write the ideation doc yet unless: +- the user indicates the candidate set is good enough to preserve +- the user asks to refine and continue in a way that should be recorded +- the workflow is about to hand off to `ce:brainstorm`, Proof sharing, or session end + +### Phase 5: Write the Ideation Artifact + +Write the ideation artifact after the candidate set has been reviewed enough to preserve. + +Always write or update the artifact before: +- handing off to `ce:brainstorm` +- sharing to Proof +- ending the session + +To write the artifact: + +1. Ensure `docs/ideation/` exists +2. 
Choose the file path: + - `docs/ideation/YYYY-MM-DD--ideation.md` + - `docs/ideation/YYYY-MM-DD-open-ideation.md` when no focus exists +3. Write or update the ideation document + +Use this structure and omit clearly irrelevant fields only when necessary: + +```markdown +--- +date: YYYY-MM-DD +topic: +focus: +--- + +# Ideation: + +## Codebase Context +[Grounding summary from Phase 1] + +## Ranked Ideas + +### 1. <Idea Title> +**Description:** [Concrete explanation] +**Rationale:** [Why this improves the project] +**Downsides:** [Tradeoffs or costs] +**Confidence:** [0-100%] +**Complexity:** [Low / Medium / High] +**Status:** [Unexplored / Explored] + +## Rejection Summary + +| # | Idea | Reason Rejected | +|---|------|-----------------| +| 1 | <Idea> | <Reason rejected> | + +## Session Log +- YYYY-MM-DD: Initial ideation — <candidate count> generated, <survivor count> survived +``` + +If resuming: +- update the existing file in place +- append to the session log +- preserve explored markers + +### Phase 6: Refine or Hand Off + +After presenting the results, ask what should happen next. + +Offer these options: +1. brainstorm a selected idea +2. refine the ideation +3. share to Proof +4. end the session + +#### 6.1 Brainstorm a Selected Idea + +If the user selects an idea: +- write or update the ideation doc first +- mark that idea as `Explored` +- note the brainstorm date in the session log +- invoke `ce:brainstorm` with the selected idea as the seed + +Do **not** skip brainstorming and go straight to planning from ideation output. 
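The bookkeeping in 6.1 can be sketched as a small text transform over the ideation doc. This is an illustration only: `mark_explored` is a hypothetical helper, the log entry wording is an assumption (the source only requires noting the brainstorm date), and the sketch assumes `## Session Log` is the last section, as in the template above.

```python
def mark_explored(doc: str, idea_number: int, today: str) -> str:
    """Mark one ranked idea as Explored and log the brainstorm date."""
    lines = doc.rstrip("\n").splitlines()
    in_idea = False
    for i, line in enumerate(lines):
        if line.startswith("### "):
            # Ranked idea headings look like "### 1. <Idea Title>".
            in_idea = line.startswith(f"### {idea_number}. ")
        elif line.startswith("## "):
            # A new top-level section ends the current idea.
            in_idea = False
        elif in_idea and line.startswith("**Status:**"):
            lines[i] = "**Status:** Explored"
    # Assumed entry wording; appended at the end because the session log
    # is assumed to be the final section of the document.
    lines.append(f"- {today}: Brainstormed idea #{idea_number}")
    return "\n".join(lines) + "\n"
```

Other ideas keep their existing status, so explored markers from earlier sessions are preserved.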
+ +#### 6.2 Refine the Ideation + +Route refinement by intent: + +- `add more ideas` or `explore new angles` -> return to Phase 2 +- `re-evaluate` or `raise the bar` -> return to Phase 3 +- `dig deeper on idea #N` -> expand only that idea's analysis + +After each refinement: +- update the ideation document before any handoff, sharing, or session end +- append a session log entry + +#### 6.3 Share to Proof + +If requested, share the ideation document using the standard Proof markdown upload pattern already used elsewhere in the plugin. + +Return to the next-step options after sharing. + +#### 6.4 End the Session + +When ending: +- offer to commit only the ideation doc +- do not create a branch +- do not push +- if the user declines, leave the file uncommitted + +## Quality Bar + +Before finishing, check: + +- the idea set is grounded in the actual repo +- the candidate list was generated before filtering +- the original many-ideas -> critique -> survivors mechanism was preserved +- if sub-agents were used, they improved diversity without replacing the core workflow +- every rejected idea has a reason +- survivors are materially better than a naive "give me ideas" list +- the artifact was written before any handoff, sharing, or session end +- acting on an idea routes to `ce:brainstorm`, not directly to implementation
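The refinement routing in 6.2 can be sketched as a lookup from user intent to the phase that handles it. This is a rough illustration under stated assumptions: `route_refinement`, the returned labels, and the `ask-user` fallback are hypothetical, and plain keyword matching stands in for the intent interpretation the skill would actually perform.

```python
import re


def route_refinement(request: str) -> str:
    """Map a refinement request to the step that should handle it."""
    text = request.lower()
    if re.search(r"dig deeper on idea #\d+", text):
        return "expand-idea"  # expand only that idea's analysis
    if "add more ideas" in text or "explore new angles" in text:
        return "phase-2"      # generate more candidates
    if "re-evaluate" in text or "raise the bar" in text:
        return "phase-3"      # rerun adversarial filtering
    return "ask-user"         # assumed fallback when intent is unclear
```

Whichever route is taken, the ideation document is updated and a session log entry appended before any handoff, sharing, or session end.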