Reduce context token usage by 79% — fix silent component exclusion (#161)

* Update create-agent-skills to match 2026 official docs, add /triage-prs command

  - Rewrite SKILL.md to document that commands and skills are now merged
  - Add new frontmatter fields: disable-model-invocation, user-invocable, context, agent
  - Add invocation control table and dynamic context injection docs
  - Fix skill-structure.md: was incorrectly recommending XML tags over markdown headings
  - Update official-spec.md with complete 2026 specification
  - Add local /triage-prs command for PR triage workflow
  - Add PR triage plan document

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [2.31.0] Reduce context token usage by 79%, include recent community contributions

  The plugin was consuming 316% of Claude Code's description character budget (~50,500 chars vs 16,000 limit), causing components to be silently excluded. Now at 65% (~10,400 chars) with all components visible.

  Changes:
  - Trim all 29 agent descriptions (move examples to body)
  - Add disable-model-invocation to 18 manual commands
  - Add disable-model-invocation to 6 manual skills
  - Include recent community contributions in changelog
  - Fix component counts (29 agents, 24 commands, 18 skills)

  Contributors: @trevin, @terryli, @robertomello, @zacwilliams, @aarnikoskela, @samxie, @davidalley

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: keep disable-model-invocation off commands called by /lfg, rename xcode-test

  - Remove disable-model-invocation from test-browser, feature-video, resolve_todo_parallel; these are called programmatically by /lfg and /slfg
  - Rename xcode-test to test-xcode to match test-browser naming convention

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: keep git-worktree skill auto-invocable (used by /workflows:work)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(converter): support disable-model-invocation frontmatter

  Parse disable-model-invocation from command and skill frontmatter. Commands/skills with this flag are excluded from OpenCode command maps and Codex prompt/skill generation, matching Claude Code behavior where these components are user-only invocable. Bump converter version to 0.3.0.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 22:28:51 -06:00
parent 04ee7e4506
commit f744b797ef
71 changed files with 1765 additions and 767 deletions
--- a/docs/plans/2026-02-08-feat-pr-triage-and-merge-plan.md
+++ b/docs/plans/2026-02-08-feat-pr-triage-and-merge-plan.md
@@ -0,0 +1,128 @@
+---
+title: PR Triage, Review & Merge
+type: feat
+date: 2026-02-08
+---
+
+# PR Triage, Review & Merge
+
+## Overview
+
+Review all 17 open PRs one-by-one. Merge the ones that look good, leave constructive comments on the ones we won't take (keeping them open for contributors to address). Close duplicates/spam.
+
+## Approach
+
+Show the diff for each PR, get a go/no-go, then either merge or comment. PRs are ordered by priority group.
+
+## Group 1: Bug Fixes (high confidence merges)
+
+### PR #159 - fix(git-worktree): detect worktrees where .git is a file
+- **Author:** dalley | **Files:** 1 | **+2/-2**
+- **What:** Changes `-d` to `-e` check in `worktree-manager.sh` so `list` and `cleanup` detect worktrees (`.git` is a file in worktrees, not a dir)
+- **Fixes:** Issue #158
+- **Action:** Review diff → merge
+
+### PR #144 - Remove confirmation prompt when creating git worktrees
+- **Author:** XSAM | **Files:** 1 | **+0/-8**
+- **What:** Removes interactive `read -r` confirmation that breaks Claude's ability to create worktrees
+- **Related:** Same file as #159 (merge #159 first)
+- **Action:** Review diff → merge
+
+### PR #150 - fix(compound): prevent subagents from writing intermediary files
+- **Author:** tmchow | **Files:** 1 | **+64/-27**
+- **What:** Restructures `/workflows:compound` into 2-phase orchestration to prevent subagents from writing temp files
+- **Action:** Review diff → merge
+
+### PR #148 - Fix: resolve_pr_parallel uses non-existent scripts
+- **Author:** ajrobertsonio | **Files:** 1 | **+20/-7**
+- **What:** Replaces references to non-existent `bin/get-pr-comments` with standard `gh` CLI commands
+- **Fixes:** Issues #147, #54
+- **Action:** Review diff → merge
+
+## Group 2: Documentation (clean, low-risk)
+
+### PR #133 - Fix terminology: third person → passive voice
+- **Author:** FauxReal9999 | **Files:** 13 | docs-only
+- **What:** Corrects "third person" to "passive voice" across docs (accurate fix)
+- **Action:** Review diff → merge
+
+### PR #108 - Note new repository URL
+- **Author:** akx | **Files:** 5 | docs-only
+- **What:** Updates URLs from `kieranklaassen/compound-engineering-plugin` to `EveryInc/compound-engineering-plugin`
+- **Action:** Review diff → merge
+
+### PR #113 - docs: add brainstorm command to workflow documentation
+- **Author:** tmchow | docs-only
+- **What:** Adds brainstorming skill and learnings-researcher agent to README, fixes component counts
+- **Action:** Review diff → merge
+
+### PR #80 - docs: Add LSP prioritization guidance
+- **Author:** kevinold | **Files:** 1 | docs-only
+- **What:** Adds docs showing users how to customize agent behavior via project CLAUDE.md to prioritize LSP
+- **Action:** Review diff → merge
+
+## Group 3: Enhancements (likely merge)
+
+### PR #119 - fix: backup existing config files before overwriting
+- **Author:** jzw | **Files:** 5 | **+90/-3** | has tests
+- **What:** Adds `backupFile()` utility to create timestamped backups before overwriting Codex/OpenCode configs
+- **Fixes:** Issue #125
+- **Action:** Review diff → merge
+
+### PR #112 - feat(skills): add document-review skill
+- **Author:** tmchow | enhancement
+- **What:** Adds document-review skill for brainstorm/plan refinement, renames `/plan_review` → `/technical_review`
+- **Note:** Breaking rename - needs review
+- **Action:** Review diff → decide
+
+## Group 4: Needs Discussion (comment and leave open)
+
+### PR #157 - Rewrite workflows:review with context-managed map-reduce
+- **Author:** Drewx-Design | large rewrite
+- **What:** Complete rewrite of review command with file-based map-reduce architecture
+- **Comment:** Acknowledge quality, note it's a big change that needs dedicated review session
+
+### PR #131 - feat: add vmark-mcp plugin
+- **Author:** xiaolai | new plugin
+- **What:** Adds entirely new VMark markdown editor plugin to marketplace
+- **Comment:** Ask for more context on fit with marketplace scope
+
+### PR #124 - feat(commands): add /compound-engineering-setup
+- **Author:** internal | config
+- **What:** Interactive setup command for configuring review agents per project
+- **Comment:** Note overlap with #103, needs unified config strategy
+
+### PR #123 - feat: Add sync command for Claude Code personal config
+- **Author:** terry-li-hm | config
+- **What:** Sync personal Claude config across machines/editors
+- **Comment:** Note overlap with #124 and #103, needs unified config strategy
+
+### PR #103 - Add /compound:configure with persistent user preferences
+- **Author:** aviflombaum | **+36,866** lines
+- **What:** Massive architectural change adding persistent config with build system
+- **Comment:** Too large, suggest breaking into smaller PRs
+
+## Group 5: Close
+
+### PR #122 - [EXPERIMENTAL] add /slfg and /swarm-status
+- **Label:** duplicate
+- **What:** Already merged in v2.30.0 (commit e4ff6a8)
+- **Action:** Comment explaining it's been superseded, close
+
+### PR #68 - Improve all 13 skills to 90%+ grades
+- **Label:** wontfix
+- **What:** Massive stale PR (Jan 6), based on 13 skills when we now have 16+
+- **Action:** Comment thanking contributor, suggest fresh PR against current main, close
+
+## Post-Merge Cleanup
+
+After merging:
+- [ ] Close issues fixed by merged PRs (#158, #147, #54, #125)
+- [ ] Close spam issues (#98, #56)
+- [ ] Run `/release-docs` to update documentation site with new component counts
+- [ ] Bump version in plugin.json if needed
+
+## References
+
+- PR list: https://github.com/EveryInc/compound-engineering-plugin/pulls
+- Issues: https://github.com/EveryInc/compound-engineering-plugin/issues
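
Note on PR #159 in the triage plan above: the plan says the fix swaps a `-d` test for `-e` because `.git` is a file, not a directory, inside a linked worktree. The sketch below illustrates that distinction in Python rather than shell. It is not the code from `worktree-manager.sh` (which is not shown in this diff); the helper names and the `gitdir:` path are hypothetical, and `os.path.isdir` / `os.path.exists` stand in for the shell's `-d` / `-e` tests.

```python
import os
import tempfile


def found_with_dir_test(path: str) -> bool:
    """Mimic a `[ -d "$path/.git" ]` check: true only if .git is a directory."""
    return os.path.isdir(os.path.join(path, ".git"))


def found_with_exists_test(path: str) -> bool:
    """Mimic a `[ -e "$path/.git" ]` check: true if .git exists at all."""
    return os.path.exists(os.path.join(path, ".git"))


results = {}
with tempfile.TemporaryDirectory() as root:
    # A main checkout keeps its repository data in a .git directory.
    main_checkout = os.path.join(root, "main")
    os.makedirs(os.path.join(main_checkout, ".git"))

    # A linked worktree has a .git *file* that points back to the main repo.
    linked = os.path.join(root, "linked")
    os.makedirs(linked)
    with open(os.path.join(linked, ".git"), "w") as fh:
        fh.write("gitdir: /hypothetical/main/.git/worktrees/linked\n")

    for name, path in (("main", main_checkout), ("linked", linked)):
        results[name] = (found_with_dir_test(path), found_with_exists_test(path))

for name, (dir_test, exists_test) in results.items():
    print(f"{name}: dir test={dir_test}, exists test={exists_test}")
```

Under these assumptions the directory test misses the linked worktree while the existence test finds both, which matches the behavior the PR description attributes to `list` and `cleanup`.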
--- a/docs/plans/2026-02-08-refactor-reduce-plugin-context-token-usage-plan.md
+++ b/docs/plans/2026-02-08-refactor-reduce-plugin-context-token-usage-plan.md
@@ -0,0 +1,212 @@
+---
+title: Reduce compound-engineering plugin context token usage
+type: refactor
+date: 2026-02-08
+---
+
+# Reduce compound-engineering Plugin Context Token Usage
+
+## Overview
+
+The compound-engineering plugin is **using ~3x the default description budget**, causing Claude Code to silently drop components. The plugin consumes ~50,500 characters in always-loaded descriptions against a default budget of 16,000 characters (the fallback for a budget that scales at 2% of the context window). As a result, Claude does not know that some agents and skills exist during a session.
+
+## Problem Statement
+
+### How Context Loading Works
+
+Claude Code uses progressive disclosure for plugin content:
+
+| Level | What Loads | When |
+|-------|-----------|------|
+| **Always in context** | `description` frontmatter from skills, commands, and agents | Session startup (unless `disable-model-invocation: true`) |
+| **On invocation** | Full SKILL.md / command body / agent body | When triggered |
+| **On demand** | Reference files in skill directories | When Claude reads them |
+
+The total budget for ALL descriptions combined is **2% of context window** (~16,000 chars fallback). When exceeded, components are **silently excluded**.
+
+### Current State: 316% of Budget
+
+| Component | Count | Always-Loaded Chars | % of 16K Budget |
+|-----------|------:|--------------------:|----------------:|
+| Agent descriptions | 29 | ~41,400 | 259% |
+| Skill descriptions | 16 | ~5,450 | 34% |
+| Command descriptions | 24 | ~3,700 | 23% |
+| **Total** | **69** | **~50,500** | **316%** |
+
+### Root Cause: Bloated Agent Descriptions
+
+Agent `description` fields contain full `<example>` blocks with user/assistant dialog. These examples belong in the agent body (system prompt), not the description. The description's only job is **discovery** — helping Claude decide whether to delegate.
+
+Examples of the problem:
+
+- `design-iterator.md`: 2,488 chars in description (should be ~200)
+- `spec-flow-analyzer.md`: 2,289 chars in description
+- `security-sentinel.md`: 1,986 chars in description
+- `kieran-rails-reviewer.md`: 1,822 chars in description
+- Average agent description: ~1,400 chars (should be 100-250)
+
+Compare to Anthropic's official examples at 100-200 chars:
+
+```yaml
+# Official (140 chars)
+description: Expert code review specialist. Proactively reviews code for quality, security, and maintainability. Use immediately after writing or modifying code.
+
+# Current plugin (1,822 chars)
+description: "Use this agent when you need to review Rails code changes with an extremely high quality bar...\n\nExamples:\n- <example>\n  Context: The user has just implemented..."
+```
+
+### Secondary Cause: No `disable-model-invocation` on Manual Commands
+
+Zero commands set `disable-model-invocation: true`. Commands like `/deploy-docs`, `/lfg`, `/slfg`, `/triage`, `/feature-video`, `/test-browser`, `/xcode-test` are manual workflows with side effects. Their descriptions consume budget unnecessarily.
+
+The official docs explicitly state:
+> Use `disable-model-invocation: true` for workflows with side effects: `/deploy`, `/commit`, `/triage-prs`. You don't want Claude deciding to deploy because your code looks ready.
+
+---
+
+## Proposed Solution
+
+Three changes, ordered by impact:
+
+### Phase 1: Trim Agent Descriptions (saves ~35,600 chars)
+
+For all 29 agents: move `<example>` blocks from the `description` field into the agent body markdown. Keep descriptions to 1-2 sentences (100-250 chars).
+
+**Before** (agent frontmatter):
+```yaml
+---
+name: kieran-rails-reviewer
+description: "Use this agent when you need to review Rails code changes with an extremely high quality bar. This agent should be invoked after implementing features, modifying existing code, or creating new Rails components. The agent applies Kieran's strict Rails conventions and taste preferences to ensure code meets exceptional standards.\n\nExamples:\n- <example>\n  Context: The user has just implemented a new controller action with turbo streams.\n  user: \"I've added a new update action to the posts controller\"\n  ..."
+---
+
+Detailed system prompt...
+```
+
+**After** (agent frontmatter):
+```yaml
+---
+name: kieran-rails-reviewer
+description: Review Rails code with Kieran's strict conventions. Use after implementing features, modifying code, or creating new Rails components.
+---
+
+<examples>
+<example>
+Context: The user has just implemented a new controller action with turbo streams.
+user: "I've added a new update action to the posts controller"
+...
+</example>
+</examples>
+
+Detailed system prompt...
+```
+
+The examples move into the body (which only loads when the agent is actually invoked).
+
+**Impact:** ~41,400 chars → ~5,800 chars (86% reduction)
+
+### Phase 2: Add `disable-model-invocation: true` to Manual Commands (saves ~3,100 chars)
+
+Commands that should only run when explicitly invoked by the user:
+
+| Command | Reason |
+|---------|--------|
+| `/deploy-docs` | Side effect: deploys |
+| `/release-docs` | Side effect: regenerates docs |
+| `/changelog` | Side effect: generates changelog |
+| `/lfg` | Side effect: autonomous workflow |
+| `/slfg` | Side effect: swarm workflow |
+| `/triage` | Side effect: categorizes findings |
+| `/resolve_parallel` | Side effect: resolves TODOs |
+| `/resolve_todo_parallel` | Side effect: resolves todos |
+| `/resolve_pr_parallel` | Side effect: resolves PR comments |
+| `/feature-video` | Side effect: records video |
+| `/test-browser` | Side effect: runs browser tests |
+| `/xcode-test` | Side effect: builds/tests iOS |
+| `/reproduce-bug` | Side effect: runs reproduction |
+| `/report-bug` | Side effect: creates bug report |
+| `/agent-native-audit` | Side effect: runs audit |
+| `/heal-skill` | Side effect: modifies skill files |
+| `/generate_command` | Side effect: creates files |
+| `/create-agent-skill` | Side effect: creates files |
+
+Keep these **without** the flag (Claude should know about them):
+- `/workflows:plan` — Claude might suggest planning
+- `/workflows:work` — Claude might suggest starting work
+- `/workflows:review` — Claude might suggest review
+- `/workflows:brainstorm` — Claude might suggest brainstorming
+- `/workflows:compound` — Claude might suggest documenting
+- `/deepen-plan` — Claude might suggest deepening a plan
+
+**Impact:** ~3,700 chars → ~600 chars for commands in context
+
+### Phase 3: Add `disable-model-invocation: true` to Manual Skills (saves ~1,450 chars)
+
+Skills that are manual workflows:
+
+| Skill | Reason |
+|-------|--------|
+| `skill-creator` | Only invoked manually |
+| `orchestrating-swarms` | Only invoked manually |
+| `git-worktree` | Only invoked manually |
+| `resolve-pr-parallel` | Side effect |
+| `compound-docs` | Only invoked manually |
+| `file-todos` | Only invoked manually |
+
+Keep without the flag (Claude should auto-invoke):
+- `dhh-rails-style` — Claude should use when writing Rails code
+- `frontend-design` — Claude should use when building UI
+- `brainstorming` — Claude should suggest before implementation
+- `agent-browser` — Claude should use for browser tasks
+- `gemini-imagegen` — Claude should use for image generation
+- `create-agent-skills` — Claude should use when creating skills
+- `every-style-editor` — Claude should use for editing
+- `dspy-ruby` — Claude should use for DSPy.rb
+- `agent-native-architecture` — Claude should use for agent-native design
+- `andrew-kane-gem-writer` — Claude should use for gem writing
+- `rclone` — Claude should use for cloud uploads
+- `document-review` — Claude should use for doc review
+
+**Impact:** ~5,450 chars → ~4,000 chars for skills in context
+
+---
+
+## Projected Result
+
+| Component | Before (chars) | After (chars) | Reduction |
+|-----------|---------------:|-------------:|-----------:|
+| Agent descriptions | ~41,400 | ~5,800 | -86% |
+| Command descriptions | ~3,700 | ~600 | -84% |
+| Skill descriptions | ~5,450 | ~4,000 | -27% |
+| **Total** | **~50,500** | **~10,400** | **-79%** |
+| **% of 16K budget** | **316%** | **65%** | -- |
+
+From 316% of budget (components silently dropped) to 65% of budget (room for growth).
+
+---
+
+## Acceptance Criteria
+
+- [x] All 29 agent description fields are under 250 characters
+- [x] All `<example>` blocks moved from description to agent body
+- [x] 18 manual commands have `disable-model-invocation: true`
+- [x] 6 manual skills have `disable-model-invocation: true`
+- [x] Total always-loaded description content is under 16,000 characters
+- [ ] Run `/context` to verify no "excluded skills" warnings
+- [x] All agents still function correctly (examples are in body, not lost)
+- [x] All commands still invocable via `/command-name`
+- [x] Update plugin version in plugin.json and marketplace.json
+- [x] Update CHANGELOG.md
+
+## Implementation Notes
+
+- Agent examples should use `<examples><example>...</example></examples>` tags in the body — Claude understands these natively
+- Description format: "[What it does]. Use [when/trigger condition]." — two sentences max
+- The `lint` agent, at 115 words, shows that a compact agent definition works well
+- Test with `claude --plugin-dir ./plugins/compound-engineering` after changes
+- The `SLASH_COMMAND_TOOL_CHAR_BUDGET` env var can override the default budget for testing
+
+## References
+
+- [Skills docs](https://code.claude.com/docs/en/skills) — "Skill descriptions are loaded into context... If you have many skills, they may exceed the character budget"
+- [Subagents docs](https://code.claude.com/docs/en/sub-agents) — description field used for automatic delegation
+- [Skills troubleshooting](https://code.claude.com/docs/en/skills#claude-doesnt-see-all-my-skills) — "The budget scales dynamically at 2% of the context window, with a fallback of 16,000 characters"
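
Note on the budget arithmetic in the plan above: the percentages in the "Current State" and "Projected Result" tables can be re-derived from the character counts with a few lines of Python. This is only a sanity-check sketch. The counts are the plan's own approximate figures, the 16,000 value is the fallback budget the plan quotes, and the function and variable names are illustrative rather than anything from the plugin.

```python
# Approximate character counts copied from the plan's tables.
BUDGET_CHARS = 16_000  # fallback description budget quoted in the plan

before = {"agents": 41_400, "skills": 5_450, "commands": 3_700}
after = {"agents": 5_800, "skills": 4_000, "commands": 600}


def percent_of_budget(chars: int, budget: int = BUDGET_CHARS) -> int:
    """Return chars as a whole-number percentage of the budget."""
    return round(chars / budget * 100)


def reduction_percent(old: int, new: int) -> int:
    """Return the whole-number percentage saved going from old to new."""
    return round((old - new) / old * 100)


total_before = sum(before.values())
total_after = sum(after.values())

for name in before:
    saved = reduction_percent(before[name], after[name])
    print(f"{name}: {before[name]} -> {after[name]} chars (-{saved}%)")

print(
    f"total: {percent_of_budget(total_before)}% -> "
    f"{percent_of_budget(total_after)}% of budget, "
    f"-{reduction_percent(total_before, total_after)}% overall"
)
```

The component counts sum to 50,550 before and 10,400 after, which the plan rounds to ~50,500 and ~10,400. The skills row works out to a saving of 1,450 characters (5,450 to 4,000).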