Files

Kieran Klaassen f744b797ef Reduce context token usage by 79% — fix silent component exclusion (#161 )

* Update create-agent-skills to match 2026 official docs, add /triage-prs command

- Rewrite SKILL.md to document that commands and skills are now merged
- Add new frontmatter fields: disable-model-invocation, user-invocable, context, agent
- Add invocation control table and dynamic context injection docs
- Fix skill-structure.md: was incorrectly recommending XML tags over markdown headings
- Update official-spec.md with complete 2026 specification
- Add local /triage-prs command for PR triage workflow
- Add PR triage plan document

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [2.31.0] Reduce context token usage by 79%, include recent community contributions

The plugin was consuming 316% of Claude Code's description character budget
(~50,500 chars vs 16,000 limit), causing components to be silently excluded.
Now at 65% (~10,400 chars) with all components visible.

Changes:
- Trim all 29 agent descriptions (move examples to body)
- Add disable-model-invocation to 18 manual commands
- Add disable-model-invocation to 6 manual skills
- Include recent community contributions in changelog
- Fix component counts (29 agents, 24 commands, 18 skills)

Contributors: @trevin, @terryli, @robertomello, @zacwilliams,
@aarnikoskela, @samxie, @davidalley

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: keep disable-model-invocation off commands called by /lfg, rename xcode-test

- Remove disable-model-invocation from test-browser, feature-video,
  resolve_todo_parallel — these are called programmatically by /lfg and /slfg
- Rename xcode-test to test-xcode to match test-browser naming convention

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: keep git-worktree skill auto-invocable (used by /workflows:work)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(converter): support disable-model-invocation frontmatter

Parse disable-model-invocation from command and skill frontmatter.
Commands/skills with this flag are excluded from OpenCode command maps
and Codex prompt/skill generation, matching Claude Code behavior where
these components are user-only invocable.

Bump converter version to 0.3.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-08 22:28:51 -06:00

8.0 KiB

Raw Blame History

name, description, argument-hint, disable-model-invocation

name	description	argument-hint	disable-model-invocation
agent-native-audit	Run comprehensive agent-native architecture review with scored principles	[optional: specific principle to audit]	true

Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

Core Principles to Audit

Action Parity - "Whatever the user can do, the agent can do"
Tools as Primitives - "Tools provide capability, not behavior"
Context Injection - "System prompt includes dynamic context about app state"
Shared Workspace - "Agent and user work in the same data space"
CRUD Completeness - "Every entity has full CRUD (Create, Read, Update, Delete)"
UI Integration - "Agent actions immediately reflected in UI"
Capability Discovery - "Users can discover what the agent can do"
Prompt-Native Features - "Features are prompts defining outcomes, not code"

Workflow

Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

/compound-engineering:agent-native-architecture

Select option 7 (action parity) to load the full reference material.

Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents using the Task tool with subagent_type: Explore, one for each principle. Each agent should:

Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
Check compliance against the principle
Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
List specific gaps and recommendations

Agent 1: Action Parity

Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations

Agent 2: Tools as Primitives

Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations

Agent 3: Context Injection

Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations

Agent 4: Shared Workspace

Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations

Agent 5: CRUD Completeness

Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations

Agent 6: UI Integration

Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations

Agent 7: Capability Discovery

Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations

Agent 8: Prompt-Native Features

Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations

Step 3: Compile Summary Report

After all agents complete, compile a summary with:

## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]

Success Criteria

All 8 sub-agents complete their audits
Each principle has a specific numeric score (X/Y format)
Summary table shows all scores and status indicators
Top 10 recommendations are prioritized by impact
Report identifies both strengths and gaps

Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.

Valid arguments:

action parity or 1
tools or primitives or 2
context or injection or 3
shared or workspace or 4
crud or 5
ui or integration or 6
discovery or 7
prompt or features or 8

8.0 KiB Raw Blame History