[2.23.0] Add /agent-native-audit command
- New command for comprehensive agent-native architecture review
- Launches 8 parallel sub-agents, one per core principle
- Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
- Each agent produces a specific score (X/Y format with percentage)
- Generates a summary report with overall score and top 10 recommendations
- Supports single-principle audit via argument

Co-Authored-By: Claude <noreply@anthropic.com>
```diff
@@ -1,7 +1,7 @@
 {
   "name": "compound-engineering",
-  "version": "2.22.0",
-  "description": "AI-powered development tools. 27 agents, 20 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
+  "version": "2.23.0",
+  "description": "AI-powered development tools. 27 agents, 21 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
   "author": {
     "name": "Kieran Klaassen",
     "email": "kieran@every.to",
```
```diff
@@ -5,6 +5,23 @@ All notable changes to the compound-engineering plugin will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [2.23.0] - 2026-01-08
+
+### Added
+
+- **`/agent-native-audit` command** - Comprehensive agent-native architecture review
+  - Launches 8 parallel sub-agents, one per core principle
+  - Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
+  - Each agent produces a specific score (X/Y format with percentage)
+  - Generates a summary report with overall score and top 10 recommendations
+  - Supports single-principle audit via argument
+
+### Summary
+
+- 27 agents, 21 commands, 13 skills, 2 MCP servers
+
+---
+
 ## [2.22.0] - 2026-01-05

 ### Added
```
plugins/compound-engineering/commands/agent-native-audit.md (new file, 277 lines):
---
name: agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
---

# Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

## Core Principles to Audit

1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"

## Workflow

### Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

```
/compound-engineering:agent-native-architecture
```

Select option 7 (action parity) to load the full reference material.

### Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:

1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations
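To make the fan-out above concrete, here is a minimal TypeScript sketch of the launch pattern. `launchTask` is a hypothetical stand-in for the Task tool (real invocations are tool calls, not application code); only the parallel, one-agent-per-principle shape is the point.

```typescript
// Hypothetical fan-out: one sub-agent per principle, all launched in parallel.
// `launchTask` stands in for the Task tool; it is not a real API.
declare function launchTask(t: {
  subagentType: string;
  description: string;
  prompt: string;
}): Promise<string>;

const PRINCIPLES = [
  "Action Parity", "Tools as Primitives", "Context Injection",
  "Shared Workspace", "CRUD Completeness", "UI Integration",
  "Capability Discovery", "Prompt-Native Features",
] as const;

async function runAudit(promptFor: (principle: string) => string): Promise<string[]> {
  // Launch all eight audits concurrently and collect every report.
  return Promise.all(
    PRINCIPLES.map((p) =>
      launchTask({ subagentType: "Explore", description: `Audit: ${p}`, prompt: promptFor(p) }),
    ),
  );
}
```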
<sub-agents>

**Agent 1: Action Parity**

```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```
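As an illustration of the parity check itself (all names invented, not part of the command): every user-facing action should have a matching agent tool, and the score is the covered fraction.

```typescript
// Sketch of the parity computation; `UserAction` and the tool set are hypothetical.
interface UserAction {
  name: string;      // e.g. "archive_draft"
  location: string;  // e.g. "DraftList.tsx"
}

function actionParityScore(actions: UserAction[], agentTools: Set<string>) {
  const missing = actions.filter((a) => !agentTools.has(a.name));
  const covered = actions.length - missing.length;
  const pct = actions.length ? Math.round((covered / actions.length) * 100) : 100;
  return { score: `${covered}/${actions.length} (${pct}%)`, missing };
}

// actionParityScore([{ name: "archive_draft", location: "DraftList.tsx" }], new Set())
// -> { score: "0/1 (0%)", missing: [{ name: "archive_draft", ... }] }
```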
**Agent 2: Tools as Primitives**

```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```
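To clarify the classification, a TypeScript contrast (all names invented): a primitive exposes one capability and leaves decisions to the agent; a workflow hardcodes the decisions.

```typescript
// Stubs so the sketch type-checks; real storage and helpers will differ.
declare const db: {
  put(table: string, row: object): Promise<void>;
  get(table: string, id: string): Promise<{ body: string }>;
};
declare function applyHouseStyle(body: string): string;
declare function isMonday(): boolean;
declare function sendToSubscribers(body: string): Promise<void>;

// PRIMITIVE (good): one capability, no policy; the agent decides when and why.
async function saveDocument(id: string, body: string): Promise<void> {
  await db.put("documents", { id, body });
}

// WORKFLOW (bad): formatting, scheduling, and notification decisions live in code,
// so the agent cannot recompose the behavior from a prompt.
async function publishWeeklyNewsletter(): Promise<void> {
  const draft = await db.get("documents", "newsletter-draft");
  const formatted = applyHouseStyle(draft.body);
  if (isMonday()) await sendToSubscribers(formatted);
}
```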
**Agent 3: Context Injection**

```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```
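For reference, one common shape of the pattern this agent looks for, sketched with invented names: app state serialized into the system prompt at session start.

```typescript
// Hypothetical context payload; the real fields depend on the app.
interface AppContext {
  drafts: string[];
  preferences: Record<string, string>;
  recentActivity: string[];
}

// Build the system prompt fresh each session so the agent sees current state.
function buildSystemPrompt(base: string, ctx: AppContext): string {
  return [
    base,
    "## Current workspace",
    `Drafts: ${ctx.drafts.join(", ") || "none"}`,
    `Preferences: ${JSON.stringify(ctx.preferences)}`,
    `Recent activity: ${ctx.recentActivity.slice(0, 5).join("; ") || "none"}`,
  ].join("\n");
}
```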
**Agent 4: Shared Workspace**

```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```
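A minimal sketch of the distinction at the data layer (table and function names invented):

```typescript
// Storage stub; real persistence will differ.
declare const db: { put(table: string, row: object): Promise<void> };

// GOOD: the agent writes to the same table the UI reads from.
async function agentCreateDoc(body: string): Promise<void> {
  await db.put("documents", { body, author: "agent" });
}

// ANTI-PATTERN: a parallel agent-only store the user never sees.
async function agentCreateDocIsolated(body: string): Promise<void> {
  await db.put("agent_documents", { body }); // invisible to the UI
}
```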
**Agent 5: CRUD Completeness**

```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```
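For illustration, a complete agent-side tool set for one hypothetical entity ("notes"); the audit flags any entity missing one of the four operations.

```typescript
// Storage stub; real persistence will differ.
declare const db: {
  insert(table: string, row: object): Promise<string>;
  get(table: string, id: string): Promise<object>;
  update(table: string, id: string, patch: object): Promise<void>;
  remove(table: string, id: string): Promise<void>;
};

// Full CRUD coverage for "notes": each operation is exposed as an agent tool.
const noteTools = {
  create: (body: string) => db.insert("notes", { body }),
  read: (id: string) => db.get("notes", id),
  update: (id: string, body: string) => db.update("notes", id, { body }),
  delete: (id: string) => db.remove("notes", id),
};
```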
**Agent 6: UI Integration**

```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```
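One possible propagation mechanism, sketched with invented names (an in-process event bus; SSE or WebSockets serve the same role across a network): each agent mutation emits an event the UI subscribes to, which is the opposite of a silent action.

```typescript
// In-process event bus; a real app might use SSE or WebSockets instead.
const bus = new EventTarget();

function agentWrite(entity: string, record: object): void {
  // ...persist the record, then notify the UI layer immediately.
  bus.dispatchEvent(new CustomEvent("entity-changed", { detail: { entity, record } }));
}

// UI side: re-render whichever view shows the changed entity.
bus.addEventListener("entity-changed", (e) => {
  const { entity } = (e as CustomEvent).detail;
  console.log(`refresh view for ${entity}`);
});
```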
**Agent 7: Capability Discovery**

```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```
**Agent 8: Prompt-Native Features**

```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```
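An illustrative contrast for this classification (the triage feature is invented): prompt-native means a behavior change is a prompt edit, not a deploy.

```typescript
// PROMPT-NATIVE (good): the outcome is stated in natural language;
// editing this string changes the feature.
const triagePrompt = `
When a new support ticket arrives, label it bug/feature/question,
draft a first reply, and flag anything urgent for a human.
`;

// CODE-DEFINED (bad): the same feature hardcoded; every rule change is a release.
function triage(ticket: { text: string }): "bug" | "feature" | "question" {
  if (ticket.text.includes("crash")) return "bug";
  if (ticket.text.includes("please add")) return "feature";
  return "question";
}
```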
</sub-agents>

### Step 3: Compile Summary Report

After all agents complete, compile a summary with:

```markdown
## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]
```
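The status column follows the legend's thresholds; as a small sketch:

```typescript
// Map a percentage to the legend's status marker (thresholds from the legend above).
function statusFor(pct: number): string {
  if (pct >= 80) return "✅"; // Excellent
  if (pct >= 50) return "⚠️"; // Partial
  return "❌"; // Needs Work
}
```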
## Success Criteria

- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps

## Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), run only that sub-agent and provide detailed findings for that principle alone.

Valid arguments:

- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`
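Mirroring the list above, a hypothetical alias table for resolving $ARGUMENTS to a principle number:

```typescript
// Alias -> principle number; keys mirror the valid-arguments list.
const PRINCIPLE_ALIASES: Record<string, number> = {
  "action parity": 1, "1": 1,
  tools: 2, primitives: 2, "2": 2,
  context: 3, injection: 3, "3": 3,
  shared: 4, workspace: 4, "4": 4,
  crud: 5, "5": 5,
  ui: 6, integration: 6, "6": 6,
  discovery: 7, "7": 7,
  prompt: 8, features: 8, "8": 8,
};

// Returns undefined for unknown input, in which case run the full 8-agent audit.
function resolvePrinciple(arg: string): number | undefined {
  return PRINCIPLE_ALIASES[arg.trim().toLowerCase()];
}
```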