diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json
index 0924252..b4a7af0 100644
--- a/plugins/compound-engineering/.claude-plugin/plugin.json
+++ b/plugins/compound-engineering/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "compound-engineering",
-  "version": "2.22.0",
-  "description": "AI-powered development tools. 27 agents, 20 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
+  "version": "2.23.0",
+  "description": "AI-powered development tools. 27 agents, 21 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
   "author": {
     "name": "Kieran Klaassen",
     "email": "kieran@every.to",
diff --git a/plugins/compound-engineering/CHANGELOG.md b/plugins/compound-engineering/CHANGELOG.md
index 843a916..40c116e 100644
--- a/plugins/compound-engineering/CHANGELOG.md
+++ b/plugins/compound-engineering/CHANGELOG.md
@@ -5,6 +5,23 @@ All notable changes to the compound-engineering plugin will be documented in thi
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.23.0] - 2026-01-08
+
+### Added
+
+- **`/agent-native-audit` command** - Comprehensive agent-native architecture review
+  - Launches 8 parallel sub-agents, one per core principle
+  - Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
+  - Each agent produces a specific score (X/Y format with percentage)
+  - Generates a summary report with overall score and top 10 recommendations
+  - Supports single-principle audit via argument
+
+### Summary
+
+- 27 agents, 21 commands, 13 skills, 2 MCP servers
+
+---
+
 ## [2.22.0] - 2026-01-05
 
 ### Added
diff --git a/plugins/compound-engineering/commands/agent-native-audit.md b/plugins/compound-engineering/commands/agent-native-audit.md
new file mode 100644
index 0000000..95253b2
--- /dev/null
+++ b/plugins/compound-engineering/commands/agent-native-audit.md
@@ -0,0 +1,277 @@
+---
+name: agent-native-audit
+description: Run comprehensive agent-native architecture review with scored principles
+argument-hint: "[optional: specific principle to audit]"
+---
+
+# Agent-Native Architecture Audit
+
+Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.
+
+## Core Principles to Audit
+
+1. **Action Parity** - "Whatever the user can do, the agent can do"
+2. **Tools as Primitives** - "Tools provide capability, not behavior"
+3. **Context Injection** - "System prompt includes dynamic context about app state"
+4. **Shared Workspace** - "Agent and user work in the same data space"
+5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
+6. **UI Integration** - "Agent actions immediately reflected in UI"
+7. **Capability Discovery** - "Users can discover what the agent can do"
+8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"
+
+## Workflow
+
+### Step 1: Load the Agent-Native Skill
+
+First, invoke the agent-native-architecture skill to understand all principles:
+
+```
+/compound-engineering:agent-native-architecture
+```
+
+Select the action parity option to load the full reference material.
+
+### Step 2: Launch Parallel Sub-Agents
+
+Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:
+
+1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
+2. Check compliance against the principle
+3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
+4. List specific gaps and recommendations
+
+**Agent 1: Action Parity**
+```
+Audit for ACTION PARITY - "Whatever the user can do, the agent can do."
+
+Tasks:
+1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
+   - Search for API service files, fetch calls, form handlers
+   - Check routes and components for user interactions
+2. Check which have corresponding agent tools
+   - Search for agent tool definitions
+   - Map user actions to agent capabilities
+3. Score: "Agent can do X out of Y user actions"
+
+Format:
+## Action Parity Audit
+### User Actions Found
+| Action | Location | Agent Tool | Status |
+### Score: X/Y (percentage%)
+### Missing Agent Tools
+### Recommendations
+```
+
+**Agent 2: Tools as Primitives**
+```
+Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."
+
+Tasks:
+1. Find and read ALL agent tool files
+2. Classify each as:
+   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
+   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
+3. Score: "X out of Y tools are proper primitives"
+
+Format:
+## Tools as Primitives Audit
+### Tool Analysis
+| Tool | File | Type | Reasoning |
+### Score: X/Y (percentage%)
+### Problematic Tools (workflows that should be primitives)
+### Recommendations
+```
+
+**Agent 3: Context Injection**
+```
+Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"
+
+Tasks:
+1. Find context injection code (search for "context", "system prompt", "inject")
+2. Read agent prompts and system messages
+3. Enumerate what IS injected vs what SHOULD be:
+   - Available resources (files, drafts, documents)
+   - User preferences/settings
+   - Recent activity
+   - Available capabilities listed
+   - Session history
+   - Workspace state
+
+Format:
+## Context Injection Audit
+### Context Types Analysis
+| Context Type | Injected? | Location | Notes |
+### Score: X/Y (percentage%)
+### Missing Context
+### Recommendations
+```
+
+**Agent 4: Shared Workspace**
+```
+Audit for SHARED WORKSPACE - "Agent and user work in the same data space"
+
+Tasks:
+1. Identify all data stores/tables/models
+2. Check if agents read/write to SAME tables or separate ones
+3. Look for sandbox isolation anti-pattern (agent has separate data space)
+
+Format:
+## Shared Workspace Audit
+### Data Store Analysis
+| Data Store | User Access | Agent Access | Shared? |
+### Score: X/Y (percentage%)
+### Isolated Data (anti-pattern)
+### Recommendations
+```
+
+**Agent 5: CRUD Completeness**
+```
+Audit for CRUD COMPLETENESS - "Every entity has full CRUD"
+
+Tasks:
+1. Identify all entities/models in the codebase
+2. For each entity, check if agent tools exist for:
+   - Create
+   - Read
+   - Update
+   - Delete
+3. Score per entity and overall
+
+Format:
+## CRUD Completeness Audit
+### Entity CRUD Analysis
+| Entity | Create | Read | Update | Delete | Score |
+### Overall Score: X/Y entities with full CRUD (percentage%)
+### Incomplete Entities (list missing operations)
+### Recommendations
+```
+
+**Agent 6: UI Integration**
+```
+Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"
+
+Tasks:
+1. Check how agent writes/changes propagate to frontend
+2. Look for:
+   - Streaming updates (SSE, WebSocket)
+   - Polling mechanisms
+   - Shared state/services
+   - Event buses
+   - File watching
+3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)
+
+Format:
+## UI Integration Audit
+### Agent Action → UI Update Analysis
+| Agent Action | UI Mechanism | Immediate? | Notes |
+### Score: X/Y (percentage%)
+### Silent Actions (anti-pattern)
+### Recommendations
+```
+
+**Agent 7: Capability Discovery**
+```
+Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"
+
+Tasks:
+1. Check for these 7 discovery mechanisms:
+   - Onboarding flow showing agent capabilities
+   - Help documentation
+   - Capability hints in UI
+   - Agent self-describes in responses
+   - Suggested prompts/actions
+   - Empty state guidance
+   - Slash commands (/help, /tools)
+2. Score against 7 mechanisms
+
+Format:
+## Capability Discovery Audit
+### Discovery Mechanism Analysis
+| Mechanism | Exists? | Location | Quality |
+### Score: X/7 (percentage%)
+### Missing Discovery
+### Recommendations
+```
+
+**Agent 8: Prompt-Native Features**
+```
+Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"
+
+Tasks:
+1. Read all agent prompts
+2. Classify each feature/behavior as defined in:
+   - PROMPT (good): outcomes defined in natural language
+   - CODE (bad): business logic hardcoded
+3. Check if behavior changes require prompt edit vs code change
+
+Format:
+## Prompt-Native Features Audit
+### Feature Definition Analysis
+| Feature | Defined In | Type | Notes |
+### Score: X/Y (percentage%)
+### Code-Defined Features (anti-pattern)
+### Recommendations
+```
+
+### Step 3: Compile Summary Report
+
+After all agents complete, compile a summary with:
+
+```markdown
+## Agent-Native Architecture Review: [Project Name]
+
+### Overall Score Summary
+
+| Core Principle | Score | Percentage | Status |
+|----------------|-------|------------|--------|
+| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
+| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
+| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
+| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
+| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
+| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
+| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
+| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |
+
+**Overall Agent-Native Score: X%**
+
+### Status Legend
+- ✅ Excellent (80%+)
+- ⚠️ Partial (50-79%)
+- ❌ Needs Work (<50%)
+
+### Top 10 Recommendations by Impact
+
+| Priority | Action | Principle | Effort |
+|----------|--------|-----------|--------|
+
+### What's Working Excellently
+
+[List top 5 strengths]
+```
+
+## Success Criteria
+
+- [ ] All 8 sub-agents complete their audits
+- [ ] Each principle has a specific numeric score (X/Y format)
+- [ ] Summary table shows all scores and status indicators
+- [ ] Top 10 recommendations are prioritized by impact
+- [ ] Report identifies both strengths and gaps
+
+## Optional: Single Principle Audit
+
+If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.
+
+Valid arguments:
+- `action parity` or `1`
+- `tools` or `primitives` or `2`
+- `context` or `injection` or `3`
+- `shared` or `workspace` or `4`
+- `crud` or `5`
+- `ui` or `integration` or `6`
+- `discovery` or `7`
+- `prompt` or `features` or `8`
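
Reviewer's note: the summary report collapses eight per-principle "X/Y" scores into one overall percentage and a status indicator, but the command file leaves the aggregation method open. A minimal sketch of one reasonable reading — overall score as total achieved over total possible, status thresholds taken from the report's legend (80%+ / 50-79% / <50%) — with hypothetical helper names, not code from this patch:

```python
# Sketch only: aggregates hypothetical per-principle (achieved, possible)
# pairs the way the summary report template describes. The choice of a
# totals ratio (rather than averaging percentages) is an assumption.

def status(percentage: float) -> str:
    """Map a percentage to the report's status indicator per its legend."""
    if percentage >= 80:
        return "Excellent"
    if percentage >= 50:
        return "Partial"
    return "Needs Work"

def overall_score(scores: dict[str, tuple[int, int]]) -> float:
    """Overall percentage: total achieved over total possible, one decimal."""
    achieved = sum(x for x, _ in scores.values())
    possible = sum(y for _, y in scores.values())
    return round(100 * achieved / possible, 1)

# Example with made-up audit results for three principles:
scores = {
    "Action Parity": (18, 24),
    "CRUD Completeness": (3, 5),
    "Capability Discovery": (2, 7),
}
print(overall_score(scores))  # 63.9
for name, (x, y) in scores.items():
    print(f"{name}: {x}/{y} ({status(100 * x / y)})")
```

Averaging the eight percentages instead would weight a 2/7 discovery score the same as an 18/24 parity score; the totals ratio weights principles by how many checks they contain.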