[2.23.0] Add /agent-native-audit command

- New command for comprehensive agent-native architecture review
- Launches 8 parallel sub-agents, one per core principle
- Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
- Each agent produces a specific score (X/Y format with percentage)
- Generates a summary report with overall score and top 10 recommendations
- Supports single-principle audit via argument

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,7 +1,7 @@
 {
   "name": "compound-engineering",
-  "version": "2.22.0",
-  "description": "AI-powered development tools. 27 agents, 20 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
+  "version": "2.23.0",
+  "description": "AI-powered development tools. 27 agents, 21 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
   "author": {
     "name": "Kieran Klaassen",
     "email": "kieran@every.to",
@@ -5,6 +5,23 @@ All notable changes to the compound-engineering plugin will be documented in thi
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.23.0] - 2026-01-08
+
+### Added
+
+- **`/agent-native-audit` command** - Comprehensive agent-native architecture review
+  - Launches 8 parallel sub-agents, one per core principle
+  - Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
+  - Each agent produces a specific score (X/Y format with percentage)
+  - Generates summary report with overall score and top 10 recommendations
+  - Supports single-principle audit via argument
+
+### Summary
+
+- 27 agents, 21 commands, 13 skills, 2 MCP servers
+
+---
+
 ## [2.22.0] - 2026-01-05
 
 ### Added

plugins/compound-engineering/commands/agent-native-audit.md (new file, 277 lines)
@@ -0,0 +1,277 @@
---
name: agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
---

# Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

## Core Principles to Audit

1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"

## Workflow

### Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

```
/compound-engineering:agent-native-architecture
```

Select option 7 (action parity) to load the full reference material.
### Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:

1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations
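As an aside (not part of the command prompt itself), the "X out of Y (percentage%)" scoring each agent reports can be sketched as a tiny helper; the status thresholds assumed here mirror the legend used later in the summary report:

```python
def score_line(x: int, y: int) -> str:
    """Render an audit score as 'X/Y (Z%)' with a status marker.

    Thresholds follow the report's status legend:
    80%+ excellent, 50-79% partial, below 50% needs work.
    """
    if y == 0:
        return "0/0 (n/a)"
    pct = round(100 * x / y)
    status = "✅" if pct >= 80 else "⚠️" if pct >= 50 else "❌"
    return f"{x}/{y} ({pct}%) {status}"
```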
<sub-agents>

**Agent 1: Action Parity**
```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```
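A hypothetical sketch (not part of the sub-agent prompt) of the mapping this agent performs; the action and tool names below are invented for illustration, not taken from any real codebase:

```python
# Invented inventories; a real audit would collect these from frontend
# handlers and agent tool definitions.
user_actions = {"create_draft", "publish_draft", "delete_draft", "search"}
agent_tools = {"create_draft", "search"}

covered = user_actions & agent_tools          # actions the agent can perform
missing = sorted(user_actions - agent_tools)  # gaps to report

print(f"Agent can do {len(covered)} out of {len(user_actions)} user actions")
print("Missing agent tools:", ", ".join(missing))
```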
**Agent 2: Tools as Primitives**
```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```

**Agent 3: Context Injection**
```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```

**Agent 4: Shared Workspace**
```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```

**Agent 5: CRUD Completeness**
```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```
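The per-entity scoring above can be sketched as follows (an aside, not part of the sub-agent prompt); entity names and coverage are hypothetical:

```python
OPS = ("Create", "Read", "Update", "Delete")

# Hypothetical coverage an auditing agent might collect.
coverage = {
    "Document": {"Create", "Read", "Update", "Delete"},
    "Comment": {"Create", "Read"},
}

def crud_row(entity: str, ops: set) -> str:
    """One markdown row of the Entity CRUD Analysis table."""
    cells = ["✅" if op in ops else "❌" for op in OPS]
    return f"| {entity} | " + " | ".join(cells) + f" | {len(ops & set(OPS))}/4 |"

full = sum(1 for ops in coverage.values() if set(OPS) <= ops)
print(f"Overall Score: {full}/{len(coverage)} entities with full CRUD")
```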
**Agent 6: UI Integration**
```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```

**Agent 7: Capability Discovery**
```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```

**Agent 8: Prompt-Native Features**
```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```

</sub-agents>
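The fan-out/fan-in shape of Step 2 can be sketched as below, using threads as a stand-in for the Task tool's parallel sub-agents (`audit` is a stub; a real sub-agent explores the codebase):

```python
from concurrent.futures import ThreadPoolExecutor

PRINCIPLES = [
    "Action Parity", "Tools as Primitives", "Context Injection",
    "Shared Workspace", "CRUD Completeness", "UI Integration",
    "Capability Discovery", "Prompt-Native Features",
]

def audit(principle: str) -> dict:
    # Stub: a real sub-agent would enumerate instances, check compliance,
    # and return a specific X/Y score for this principle.
    return {"principle": principle, "score": (0, 0)}

# Fan out one worker per principle, then collect all reports for Step 3.
with ThreadPoolExecutor(max_workers=len(PRINCIPLES)) as pool:
    reports = list(pool.map(audit, PRINCIPLES))
```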
### Step 3: Compile Summary Report

After all agents complete, compile a summary with:

```markdown
## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend

- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]
```
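The command does not pin down how the eight percentages combine into the overall score; an unweighted mean is one reasonable reading, sketched here with placeholder numbers:

```python
def overall_score(results: dict) -> float:
    """Unweighted mean of per-principle percentages (one possible aggregation)."""
    pcts = [100 * x / y for x, y in results.values() if y]
    return round(sum(pcts) / len(pcts), 1)

# Placeholder (passed, total) pairs, not real audit output.
results = {
    "Action Parity": (12, 15),
    "CRUD Completeness": (3, 4),
    "Capability Discovery": (5, 7),
}
```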
## Success Criteria

- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps

## Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.

Valid arguments:

- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`
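Resolution of $ARGUMENTS to a principle number could look like the sketch below; the alias table simply mirrors the list above (the command file itself defines no code for this):

```python
# Hypothetical alias table mirroring the "Valid arguments" list.
ALIASES = {
    "action parity": 1, "tools": 2, "primitives": 2, "context": 3,
    "injection": 3, "shared": 4, "workspace": 4, "crud": 5, "ui": 6,
    "integration": 6, "discovery": 7, "prompt": 8, "features": 8,
    **{str(n): n for n in range(1, 9)},
}

def resolve(argument: str):
    """Map a user argument to a principle number; None means run the full audit."""
    return ALIASES.get(argument.strip().lower()) if argument else None
```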