[2.23.0] Add /agent-native-audit command
- New command for comprehensive agent-native architecture review
- Launches 8 parallel sub-agents, one per core principle
- Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
- Each agent produces a specific score (X/Y format with percentage)
- Generates a summary report with overall score and top 10 recommendations
- Supports single-principle audit via argument

Co-Authored-By: Claude <noreply@anthropic.com>
```diff
@@ -1,7 +1,7 @@
 {
   "name": "compound-engineering",
-  "version": "2.22.0",
-  "description": "AI-powered development tools. 27 agents, 20 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
+  "version": "2.23.0",
+  "description": "AI-powered development tools. 27 agents, 21 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
   "author": {
     "name": "Kieran Klaassen",
     "email": "kieran@every.to",
```
```diff
@@ -5,6 +5,23 @@ All notable changes to the compound-engineering plugin will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [2.23.0] - 2026-01-08
+
+### Added
+
+- **`/agent-native-audit` command** - Comprehensive agent-native architecture review
+  - Launches 8 parallel sub-agents, one per core principle
+  - Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
+  - Each agent produces a specific score (X/Y format with percentage)
+  - Generates a summary report with overall score and top 10 recommendations
+  - Supports single-principle audit via argument
+
+### Summary
+
+- 27 agents, 21 commands, 13 skills, 2 MCP servers
+
+---
+
 ## [2.22.0] - 2026-01-05

 ### Added
```
plugins/compound-engineering/commands/agent-native-audit.md (new file, 277 lines):
---
name: agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
---

# Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

## Core Principles to Audit

1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"

## Workflow

### Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

```
/compound-engineering:agent-native-architecture
```

Select option 7 (action parity) to load the full reference material.

### Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:

1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations
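To make the fan-out above concrete, here is a minimal TypeScript sketch of the launch pattern. `launchTask` is a hypothetical stand-in for the Task tool (real invocations are tool calls, not application code); only the parallel, one-agent-per-principle shape is the point.

```typescript
// Hypothetical fan-out: one sub-agent per principle, all launched in parallel.
// `launchTask` stands in for the Task tool; it is not a real API.
declare function launchTask(t: {
  subagentType: string;
  description: string;
  prompt: string;
}): Promise<string>;

const PRINCIPLES = [
  "Action Parity", "Tools as Primitives", "Context Injection",
  "Shared Workspace", "CRUD Completeness", "UI Integration",
  "Capability Discovery", "Prompt-Native Features",
] as const;

async function runAudit(promptFor: (principle: string) => string): Promise<string[]> {
  // Launch all eight audits concurrently and collect every report.
  return Promise.all(
    PRINCIPLES.map((p) =>
      launchTask({ subagentType: "Explore", description: `Audit: ${p}`, prompt: promptFor(p) }),
    ),
  );
}
```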
<sub-agents>

**Agent 1: Action Parity**

```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```
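As an illustration of the parity check itself (all names invented, not part of the command): every user-facing action should have a matching agent tool, and the score is the covered fraction.

```typescript
// Sketch of the parity computation; `UserAction` and the tool set are hypothetical.
interface UserAction {
  name: string;      // e.g. "archive_draft"
  location: string;  // e.g. "DraftList.tsx"
}

function actionParityScore(actions: UserAction[], agentTools: Set<string>) {
  const missing = actions.filter((a) => !agentTools.has(a.name));
  const covered = actions.length - missing.length;
  const pct = actions.length ? Math.round((covered / actions.length) * 100) : 100;
  return { score: `${covered}/${actions.length} (${pct}%)`, missing };
}

// actionParityScore([{ name: "archive_draft", location: "DraftList.tsx" }], new Set())
// -> { score: "0/1 (0%)", missing: [{ name: "archive_draft", ... }] }
```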
**Agent 2: Tools as Primitives**

```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```
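To clarify the classification, a TypeScript contrast (all names invented): a primitive exposes one capability and leaves decisions to the agent; a workflow hardcodes the decisions.

```typescript
// Stubs so the sketch type-checks; real storage and helpers will differ.
declare const db: {
  put(table: string, row: object): Promise<void>;
  get(table: string, id: string): Promise<{ body: string }>;
};
declare function applyHouseStyle(body: string): string;
declare function isMonday(): boolean;
declare function sendToSubscribers(body: string): Promise<void>;

// PRIMITIVE (good): one capability, no policy; the agent decides when and why.
async function saveDocument(id: string, body: string): Promise<void> {
  await db.put("documents", { id, body });
}

// WORKFLOW (bad): formatting, scheduling, and notification decisions live in code,
// so the agent cannot recompose the behavior from a prompt.
async function publishWeeklyNewsletter(): Promise<void> {
  const draft = await db.get("documents", "newsletter-draft");
  const formatted = applyHouseStyle(draft.body);
  if (isMonday()) await sendToSubscribers(formatted);
}
```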
**Agent 3: Context Injection**

```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```
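For reference, one common shape of the pattern this agent looks for, sketched with invented names: app state serialized into the system prompt at session start.

```typescript
// Hypothetical context payload; the real fields depend on the app.
interface AppContext {
  drafts: string[];
  preferences: Record<string, string>;
  recentActivity: string[];
}

// Build the system prompt fresh each session so the agent sees current state.
function buildSystemPrompt(base: string, ctx: AppContext): string {
  return [
    base,
    "## Current workspace",
    `Drafts: ${ctx.drafts.join(", ") || "none"}`,
    `Preferences: ${JSON.stringify(ctx.preferences)}`,
    `Recent activity: ${ctx.recentActivity.slice(0, 5).join("; ") || "none"}`,
  ].join("\n");
}
```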
**Agent 4: Shared Workspace**

```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```
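A minimal sketch of the distinction at the data layer (table and function names invented):

```typescript
// Storage stub; real persistence will differ.
declare const db: { put(table: string, row: object): Promise<void> };

// GOOD: the agent writes to the same table the UI reads from.
async function agentCreateDoc(body: string): Promise<void> {
  await db.put("documents", { body, author: "agent" });
}

// ANTI-PATTERN: a parallel agent-only store the user never sees.
async function agentCreateDocIsolated(body: string): Promise<void> {
  await db.put("agent_documents", { body }); // invisible to the UI
}
```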
**Agent 5: CRUD Completeness**

```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```
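For illustration, a complete agent-side tool set for one hypothetical entity ("notes"); the audit flags any entity missing one of the four operations.

```typescript
// Storage stub; real persistence will differ.
declare const db: {
  insert(table: string, row: object): Promise<string>;
  get(table: string, id: string): Promise<object>;
  update(table: string, id: string, patch: object): Promise<void>;
  remove(table: string, id: string): Promise<void>;
};

// Full CRUD coverage for "notes": each operation is exposed as an agent tool.
const noteTools = {
  create: (body: string) => db.insert("notes", { body }),
  read: (id: string) => db.get("notes", id),
  update: (id: string, body: string) => db.update("notes", id, { body }),
  delete: (id: string) => db.remove("notes", id),
};
```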
**Agent 6: UI Integration**

```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```
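One possible propagation mechanism, sketched with invented names (an in-process event bus; SSE or WebSockets serve the same role across a network): each agent mutation emits an event the UI subscribes to, which is the opposite of a silent action.

```typescript
// In-process event bus; a real app might use SSE or WebSockets instead.
const bus = new EventTarget();

function agentWrite(entity: string, record: object): void {
  // ...persist the record, then notify the UI layer immediately.
  bus.dispatchEvent(new CustomEvent("entity-changed", { detail: { entity, record } }));
}

// UI side: re-render whichever view shows the changed entity.
bus.addEventListener("entity-changed", (e) => {
  const { entity } = (e as CustomEvent).detail;
  console.log(`refresh view for ${entity}`);
});
```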
**Agent 7: Capability Discovery**

```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```
**Agent 8: Prompt-Native Features**

```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```
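An illustrative contrast for this classification (the triage feature is invented): prompt-native means a behavior change is a prompt edit, not a deploy.

```typescript
// PROMPT-NATIVE (good): the outcome is stated in natural language;
// editing this string changes the feature.
const triagePrompt = `
When a new support ticket arrives, label it bug/feature/question,
draft a first reply, and flag anything urgent for a human.
`;

// CODE-DEFINED (bad): the same feature hardcoded; every rule change is a release.
function triage(ticket: { text: string }): "bug" | "feature" | "question" {
  if (ticket.text.includes("crash")) return "bug";
  if (ticket.text.includes("please add")) return "feature";
  return "question";
}
```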
</sub-agents>

### Step 3: Compile Summary Report

After all agents complete, compile a summary with:

```markdown
## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]
```
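The status column follows the legend's thresholds; as a small sketch:

```typescript
// Map a percentage to the legend's status marker (thresholds from the legend above).
function statusFor(pct: number): string {
  if (pct >= 80) return "✅"; // Excellent
  if (pct >= 50) return "⚠️"; // Partial
  return "❌"; // Needs Work
}
```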
## Success Criteria

- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps

## Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), run only that sub-agent and provide detailed findings for that principle alone.

Valid arguments:

- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`
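Mirroring the list above, a hypothetical alias table for resolving $ARGUMENTS to a principle number:

```typescript
// Alias -> principle number; keys mirror the valid-arguments list.
const PRINCIPLE_ALIASES: Record<string, number> = {
  "action parity": 1, "1": 1,
  tools: 2, primitives: 2, "2": 2,
  context: 3, injection: 3, "3": 3,
  shared: 4, workspace: 4, "4": 4,
  crud: 5, "5": 5,
  ui: 6, integration: 6, "6": 6,
  discovery: 7, "7": 7,
  prompt: 8, features: 8, "8": 8,
};

// Returns undefined for unknown input, in which case run the full 8-agent audit.
function resolvePrinciple(arg: string): number | undefined {
  return PRINCIPLE_ALIASES[arg.trim().toLowerCase()];
}
```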