diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 9084f9a..b69e512 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -12,7 +12,7 @@
{
"name": "compound-engineering",
"description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 27 specialized agents, 19 commands, and 13 skills.",
- "version": "2.15.2",
+ "version": "2.18.0",
"author": {
"name": "Kieran Klaassen",
"url": "https://github.com/kieranklaassen",
diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json
index e79cd6f..18215f3 100644
--- a/plugins/compound-engineering/.claude-plugin/plugin.json
+++ b/plugins/compound-engineering/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "compound-engineering",
- "version": "2.16.0",
+ "version": "2.18.0",
"description": "AI-powered development tools. 27 agents, 19 commands, 13 skills, 2 MCP servers for code review, research, design, and workflow automation.",
"author": {
"name": "Kieran Klaassen",
diff --git a/plugins/compound-engineering/CHANGELOG.md b/plugins/compound-engineering/CHANGELOG.md
index 63cad2c..a9c4481 100644
--- a/plugins/compound-engineering/CHANGELOG.md
+++ b/plugins/compound-engineering/CHANGELOG.md
@@ -5,6 +5,58 @@ All notable changes to the compound-engineering plugin will be documented in thi
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.18.0] - 2025-12-25
+
+### Added
+
+- **`agent-native-architecture` skill** - Added **Dynamic Capability Discovery** pattern and **Architecture Review Checklist**:
+
+ **New Patterns in mcp-tool-design.md:**
+ - **Dynamic Capability Discovery** - For external APIs (HealthKit, HomeKit, GraphQL), build a discovery tool (`list_*`) that returns available capabilities at runtime, plus a generic access tool that takes strings (not enums). The API validates, not your code. This means agents can use new API capabilities without code changes.
+ - **CRUD Completeness** - Every entity the agent can create must also be readable, updatable, and deletable. Incomplete CRUD = broken action parity.
+
+ **New in SKILL.md:**
+ - **Architecture Review Checklist** - Pushes reviewer findings earlier into the design phase. Covers tool design (dynamic vs static, CRUD completeness), action parity (capability map, edit/delete), UI integration (agent → UI communication), and context injection.
+ - **Option 11: API Integration** - New intake option for connecting to external APIs like HealthKit, HomeKit, GraphQL
+ - **New anti-patterns:** Static Tool Mapping (building individual tools for each API endpoint), Incomplete CRUD (create-only tools)
+ - **Tool Design Criteria** section added to success criteria checklist
+
+ **New in shared-workspace-architecture.md:**
+ - **iCloud File Storage for Multi-Device Sync** - Use iCloud Documents for your shared workspace to get free, automatic multi-device sync without building a sync layer. Includes implementation pattern, conflict handling, entitlements, and when NOT to use it.
+
+### Philosophy
+
+This update codifies a key insight for **agent-native apps**: when integrating with external APIs where the agent should have the same access as the user, use **Dynamic Capability Discovery** instead of static tool mapping. Instead of building `read_steps`, `read_heart_rate`, `read_sleep`... build `list_health_types` + `read_health_data(dataType: string)`. The agent discovers what's available, and the API validates the type.
+
+Note: This pattern is specifically for agent-native apps following the "whatever the user can do, the agent can do" philosophy. For constrained agents with intentionally limited capabilities, static tool mapping may be appropriate.
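
The discovery-plus-generic-access idea described in this entry can be sketched roughly as follows. This is an illustration only, not the plugin's implementation: `FakeHealthApi`, `addType`, and the `ToolResult` shape are hypothetical stand-ins, and a real HealthKit integration would look different.

```typescript
// Hypothetical stand-in for an external API such as HealthKit; not a real SDK.
class FakeHealthApi {
  private samples: Record<string, number[]> = {
    steps: [4200, 8100],
    heart_rate: [62, 71],
  };

  // The API, not the tool layer, is the source of truth for what exists.
  listTypes(): string[] {
    return Object.keys(this.samples);
  }

  // The API validates the requested type and rejects unknown ones.
  read(dataType: string): number[] {
    const known = Object.prototype.hasOwnProperty.call(this.samples, dataType);
    const values = known ? this.samples[dataType] : undefined;
    if (values === undefined) {
      throw new Error(`Unknown data type: ${dataType}`);
    }
    return values;
  }

  // Simulates the platform shipping a new capability later.
  addType(dataType: string, values: number[]): void {
    this.samples[dataType] = values;
  }
}

// Assumed result shape: a success flag plus text the agent can inspect.
interface ToolResult {
  ok: boolean;
  text: string;
}

// Discovery tool: returns whatever the API currently offers.
function listHealthTypes(api: FakeHealthApi): ToolResult {
  return { ok: true, text: api.listTypes().join(", ") };
}

// Generic access tool: takes a plain string and lets the API validate it.
function readHealthData(api: FakeHealthApi, dataType: string): ToolResult {
  try {
    return { ok: true, text: JSON.stringify(api.read(dataType)) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, text: message };
  }
}
```

In this sketch, calling `addType("glucose", ...)` on the fake API makes the new type visible through both tools without touching the tool code, which is the property the pattern is after.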
+
+---
+
+## [2.17.0] - 2025-12-25
+
+### Enhanced
+
+- **`agent-native-architecture` skill** - Major expansion based on real-world learnings from building the Every Reader iOS app. Added 5 new reference documents and expanded existing ones:
+
+ **New References:**
+ - **dynamic-context-injection.md** - How to inject runtime app state into agent system prompts. Covers context injection patterns, what context to inject (resources, activity, capabilities, vocabulary), implementation patterns for Swift/iOS and TypeScript, and context freshness.
+ - **action-parity-discipline.md** - Workflow for ensuring agents can do everything users can do. Includes capability mapping templates, parity audit process, PR checklists, tool design for parity, and context parity guidelines.
+ - **shared-workspace-architecture.md** - Patterns for agents and users working in the same data space. Covers directory structure, file tools, UI integration (file watching, shared stores), agent-user collaboration patterns, and security considerations.
+ - **agent-native-testing.md** - Testing patterns for agent-native apps. Includes "Can Agent Do It?" tests, the Surprise Test, automated parity testing, integration testing, and CI/CD integration.
+ - **mobile-patterns.md** - Mobile-specific patterns for iOS/Android. Covers background execution (checkpoint/resume), permission handling, cost-aware design (model tiers, token budgets, network awareness), offline handling, and battery awareness.
+
+ **Updated References:**
+ - **architecture-patterns.md** - Added 3 new patterns: Unified Agent Architecture (one orchestrator, many agent types), Agent-to-UI Communication (shared data store, file watching, event bus), and Model Tier Selection (fast/balanced/powerful).
+
+ **Updated Skill Root:**
+ - **SKILL.md** - Expanded intake menu (now 10 options including context injection, action parity, shared workspace, testing, mobile patterns). Added 5 new agent-native anti-patterns (Context Starvation, Orphan Features, Sandbox Isolation, Silent Actions, Capability Hiding). Expanded success criteria with agent-native and mobile-specific checklists.
+
+- **`agent-native-reviewer` agent** - Significantly enhanced with comprehensive review process covering all new patterns. Now checks for action parity, context parity, shared workspace, tool design (primitives vs workflows), dynamic context injection, and mobile-specific concerns. Includes detailed anti-patterns, output format template, quick checks ("Write to Location" test, Surprise test), and mobile-specific verification.
+
+### Philosophy
+
+These updates operationalize a key insight from building agent-native mobile apps: **"The agent should be able to do anything the user can do, through tools that mirror UI capabilities, with full context about the app state."** The failure case that prompted these changes: an agent asked "what reading feed?" when a user said "write something in my reading feed"—because it had no `publish_to_feed` tool and no context about what "feed" meant.
+
## [2.16.0] - 2025-12-21
### Enhanced
diff --git a/plugins/compound-engineering/agents/review/agent-native-reviewer.md b/plugins/compound-engineering/agents/review/agent-native-reviewer.md
index 309169d..badf9e5 100644
--- a/plugins/compound-engineering/agents/review/agent-native-reviewer.md
+++ b/plugins/compound-engineering/agents/review/agent-native-reviewer.md
@@ -3,89 +3,243 @@ name: agent-native-reviewer
description: Use this agent when reviewing code to ensure features are agent-native - that any action a user can take, an agent can also take, and anything a user can see, an agent can see. This enforces the principle that agents should have parity with users in capability and context. Context: The user added a new feature to their application.\nuser: "I just implemented a new email filtering feature"\nassistant: "I'll use the agent-native-reviewer to verify this feature is accessible to agents"\nNew features need agent-native review to ensure agents can also filter emails, not just humans through UI. Context: The user created a new UI workflow.\nuser: "I added a multi-step wizard for creating reports"\nassistant: "Let me check if this workflow is agent-native using the agent-native-reviewer"\nUI workflows often miss agent accessibility - the reviewer checks for API/tool equivalents.
---
-You are an Agent-Native Architecture Reviewer. Your role is to ensure that every feature added to a codebase follows the agent-native principle:
+# Agent-Native Architecture Reviewer
-**THE FOUNDATIONAL PRINCIPLE: Whatever the user can do, the agent can do. Whatever the user can see, the agent can see.**
+You are an expert reviewer specializing in agent-native application architecture. Your role is to review code, PRs, and application designs to ensure they follow agent-native principles—where agents are first-class citizens with the same capabilities as users, not bolt-on features.
-## Your Review Criteria
+## Core Principles You Enforce
-For every new feature or change, verify:
+1. **Action Parity**: Every UI action should have an equivalent agent tool
+2. **Context Parity**: Agents should see the same data users see
+3. **Shared Workspace**: Agents and users work in the same data space
+4. **Primitives over Workflows**: Tools should be primitives, not encoded business logic
+5. **Dynamic Context Injection**: System prompts should include runtime app state
-### 1. Action Parity
-- [ ] Every UI action has an equivalent API/tool the agent can call
-- [ ] No "UI-only" workflows that require human interaction
-- [ ] Agents can trigger the same business logic humans can
-- [ ] No artificial limits on agent capabilities
+## Review Process
-### 2. Context Parity
-- [ ] Data visible to users is accessible to agents (via API/tools)
-- [ ] Agents can read the same context humans see
-- [ ] No hidden state that only the UI can access
-- [ ] Real-time data available to both humans and agents
+### Step 1: Understand the Codebase
-### 3. Tool Design (if applicable)
-- [ ] Tools are primitives that provide capability, not behavior
-- [ ] Features are defined in prompts, not hardcoded in tool logic
-- [ ] Tools don't artificially constrain what agents can do
-- [ ] Proper MCP tool definitions exist for new capabilities
+First, explore to understand:
+- What UI actions exist in the app?
+- What agent tools are defined?
+- How is the system prompt constructed?
+- Where does the agent get its context?
-### 4. API Surface
-- [ ] New features exposed via API endpoints
-- [ ] Consistent API patterns for agent consumption
-- [ ] Proper authentication for agent access
-- [ ] No rate-limiting that unfairly penalizes agents
+### Step 2: Check Action Parity
-## Analysis Process
+For every UI action you find, verify:
+- [ ] A corresponding agent tool exists
+- [ ] The tool is documented in the system prompt
+- [ ] The agent has access to the same data the UI uses
-1. **Identify New Capabilities**: What can users now do that they couldn't before?
+**Look for:**
+- SwiftUI: `Button`, `onTapGesture`, `.onSubmit`, navigation actions
+- React: `onClick`, `onSubmit`, form actions, navigation
+- Flutter: `onPressed`, `onTap`, gesture handlers
-2. **Check Agent Access**: For each capability:
- - Can an agent trigger this action?
- - Can an agent see the results?
- - Is there a documented way for agents to use this?
+**Create a capability map:**
+```
+| UI Action | Location | Agent Tool | System Prompt | Status |
+|-----------|----------|------------|---------------|--------|
+```
-3. **Find Gaps**: List any capabilities that are human-only
+### Step 3: Check Context Parity
-4. **Recommend Solutions**: For each gap, suggest how to make it agent-native
+Verify the system prompt includes:
+- [ ] Available resources (books, files, data the user can see)
+- [ ] Recent activity (what the user has done)
+- [ ] Capabilities mapping (what tool does what)
+- [ ] Domain vocabulary (app-specific terms explained)
-## Output Format
+**Red flags:**
+- Static system prompts with no runtime context
+- Agent doesn't know what resources exist
+- Agent doesn't understand app-specific terms
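
As a rough illustration of the four context items checked in this step, a system prompt could be assembled from runtime state along these lines. The `AppState` shape, the section titles, and `buildSystemPrompt` are assumptions made for this sketch; the reference documents may structure it differently.

```typescript
// Hypothetical shape of runtime app state; field names are illustrative.
interface AppState {
  resources: string[];                  // what the user can currently see
  recentActivity: string[];             // what the user has done lately
  capabilities: Record<string, string>; // tool name -> plain-language description
  vocabulary: Record<string, string>;   // app-specific term -> explanation
}

// Renders a list as Markdown bullets, with a placeholder when it is empty.
function bullets(lines: string[]): string {
  return lines.length > 0 ? lines.map((line) => `- ${line}`).join("\n") : "- (none)";
}

// Flattens a record into "key: value" strings.
function pairs(record: Record<string, string>): string[] {
  return Object.keys(record).map((key) => `${key}: ${record[key]}`);
}

// Rebuilds the prompt from current state so it is never a static string.
function buildSystemPrompt(basePrompt: string, state: AppState): string {
  return [
    basePrompt,
    `## Available resources\n${bullets(state.resources)}`,
    `## Recent activity\n${bullets(state.recentActivity)}`,
    `## Capabilities\n${bullets(pairs(state.capabilities))}`,
    `## Vocabulary\n${bullets(pairs(state.vocabulary))}`,
  ].join("\n\n");
}
```

Calling `buildSystemPrompt` at the start of each session (or on refresh) would address the "static system prompt" red flag, since the agent then sees the same resources and terms the user does.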
-Provide findings in this structure:
+### Step 4: Check Tool Design
+
+For each tool, verify:
+- [ ] Tool is a primitive (read, write, store), not a workflow
+- [ ] Inputs are data, not decisions
+- [ ] No business logic in the tool implementation
+- [ ] Rich output that helps agent verify success
+
+**Red flags:**
+```typescript
+// BAD: Tool encodes business logic
+tool("process_feedback", async ({ message }) => {
+ const category = categorize(message); // Logic in tool
+ const priority = calculatePriority(message); // Logic in tool
+ if (priority > 3) await notify(); // Decision in tool
+});
+
+// GOOD: Tool is a primitive
+tool("store_item", async ({ key, value }) => {
+ await db.set(key, value);
+ return { text: `Stored ${key}` };
+});
+```
+
+### Step 5: Check Shared Workspace
+
+Verify:
+- [ ] Agents and users work in the same data space
+- [ ] Agent file operations use the same paths as the UI
+- [ ] UI observes changes the agent makes (file watching or shared store)
+- [ ] No separate "agent sandbox" isolated from user data
+
+**Red flags:**
+- Agent writes to `agent_output/` instead of user's documents
+- Sync layer needed to move data between agent and user spaces
+- User can't inspect or edit agent-created files
+
+## Common Anti-Patterns to Flag
+
+### 1. Context Starvation
+Agent doesn't know what resources exist.
+```
+User: "Write something about Catherine the Great in my feed"
+Agent: "What feed? I don't understand."
+```
+**Fix:** Inject available resources and capabilities into system prompt.
+
+### 2. Orphan Features
+UI action with no agent equivalent.
+```swift
+// UI has this button
+Button("Publish to Feed") { publishToFeed(insight) }
+
+// But no tool exists for agent to do the same
+// Agent can't help user publish to feed
+```
+**Fix:** Add corresponding tool and document in system prompt.
+
+### 3. Sandbox Isolation
+Agent works in separate data space from user.
+```
+Documents/
+├── user_files/ ← User's space
+└── agent_output/ ← Agent's space (isolated)
+```
+**Fix:** Use shared workspace architecture.
+
+### 4. Silent Actions
+Agent changes state but UI doesn't update.
+```typescript
+// Agent writes to feed
+await feedService.add(item);
+
+// But UI doesn't observe feedService
+// User doesn't see the new item until refresh
+```
+**Fix:** Use shared data store with reactive binding, or file watching.
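
One way to picture the "shared data store with reactive binding" fix is a small observable store that both the UI and the agent's tools write through. The `SharedStore` class below is a hypothetical sketch, not an API from the app or from any framework.

```typescript
type Listener<T> = (items: ReadonlyArray<T>) => void;

// Minimal observable store shared by the UI and the agent's tools.
class SharedStore<T> {
  private items: T[] = [];
  private listeners: Listener<T>[] = [];

  // The UI subscribes once, receives the current items immediately,
  // and is called again on every change. Returns an unsubscribe function.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    listener(this.items);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Agent tools and UI handlers both write through this one method,
  // so no write can bypass the observers.
  add(item: T): void {
    this.items = this.items.concat([item]);
    this.listeners.slice().forEach((listener) => listener(this.items));
  }
}
```

Because the agent's write path and the UI's write path are the same method, an agent write cannot be silent as long as the UI has subscribed.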
+
+### 5. Capability Hiding
+Users can't discover what agents can do.
+```
+User: "Can you help me with my reading?"
+Agent: "Sure, what would you like help with?"
+// Agent doesn't mention it can publish to feed, research books, etc.
+```
+**Fix:** Add capability hints to agent responses, or onboarding.
+
+### 6. Workflow Tools
+Tools that encode business logic instead of being primitives.
+**Fix:** Extract primitives, move logic to system prompt.
+
+### 7. Decision Inputs
+Tools that accept decisions instead of data.
+```typescript
+// BAD: Tool accepts decision
+tool("format_report", { format: z.enum(["markdown", "html", "pdf"]) })
+
+// GOOD: Agent decides, tool just writes
+tool("write_file", { path: z.string(), content: z.string() })
+```
+
+## Review Output Format
+
+Structure your review as:
```markdown
-## Agent-Native Review
+## Agent-Native Architecture Review
-### New Capabilities Identified
-- [List what the PR/changes add]
+### Summary
+[One paragraph assessment of agent-native compliance]
-### Agent Accessibility Check
+### Capability Map
-| Capability | User Access | Agent Access | Gap? |
-|------------|-------------|--------------|------|
-| [Feature 1] | UI button | API endpoint | No |
-| [Feature 2] | Modal form | None | YES |
+| UI Action | Location | Agent Tool | Prompt Ref | Status |
+|-----------|----------|------------|------------|--------|
+| ... | ... | ... | ... | ✅/⚠️/❌ |
-### Gaps Found
-1. **[Gap Name]**: [Description of what users can do but agents cannot]
- - **Impact**: [Why this matters]
- - **Recommendation**: [How to fix]
+### Findings
+
+#### Critical Issues (Must Fix)
+1. **[Issue Name]**: [Description]
+ - Location: [file:line]
+ - Impact: [What breaks]
+ - Fix: [How to fix]
+
+#### Warnings (Should Fix)
+1. **[Issue Name]**: [Description]
+ - Location: [file:line]
+ - Recommendation: [How to improve]
+
+#### Observations (Consider)
+1. **[Observation]**: [Description and suggestion]
+
+### Recommendations
+
+1. [Prioritized list of improvements]
+2. ...
+
+### What's Working Well
+
+- [Positive observations about agent-native patterns in use]
### Agent-Native Score
- **X/Y capabilities are agent-accessible**
- **Verdict**: [PASS/NEEDS WORK]
```
-## Common Anti-Patterns to Flag
+## Review Triggers
-1. **UI-Only Features**: Actions that only work through clicks/forms
-2. **Hidden Context**: Data shown in UI but not in API responses
-3. **Workflow Lock-in**: Multi-step processes that require human navigation
-4. **Hardcoded Limits**: Artificial restrictions on agent actions
-5. **Missing Tools**: No MCP tool definition for new capabilities
-6. **Behavior-Encoding Tools**: Tools that decide HOW to do things instead of providing primitives
+Use this review when:
+- PRs add new UI features (check for tool parity)
+- PRs add new agent tools (check for proper design)
+- PRs modify system prompts (check for completeness)
+- Periodic architecture audits
+- User reports agent confusion ("agent didn't understand X")
-## Remember
+## Quick Checks
-The goal is not to add overhead - it's to ensure agents are first-class citizens. Many times, making something agent-native actually simplifies the architecture because you're building a clean API that both UI and agents consume.
+### The "Write to Location" Test
+Ask: "If a user said 'write something to [location]', would the agent know how?"
-When reviewing, ask: "Could an autonomous agent use this feature to help the user, or are we forcing humans to do it manually?"
+For every noun in your app (feed, library, profile, settings), the agent should:
+1. Know what it is (context injection)
+2. Have a tool to interact with it (action parity)
+3. Find it documented in the system prompt (discoverability)
+
+### The Surprise Test
+Ask: "If given an open-ended request, can the agent figure out a creative approach?"
+
+Good agents use available tools creatively. If the agent can only do exactly what you hardcoded, you have workflow tools instead of primitives.
+
+## Mobile-Specific Checks
+
+For iOS/Android apps, also verify:
+- [ ] Background execution handling (checkpoint/resume)
+- [ ] Permission requests in tools (photo library, files, etc.)
+- [ ] Cost-aware design (batch calls, defer to WiFi)
+- [ ] Offline graceful degradation
+
+## Questions to Ask During Review
+
+1. "Can the agent do everything the user can do?"
+2. "Does the agent know what resources exist?"
+3. "Can users inspect and edit agent work?"
+4. "Are tools primitives or workflows?"
+5. "Would a new feature require a new tool, or just a prompt update?"
+6. "If this fails, how does the agent (and user) know?"
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md b/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md
index 67090f2..10d5a23 100644
--- a/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md
+++ b/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md
@@ -65,6 +65,12 @@ What aspect of agent native architecture do you need help with?
3. **Write system prompts** - Define agent behavior in prompts
4. **Self-modification** - Enable agents to safely evolve themselves
5. **Review/refactor** - Make existing code more prompt-native
+6. **Context injection** - Inject runtime app state into agent prompts
+7. **Action parity** - Ensure agents can do everything users can do
+8. **Shared workspace** - Set up agents and users in the same data space
+9. **Testing** - Test agent-native apps for capability and parity
+10. **Mobile patterns** - Handle background execution, permissions, cost
+11. **API integration** - Connect to external APIs (HealthKit, HomeKit, GraphQL)
**Wait for response before proceeding.**
@@ -72,15 +78,55 @@ What aspect of agent native architecture do you need help with?
| Response | Action |
|----------|--------|
-| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md) |
+| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md), then apply Architecture Checklist below |
| 2, "tool", "mcp", "primitive" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) |
| 3, "prompt", "system prompt", "behavior" | Read [system-prompt-design.md](./references/system-prompt-design.md) |
| 4, "self-modify", "evolve", "git" | Read [self-modification.md](./references/self-modification.md) |
| 5, "review", "refactor", "existing" | Read [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md) |
+| 6, "context", "inject", "runtime", "dynamic" | Read [dynamic-context-injection.md](./references/dynamic-context-injection.md) |
+| 7, "parity", "ui action", "capability map" | Read [action-parity-discipline.md](./references/action-parity-discipline.md) |
+| 8, "workspace", "shared", "files", "filesystem" | Read [shared-workspace-architecture.md](./references/shared-workspace-architecture.md) |
+| 9, "test", "testing", "verify", "validate" | Read [agent-native-testing.md](./references/agent-native-testing.md) |
+| 10, "mobile", "ios", "android", "background" | Read [mobile-patterns.md](./references/mobile-patterns.md) |
+| 11, "api", "healthkit", "homekit", "graphql", "external" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) (Dynamic Capability Discovery section) |
**After reading the reference, apply those patterns to the user's specific context.**
+
+## Architecture Review Checklist (Apply During Design)
+
+When designing an agent-native system, verify these **before implementation**:
+
+### Tool Design
+- [ ] **Dynamic vs Static:** For external APIs where agent should have full user-level access (HealthKit, HomeKit, GraphQL), use Dynamic Capability Discovery. Only use static mapping if intentionally limiting agent scope.
+- [ ] **CRUD Completeness:** Every entity has create, read, update, AND delete tools
+- [ ] **Primitives not Workflows:** Tools enable capability, they don't encode business logic
+- [ ] **API as Validator:** Use `z.string()` inputs when the API validates, not `z.enum()`
+
+### Action Parity
+- [ ] **Capability Map:** Every UI action has a corresponding agent tool
+- [ ] **Edit/Delete:** If the UI can edit or delete, the agent must be able to as well
+- [ ] **The Write Test:** "Write something to [app location]" must work for all locations
+
+### UI Integration
+- [ ] **Agent → UI:** Define how agent changes reflect in UI (shared service, file watching, or event bus)
+- [ ] **No Silent Actions:** Agent writes should trigger UI updates immediately
+- [ ] **Capability Discovery:** Users can learn what agent can do (onboarding, hints)
+
+### Context Injection
+- [ ] **Available Resources:** System prompt includes what exists (files, data, types)
+- [ ] **Available Capabilities:** System prompt documents what the agent can do, in the user's vocabulary
+- [ ] **Dynamic Context:** Context refreshes for long sessions (or provide `refresh_context` tool)
+
+### Mobile (if applicable)
+- [ ] **Background Execution:** Checkpoint/resume pattern for iOS app suspension
+- [ ] **Permissions:** Just-in-time permission requests in tools
+- [ ] **Cost Awareness:** Model tier selection (Haiku/Sonnet/Opus)
+
+**When designing architecture, explicitly address each checkbox in your plan.**
+
+
Build a prompt-native agent in three steps:
@@ -123,11 +169,19 @@ query({
All references in `references/`:
+**Core Patterns:**
- **Architecture:** [architecture-patterns.md](./references/architecture-patterns.md)
-- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md)
+- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md) - includes Dynamic Capability Discovery, CRUD Completeness
- **Prompts:** [system-prompt-design.md](./references/system-prompt-design.md)
- **Self-Modification:** [self-modification.md](./references/self-modification.md)
- **Refactoring:** [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md)
+
+**Agent-Native Disciplines:**
+- **Context Injection:** [dynamic-context-injection.md](./references/dynamic-context-injection.md)
+- **Action Parity:** [action-parity-discipline.md](./references/action-parity-discipline.md)
+- **Shared Workspace:** [shared-workspace-architecture.md](./references/shared-workspace-architecture.md)
+- **Testing:** [agent-native-testing.md](./references/agent-native-testing.md)
+- **Mobile Patterns:** [mobile-patterns.md](./references/mobile-patterns.md)
@@ -186,11 +240,80 @@ each under 20 words, formatted with em-dashes...
// Right - define outcome, trust intelligence
Create clear, useful summaries. Use your judgment.
```
+
+### Agent-Native Anti-Patterns
+
+**Context Starvation**
+Agent doesn't know what resources exist in the app.
+```
+User: "Write something about Catherine the Great in my feed"
+Agent: "What feed? I don't understand what system you're referring to."
+```
+Fix: Inject available resources, capabilities, and vocabulary into the system prompt at runtime.
+
+**Orphan Features**
+UI action with no agent equivalent.
+```swift
+// UI has a "Publish to Feed" button
+Button("Publish") { publishToFeed(insight) }
+// But no agent tool exists to do the same thing
+```
+Fix: Add corresponding tool and document in system prompt for every UI action.
+
+**Sandbox Isolation**
+Agent works in separate data space from user.
+```
+Documents/
+├── user_files/ ← User's space
+└── agent_output/ ← Agent's space (isolated)
+```
+Fix: Use shared workspace where both agent and user operate on the same files.
+
+**Silent Actions**
+Agent changes state but UI doesn't update.
+```typescript
+// Agent writes to database
+await db.insert("feed", content);
+// But UI doesn't observe this table - user sees nothing
+```
+Fix: Use shared data stores with reactive binding, or file system observation.
+
+**Capability Hiding**
+Users can't discover what agents can do.
+```
+User: "Help me with my reading"
+Agent: "What would you like help with?"
+// Agent doesn't mention it can publish to feed, research books, etc.
+```
+Fix: Include capability hints in agent responses or provide onboarding.
+
+**Static Tool Mapping (for agent-native apps)**
+Building individual tools for each API endpoint when you want the agent to have full access.
+```typescript
+// You built 50 tools for 50 HealthKit types
+tool("read_steps", ...)
+tool("read_heart_rate", ...)
+tool("read_sleep", ...)
+// When glucose tracking is added... code change required
+// Agent can only access what you anticipated
+```
+Fix: Use Dynamic Capability Discovery - one `list_*` tool to discover what's available, one generic tool to access any type. See [mcp-tool-design.md](./references/mcp-tool-design.md). (Note: Static mapping is fine for constrained agents with intentionally limited scope.)
+
+**Incomplete CRUD**
+Agent can create but not update or delete.
+```typescript
+// ❌ User: "Delete that journal entry"
+// Agent: "I don't have a tool for that"
+tool("create_journal_entry", ...)
+// Missing: update_journal_entry, delete_journal_entry
+```
+Fix: Every entity needs full CRUD (Create, Read, Update, Delete). The CRUD Audit: for each entity, verify all four operations exist.
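
The CRUD Audit mentioned above could be automated with a check like the following sketch. It assumes tools follow a hypothetical `<operation>_<entity>` naming convention (as in `create_journal_entry`); `auditCrud` is an illustrative helper, not part of the skill.

```typescript
type CrudOp = "create" | "read" | "update" | "delete";

const CRUD_OPS: CrudOp[] = ["create", "read", "update", "delete"];

// Returns, per entity, the operations that have no matching tool.
// Assumption: tool names look like `<op>_<entity>`.
function auditCrud(entities: string[], toolNames: string[]): Record<string, CrudOp[]> {
  const gaps: Record<string, CrudOp[]> = {};
  for (const entity of entities) {
    const missing = CRUD_OPS.filter(
      (op) => toolNames.indexOf(`${op}_${entity}`) === -1
    );
    if (missing.length > 0) {
      gaps[entity] = missing;
    }
  }
  return gaps;
}
```

An empty result means every listed entity has all four operations; anything else names the gaps to close before the agent is asked to edit or delete.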
You've built a prompt-native agent when:
+**Core Prompt-Native Criteria:**
- [ ] The agent figures out HOW to achieve outcomes, not just calls your functions
- [ ] Whatever a user could do, the agent can do (no artificial limits)
- [ ] Features are prompts that define outcomes, not code that defines workflows
@@ -198,4 +321,25 @@ You've built a prompt-native agent when:
- [ ] Changing behavior means editing prose, not refactoring code
- [ ] The agent can surprise you with clever approaches you didn't anticipate
- [ ] You could add a new feature by writing a new prompt section, not new code
+
+**Tool Design Criteria:**
+- [ ] External APIs (where agent should have full access) use Dynamic Capability Discovery
+- [ ] Every entity has full CRUD (Create, Read, Update, Delete)
+- [ ] API validates inputs, not your enum definitions
+- [ ] Discovery tools exist for each API surface (`list_*`, `discover_*`)
+
+**Agent-Native Criteria:**
+- [ ] System prompt includes dynamic context about app state (available resources, recent activity)
+- [ ] Every UI action has a corresponding agent tool (action parity)
+- [ ] Agent tools are documented in the system prompt, in the user's vocabulary
+- [ ] Agent and user work in the same data space (shared workspace)
+- [ ] Agent actions are immediately reflected in the UI (shared service, file watching, or event bus)
+- [ ] The "write something to [app location]" test passes for all locations
+- [ ] Users can discover what the agent can do (capability hints, onboarding)
+- [ ] Context refreshes for long sessions (or `refresh_context` tool exists)
+
+**Mobile-Specific Criteria (if applicable):**
+- [ ] Background execution handling implemented (checkpoint/resume)
+- [ ] Permission requests handled gracefully in tools
+- [ ] Cost-aware design (appropriate model tiers, batching)
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/action-parity-discipline.md b/plugins/compound-engineering/skills/agent-native-architecture/references/action-parity-discipline.md
new file mode 100644
index 0000000..1b68273
--- /dev/null
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/action-parity-discipline.md
@@ -0,0 +1,409 @@
+
+A structured discipline for ensuring agents can do everything users can do. Every UI action should have an equivalent agent tool. This isn't a one-time check—it's an ongoing practice integrated into your development workflow.
+
+**Core principle:** When adding a UI feature, add the corresponding tool in the same PR.
+
+
+
+## Why Action Parity Matters
+
+**The failure case:**
+```
+User: "Write something about Catherine the Great in my reading feed"
+Agent: "What system are you referring to? I'm not sure what reading feed means."
+```
+
+The user could publish to their feed through the UI. But the agent had no `publish_to_feed` tool. The fix was simple—add the tool. But the insight is profound:
+
+**Every action a user can take through the UI must have an equivalent tool the agent can call.**
+
+Without this parity:
+- Users ask agents to do things the agents can't do
+- Agents ask clarifying questions about features they should understand
+- The agent feels limited compared to direct app usage
+- Users lose trust in the agent's capabilities
+
+
+
+## The Capability Map
+
+Maintain a structured map of UI actions to agent tools:
+
+| UI Action | UI Location | Agent Tool | System Prompt Reference |
+|-----------|-------------|------------|-------------------------|
+| View library | Library tab | `read_library` | "View books and highlights" |
+| Add book | Library → Add | `add_book` | "Add books to library" |
+| Publish insight | Analysis view | `publish_to_feed` | "Create insights for Feed tab" |
+| Start research | Book detail | `start_research` | "Research books via web search" |
+| Edit profile | Settings | `write_file(profile.md)` | "Update reading profile" |
+| Take screenshot | Camera | N/A (user action) | — |
+| Search web | Chat | `web_search` | "Search the internet" |
+
+**Update this table whenever adding features.**
+
+### Template for Your App
+
+```markdown
+# Capability Map - [Your App Name]
+
+| UI Action | UI Location | Agent Tool | System Prompt | Status |
+|-----------|-------------|------------|---------------|--------|
+| | | | | ⚠️ Missing |
+| | | | | ✅ Done |
+| | | | | 🚫 N/A |
+```
+
+Status meanings:
+- ✅ Done: Tool exists and is documented in system prompt
+- ⚠️ Missing: UI action exists but no agent equivalent
+- 🚫 N/A: User-only action (e.g., biometric auth, camera capture)
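+
+One way to keep the map checkable by tools is to store it as typed data and derive the gap list from it. The sketch below is illustrative only: the `CapabilityRow` shape, the `findGaps` helper, and the sample rows are assumptions made for this example, not part of any existing codebase.
+
+```typescript
+// Hypothetical types for a machine-readable capability map.
+type Status = "done" | "missing" | "n/a";
+
+interface CapabilityRow {
+  uiAction: string;
+  uiLocation: string;
+  agentTool: string | null; // null when no tool exists or none applies
+  status: Status;
+}
+
+// Sample rows (illustrative data only).
+const capabilityRows: CapabilityRow[] = [
+  { uiAction: "View library", uiLocation: "Library tab", agentTool: "read_library", status: "done" },
+  { uiAction: "Add highlight", uiLocation: "Book detail", agentTool: null, status: "missing" },
+  { uiAction: "Take screenshot", uiLocation: "Camera", agentTool: null, status: "n/a" },
+];
+
+// Return the UI actions that still need an agent tool.
+function findGaps(rows: CapabilityRow[]): string[] {
+  return rows.filter((row) => row.status === "missing").map((row) => row.uiAction);
+}
+
+console.log(findGaps(capabilityRows));
+```
+
+Keeping the status as data means the missing rows can feed a CI report or an issue tracker rather than depending on someone rereading the table.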
+
+
+
+## The Action Parity Workflow
+
+### When Adding a New Feature
+
+Before merging any PR that adds UI functionality:
+
+```
+1. What action is this?
+ → "User can publish an insight to their reading feed"
+
+2. Does an agent tool exist for this?
+ → Check tool definitions
+ → If NO: Create the tool
+
+3. Is it documented in the system prompt?
+ → Check system prompt capabilities section
+ → If NO: Add documentation
+
+4. Is the context available?
+ → Does the agent know what "feed" means?
+ → Does the agent see available books?
+ → If NO: Add to context injection
+
+5. Update the capability map
+ → Add row to tracking document
+```
+
+### PR Checklist
+
+Add to your PR template:
+
+```markdown
+## Agent-Native Checklist
+
+- [ ] Every new UI action has a corresponding agent tool
+- [ ] System prompt updated to mention new capability
+- [ ] Agent has access to same data UI uses
+- [ ] Capability map updated
+- [ ] Tested with natural language request
+```
+
+
+
+## The Parity Audit
+
+Periodically audit your app for action parity gaps:
+
+### Step 1: List All UI Actions
+
+Walk through every screen and list what users can do:
+
+```
+Library Screen:
+- View list of books
+- Search books
+- Filter by category
+- Add new book
+- Delete book
+- Open book detail
+
+Book Detail Screen:
+- View book info
+- Start research
+- View highlights
+- Add highlight
+- Share book
+- Remove from library
+
+Feed Screen:
+- View insights
+- Create new insight
+- Edit insight
+- Delete insight
+- Share insight
+
+Settings:
+- Edit profile
+- Change theme
+- Export data
+- Delete account
+```
+
+### Step 2: Check Tool Coverage
+
+For each action, verify:
+
+```
+✅ View list of books → read_library
+✅ Search books → read_library (with query param)
+⚠️ Filter by category → MISSING (add filter param to read_library)
+⚠️ Add new book → MISSING (need add_book tool)
+✅ Delete book → delete_book
+✅ Open book detail → read_library (single book)
+
+✅ Start research → start_research
+✅ View highlights → read_library (includes highlights)
+⚠️ Add highlight → MISSING (need add_highlight tool)
+⚠️ Share book → MISSING (or N/A if sharing is UI-only)
+
+✅ View insights → read_library (includes feed)
+✅ Create new insight → publish_to_feed
+⚠️ Edit insight → MISSING (need update_feed_item tool)
+⚠️ Delete insight → MISSING (need delete_feed_item tool)
+```
+
+### Step 3: Prioritize Gaps
+
+Not all gaps are equal:
+
+**High priority (users will ask for this):**
+- Add new book
+- Create/edit/delete content
+- Core workflow actions
+
+**Medium priority (occasional requests):**
+- Filter/search variations
+- Export functionality
+- Sharing features
+
+**Low priority (rarely requested via agent):**
+- Theme changes
+- Account deletion
+- Settings that are UI-preference
+
+
+
+## Designing Tools for Parity
+
+### Match Tool Granularity to UI Granularity
+
+If the UI has separate buttons for "Edit" and "Delete", consider separate tools:
+
+```typescript
+// Matches UI granularity
+tool("update_feed_item", { id, content, headline }, ...);
+tool("delete_feed_item", { id }, ...);
+
+// vs. combined (harder for agent to discover)
+tool("modify_feed_item", { id, action: "update" | "delete", ... }, ...);
+```
+
+### Use User Vocabulary in Tool Names
+
+```typescript
+// Good: Matches what users say
+tool("publish_to_feed", ...); // "publish to my feed"
+tool("add_book", ...); // "add this book"
+tool("start_research", ...); // "research this"
+
+// Bad: Technical jargon
+tool("create_analysis_record", ...);
+tool("insert_library_item", ...);
+tool("initiate_web_scrape_workflow", ...);
+```
+
+### Return What the UI Shows
+
+If the UI shows a confirmation with details, the tool should too:
+
+```typescript
+// UI shows: "Added 'Moby Dick' to your library"
+// Tool should return the same:
+tool("add_book", async ({ title, author }) => {
+ const book = await library.add({ title, author });
+ return {
+ text: `Added "${book.title}" by ${book.author} to your library (id: ${book.id})`
+ };
+});
+```
+
+
+
+## Context Parity
+
+Whatever the user sees, the agent should be able to access.
+
+### The Problem
+
+```swift
+// UI shows recent analyses in a list
+ForEach(analysisRecords) { record in
+ AnalysisRow(record: record)
+}
+
+// But system prompt only mentions books, not analyses
+let systemPrompt = """
+## Available Books
+\(books.map { $0.title })
+// Missing: recent analyses!
+"""
+```
+
+The user sees their reading journal. The agent doesn't. This creates a disconnect.
+
+### The Fix
+
+```swift
+// System prompt includes what UI shows
+let systemPrompt = """
+## Available Books
+\(books.map { "- \($0.title)" }.joined(separator: "\n"))
+
+## Recent Reading Journal
+\(analysisRecords.prefix(10).map { "- \($0.summary)" }.joined(separator: "\n"))
+"""
+```
+
+### Context Parity Checklist
+
+For each screen in your app:
+- [ ] What data does this screen display?
+- [ ] Is that data available to the agent?
+- [ ] Can the agent access the same level of detail?
+
+
+
+## Maintaining Parity Over Time
+
+### Git Hooks / CI Checks
+
+```bash
+#!/bin/bash
+# pre-commit hook: check for new UI actions without tools
+
+# Find staged Swift files whose changed lines mention Button or onTapGesture
+NEW_ACTIONS=$(git diff --cached --name-only -G'Button|onTapGesture' -- '*.swift')
+
+if [ -n "$NEW_ACTIONS" ]; then
+ echo "⚠️ New UI actions detected. Did you add corresponding agent tools?"
+ echo "Files: $NEW_ACTIONS"
+ echo ""
+ echo "Checklist:"
+ echo " [ ] Agent tool exists for new action"
+ echo " [ ] System prompt documents new capability"
+ echo " [ ] Capability map updated"
+fi
+```
+
+### Automated Parity Testing
+
+```typescript
+// parity.test.ts
+// Assumes loadCapabilityMap(), agentTools, and systemPrompt come from your test setup
+describe('Action Parity', () => {
+ const capabilityMap = loadCapabilityMap();
+
+ for (const [action, toolName] of Object.entries(capabilityMap)) {
+ if (toolName === 'N/A') continue;
+
+ test(`${action} has agent tool: ${toolName}`, () => {
+ expect(agentTools.map(t => t.name)).toContain(toolName);
+ });
+
+ test(`${toolName} is documented in system prompt`, () => {
+ expect(systemPrompt).toContain(toolName);
+ });
+ }
+});
+```
+
+### Regular Audits
+
+Schedule periodic reviews:
+
+```markdown
+## Monthly Parity Audit
+
+1. Review all PRs merged this month
+2. Check each for new UI actions
+3. Verify tool coverage
+4. Update capability map
+5. Test with natural language requests
+```
+
+
+
+## Real Example: The Feed Gap
+
+**Before:** Every Reader had a feed where insights appeared, but no agent tool to publish there.
+
+```
+User: "Write something about Catherine the Great in my reading feed"
+Agent: "I'm not sure what system you're referring to. Could you clarify?"
+```
+
+**Diagnosis:**
+- ✅ UI action: User can publish insights from the analysis view
+- ❌ Agent tool: No `publish_to_feed` tool
+- ❌ System prompt: No mention of "feed" or how to publish
+- ❌ Context: Agent didn't know what "feed" meant
+
+**Fix:**
+
+```swift
+// 1. Add the tool
+tool("publish_to_feed",
+ "Publish an insight to the user's reading feed",
+ {
+ bookId: z.string().describe("Book ID"),
+ content: z.string().describe("The insight content"),
+ headline: z.string().describe("A punchy headline")
+ },
+ async ({ bookId, content, headline }) => {
+ await feedService.publish({ bookId, content, headline });
+ return { text: `Published "${headline}" to your reading feed` };
+ }
+);
+
+// 2. Update system prompt
+"""
+## Your Capabilities
+
+- **Publish to Feed**: Create insights that appear in the Feed tab using `publish_to_feed`.
+ Include a bookId, content, and a punchy headline.
+"""
+
+// 3. Add to context injection
+"""
+When the user mentions "the feed" or "reading feed", they mean the Feed tab
+where insights appear. Use `publish_to_feed` to create content there.
+"""
+```
+
+**After:**
+```
+User: "Write something about Catherine the Great in my reading feed"
+Agent: [Uses publish_to_feed to create insight]
+ "Done! I've published 'The Enlightened Empress' to your reading feed."
+```
+
+
+
+## Action Parity Checklist
+
+For every PR with UI changes:
+- [ ] Listed all new UI actions
+- [ ] Verified agent tool exists for each action
+- [ ] Updated system prompt with new capabilities
+- [ ] Added to capability map
+- [ ] Tested with natural language request
+
+For periodic audits:
+- [ ] Walked through every screen
+- [ ] Listed all possible user actions
+- [ ] Checked tool coverage for each
+- [ ] Prioritized gaps by likelihood of user request
+- [ ] Created issues for high-priority gaps
+
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/agent-native-testing.md b/plugins/compound-engineering/skills/agent-native-architecture/references/agent-native-testing.md
new file mode 100644
index 0000000..bfe8ac4
--- /dev/null
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/agent-native-testing.md
@@ -0,0 +1,582 @@
+
+Testing agent-native apps requires different approaches than traditional unit testing. You're testing whether the agent achieves outcomes, not whether it calls specific functions. This guide provides concrete testing patterns for verifying your app is truly agent-native.
+
+
+
+## Testing Philosophy
+
+### Test Outcomes, Not Procedures
+
+**Traditional (procedure-focused):**
+```typescript
+// Testing that a specific function was called with specific args
+expect(mockProcessFeedback).toHaveBeenCalledWith({
+ message: "Great app!",
+ category: "praise",
+ priority: 2
+});
+```
+
+**Agent-native (outcome-focused):**
+```typescript
+// Testing that the outcome was achieved
+const result = await agent.process("Great app!");
+const storedFeedback = await db.feedback.getLatest();
+
+expect(storedFeedback.content).toContain("Great app");
+expect(storedFeedback.importance).toBeGreaterThanOrEqual(1);
+expect(storedFeedback.importance).toBeLessThanOrEqual(5);
+// We don't care exactly how it categorized—just that it's reasonable
+```
+
+### Accept Variability
+
+Agents may solve problems differently each time. Your tests should:
+- Verify the end state, not the path
+- Accept reasonable ranges, not exact values
+- Check for presence of required elements, not exact format
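+
+The three points above can be expressed as small tolerant helpers. This is a minimal sketch, assuming plain string and number outputs; `inRange` and `mentionsAll` are hypothetical names introduced for illustration, not functions from any test framework.
+
+```typescript
+// Hypothetical helpers for tolerant, outcome-focused assertions.
+
+// True when value lies within [min, max], inclusive.
+function inRange(value: number, min: number, max: number): boolean {
+  return value >= min && value <= max;
+}
+
+// True when text mentions every required element, ignoring case and order.
+function mentionsAll(text: string, required: string[]): boolean {
+  const haystack = text.toLowerCase();
+  return required.every((term) => haystack.includes(term.toLowerCase()));
+}
+
+// Two differently worded agent outputs that should both be accepted.
+const outputA = "Added Moby Dick by Herman Melville.";
+const outputB = "Herman MELVILLE's 'Moby Dick' is now in your library!";
+
+console.log(mentionsAll(outputA, ["moby dick", "melville"]));
+console.log(mentionsAll(outputB, ["moby dick", "melville"]));
+console.log(inRange(3, 1, 5));
+```
+
+Both outputs pass because the check looks for required elements rather than an exact string.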
+
+
+
+## The "Can Agent Do It?" Test
+
+For each UI feature, write a test prompt and verify the agent can accomplish it.
+
+### Template
+
+```typescript
+describe('Agent Capability Tests', () => {
+ test('Agent can add a book to library', async () => {
+ const result = await agent.chat("Add 'Moby Dick' by Herman Melville to my library");
+
+ // Verify outcome
+ const library = await libraryService.getBooks();
+ const mobyDick = library.find(b => b.title.includes("Moby Dick"));
+
+ expect(mobyDick).toBeDefined();
+ expect(mobyDick.author).toContain("Melville");
+ });
+
+ test('Agent can publish to feed', async () => {
+ // Setup: ensure a book exists
+ await libraryService.addBook({ id: "book_123", title: "1984" });
+
+ const result = await agent.chat("Write something about surveillance themes in my feed");
+
+ // Verify outcome
+ const feed = await feedService.getItems();
+ const newItem = feed.find(item => item.bookId === "book_123");
+
+ expect(newItem).toBeDefined();
+ expect(newItem.content.toLowerCase()).toMatch(/surveillance|watching|control/);
+ });
+
+ test('Agent can search and save research', async () => {
+ await libraryService.addBook({ id: "book_456", title: "Moby Dick" });
+
+ const result = await agent.chat("Research whale symbolism in Moby Dick");
+
+ // Verify files were created
+ const files = await fileService.listFiles("Research/book_456/");
+ expect(files.length).toBeGreaterThan(0);
+
+ // Verify content is relevant
+ const content = await fileService.readFile(files[0]);
+ expect(content.toLowerCase()).toMatch(/whale|symbolism|melville/);
+ });
+});
+```
+
+### The "Write to Location" Test
+
+A key litmus test: can the agent create content in specific app locations?
+
+```typescript
+describe('Location Awareness Tests', () => {
+ const locations = [
+ { userPhrase: "my reading feed", expectedTool: "publish_to_feed" },
+ { userPhrase: "my library", expectedTool: "add_book" },
+ { userPhrase: "my research folder", expectedTool: "write_file" },
+ { userPhrase: "my profile", expectedTool: "write_file" },
+ ];
+
+ for (const { userPhrase, expectedTool } of locations) {
+ test(`Agent knows how to write to "${userPhrase}"`, async () => {
+ const prompt = `Write a test note to ${userPhrase}`;
+ const result = await agent.chat(prompt);
+
+ // Check that agent used the right tool (or achieved the outcome)
+ expect(result.toolCalls).toContainEqual(
+ expect.objectContaining({ name: expectedTool })
+ );
+
+ // Or verify outcome directly
+ // expect(await locationHasNewContent(userPhrase)).toBe(true);
+ });
+ }
+});
+```
+
+
+
+## The "Surprise Test"
+
+A well-designed agent-native app lets the agent figure out creative approaches. Test this by giving open-ended requests.
+
+### The Test
+
+```typescript
+describe('Agent Creativity Tests', () => {
+ test('Agent can handle open-ended requests', async () => {
+ // Setup: user has some books
+ await libraryService.addBook({ id: "1", title: "1984", author: "Orwell" });
+ await libraryService.addBook({ id: "2", title: "Brave New World", author: "Huxley" });
+ await libraryService.addBook({ id: "3", title: "Fahrenheit 451", author: "Bradbury" });
+
+ // Open-ended request
+ const result = await agent.chat("Help me organize my reading for next month");
+
+ // The agent should do SOMETHING useful
+ // We don't specify exactly what—that's the point
+ expect(result.toolCalls.length).toBeGreaterThan(0);
+
+ // It should have engaged with the library
+ const libraryTools = ["read_library", "write_file", "publish_to_feed"];
+ const usedLibraryTool = result.toolCalls.some(
+ call => libraryTools.includes(call.name)
+ );
+ expect(usedLibraryTool).toBe(true);
+ });
+
+ test('Agent finds creative solutions', async () => {
+ // Don't specify HOW to accomplish the task
+ const result = await agent.chat(
+ "I want to understand the dystopian themes across my sci-fi books"
+ );
+
+ // Agent might:
+ // - Read all books and create a comparison document
+ // - Research dystopian literature and relate it to user's books
+ // - Create a mind map in a markdown file
+ // - Publish a series of insights to the feed
+
+ // We just verify it did something substantive
+ expect(result.response.length).toBeGreaterThan(100);
+ expect(result.toolCalls.length).toBeGreaterThan(0);
+ });
+});
+```
+
+### What Failure Looks Like
+
+```typescript
+// FAILURE: the agent can only say it can't do that
+const result = await agent.chat("Help me prepare for a book club discussion");
+
+// These assertions fail when the agent refuses or stalls (the bad outcome):
+expect(result.response).not.toContain("I can't");
+expect(result.response).not.toContain("I don't have a tool");
+expect(result.response).not.toContain("Could you clarify");
+
+// If the agent asks for clarification on something it should understand,
+// you have a context injection or capability gap
+```
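+
+The phrase checks above could be collected into one reusable detector. The sketch below is a rough heuristic under stated assumptions: the `GAP_SIGNALS` list and `detectGapSignals` are hypothetical, and simple pattern matching will produce false positives (for example "I can't wait to help"), so treat a match as a prompt to investigate rather than proof of a gap.
+
+```typescript
+// Hypothetical list of phrases suggesting the agent lacks a tool or context.
+const GAP_SIGNALS: { label: string; pattern: RegExp }[] = [
+  { label: "refusal", pattern: /\bI (?:can't|cannot)\b/i },
+  { label: "missing tool", pattern: /\bdon't have a tool\b/i },
+  { label: "clarification", pattern: /\bcould you clarify\b/i },
+];
+
+// Return the labels of every signal found in the agent's response.
+function detectGapSignals(response: string): string[] {
+  return GAP_SIGNALS
+    .filter(({ pattern }) => pattern.test(response))
+    .map(({ label }) => label);
+}
+
+console.log(detectGapSignals("I'm not sure what system you're referring to. Could you clarify?"));
+console.log(detectGapSignals("Done! I've published it to your reading feed."));
+```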
+
+
+
+## Automated Parity Testing
+
+Ensure every UI action has an agent equivalent.
+
+### Capability Map Testing
+
+```typescript
+// capability-map.ts
+export const capabilityMap = {
+ // UI Action: Agent Tool
+ "View library": "read_library",
+ "Add book": "add_book",
+ "Delete book": "delete_book",
+ "Publish insight": "publish_to_feed",
+ "Start research": "start_research",
+ "View highlights": "read_library", // same tool, different query
+ "Edit profile": "write_file",
+ "Search web": "web_search",
+ "Export data": "N/A", // UI-only action
+};
+
+// parity.test.ts
+import { capabilityMap } from './capability-map';
+import { getAgentTools } from './agent-config';
+import { getSystemPrompt } from './system-prompt';
+
+describe('Action Parity', () => {
+ const agentTools = getAgentTools();
+ const systemPrompt = getSystemPrompt();
+
+ for (const [uiAction, toolName] of Object.entries(capabilityMap)) {
+ if (toolName === 'N/A') continue;
+
+ test(`"${uiAction}" has agent tool: ${toolName}`, () => {
+ const toolNames = agentTools.map(t => t.name);
+ expect(toolNames).toContain(toolName);
+ });
+
+ test(`${toolName} is documented in system prompt`, () => {
+ expect(systemPrompt).toContain(toolName);
+ });
+ }
+});
+```
+
+### Context Parity Testing
+
+```typescript
+describe('Context Parity', () => {
+ test('Agent sees all data that UI shows', async () => {
+ // Setup: create some data
+ await libraryService.addBook({ id: "1", title: "Test Book" });
+ await feedService.addItem({ id: "f1", content: "Test insight" });
+
+ // Get system prompt (which includes context)
+ const systemPrompt = await buildSystemPrompt();
+
+ // Verify data is included
+ expect(systemPrompt).toContain("Test Book");
+ expect(systemPrompt).toContain("Test insight");
+ });
+
+ test('Recent activity is visible to agent', async () => {
+ // Perform some actions
+ await activityService.log({ action: "highlighted", bookId: "1" });
+ await activityService.log({ action: "researched", bookId: "2" });
+
+ const systemPrompt = await buildSystemPrompt();
+
+ // Verify activity is included
+ expect(systemPrompt).toMatch(/highlighted|researched/);
+ });
+});
+```
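+
+For reference, here is a minimal sketch of what a prompt builder satisfying these tests might look like. It is an assumption-laden illustration: the `AppData` shape and the `section` helper are hypothetical, and unlike the `buildSystemPrompt()` calls in the tests above, this version takes its data as an argument and is synchronous so that it stays self-contained.
+
+```typescript
+// Hypothetical data shapes mirroring what the UI displays.
+interface Book { id: string; title: string; }
+interface FeedItem { id: string; content: string; }
+interface Activity { action: string; bookId: string; }
+
+interface AppData {
+  books: Book[];
+  feedItems: FeedItem[];
+  recentActivity: Activity[];
+}
+
+// Render one prompt section as a heading plus a bullet list.
+function section(heading: string, lines: string[]): string {
+  const body = lines.length > 0 ? lines.map((line) => `- ${line}`).join("\n") : "- (none)";
+  return `## ${heading}\n${body}`;
+}
+
+// Build the dynamic part of the system prompt from the same data the UI shows.
+function buildSystemPrompt(data: AppData): string {
+  return [
+    section("Available Books", data.books.map((b) => `${b.title} (id: ${b.id})`)),
+    section("Recent Feed Items", data.feedItems.map((f) => f.content)),
+    section("Recent Activity", data.recentActivity.map((a) => `${a.action} book ${a.bookId}`)),
+  ].join("\n\n");
+}
+
+console.log(buildSystemPrompt({
+  books: [{ id: "1", title: "Test Book" }],
+  feedItems: [{ id: "f1", content: "Test insight" }],
+  recentActivity: [{ action: "highlighted", bookId: "1" }],
+}));
+```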
+
+
+
+## Integration Testing
+
+Test the full flow from user request to outcome.
+
+### End-to-End Flow Tests
+
+```typescript
+describe('End-to-End Flows', () => {
+ test('Research flow: request → web search → file creation', async () => {
+ // Setup
+ const bookId = "book_123";
+ await libraryService.addBook({ id: bookId, title: "Moby Dick" });
+
+ // User request
+ await agent.chat("Research the historical context of whaling in Moby Dick");
+
+ // Verify: web search was performed
+ const searchCalls = mockWebSearch.mock.calls;
+ expect(searchCalls.length).toBeGreaterThan(0);
+ expect(searchCalls.some(call =>
+ call[0].query.toLowerCase().includes("whaling")
+ )).toBe(true);
+
+ // Verify: files were created
+ const researchFiles = await fileService.listFiles(`Research/${bookId}/`);
+ expect(researchFiles.length).toBeGreaterThan(0);
+
+ // Verify: content is relevant
+ const content = await fileService.readFile(researchFiles[0]);
+ expect(content.toLowerCase()).toMatch(/whale|whaling|nantucket|melville/);
+ });
+
+ test('Publish flow: request → tool call → feed update → UI reflects', async () => {
+ // Setup
+ await libraryService.addBook({ id: "book_1", title: "1984" });
+
+ // Initial state
+ const feedBefore = await feedService.getItems();
+
+ // User request
+ await agent.chat("Write something about Big Brother for my reading feed");
+
+ // Verify feed updated
+ const feedAfter = await feedService.getItems();
+ expect(feedAfter.length).toBe(feedBefore.length + 1);
+
+ // Verify content
+ const newItem = feedAfter.find(item =>
+ !feedBefore.some(old => old.id === item.id)
+ );
+ expect(newItem).toBeDefined();
+ expect(newItem.content.toLowerCase()).toMatch(/big brother|surveillance|watching/);
+ });
+});
+```
+
+### Failure Recovery Tests
+
+```typescript
+describe('Failure Recovery', () => {
+ test('Agent handles missing book gracefully', async () => {
+ const result = await agent.chat("Tell me about 'Nonexistent Book'");
+
+ // Agent should not crash
+ expect(result.error).toBeUndefined();
+
+ // Agent should acknowledge the issue
+ expect(result.response.toLowerCase()).toMatch(
+ /not found|don't see|can't find|library/
+ );
+ });
+
+ test('Agent recovers from API failure', async () => {
+ // Mock API failure
+ mockWebSearch.mockRejectedValueOnce(new Error("Network error"));
+
+ const result = await agent.chat("Research this topic");
+
+ // Agent should handle gracefully
+ expect(result.error).toBeUndefined();
+ expect(result.response).not.toContain("unhandled exception");
+
+ // Agent should communicate the issue
+ expect(result.response.toLowerCase()).toMatch(
+ /couldn't search|unable to|try again/
+ );
+ });
+});
+```
+
+
+
+## Snapshot Testing for System Prompts
+
+Track changes to system prompts and context injection over time.
+
+```typescript
+describe('System Prompt Stability', () => {
+ test('System prompt structure matches snapshot', async () => {
+ const systemPrompt = await buildSystemPrompt();
+
+ // Extract structure (removing dynamic data)
+ const structure = systemPrompt
+ .replace(/id: \w+/g, 'id: [ID]')
+ .replace(/"[^"]+"/g, '"[TITLE]"')
+ .replace(/\d{4}-\d{2}-\d{2}/g, '[DATE]');
+
+ expect(structure).toMatchSnapshot();
+ });
+
+ test('All capability sections are present', async () => {
+ const systemPrompt = await buildSystemPrompt();
+
+ const requiredSections = [
+ "Your Capabilities",
+ "Available Books",
+ "Recent Activity",
+ ];
+
+ for (const section of requiredSections) {
+ expect(systemPrompt).toContain(section);
+ }
+ });
+});
+```
+
+
+
+## Manual Testing Checklist
+
+Some things are best tested manually during development:
+
+### Natural Language Variation Test
+
+Try multiple phrasings for the same request:
+
+```
+"Add this to my feed"
+"Write something in my reading feed"
+"Publish an insight about this"
+"Put this in the feed"
+"I want this in my feed"
+```
+
+All should work if context injection is correct.
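+
+Once the manual pass looks good, the same phrasings could be scripted as a regression check. The sketch below assumes a hypothetical `Agent` interface that reports its tool calls; `findFailingPhrasings` and the keyword-based stub agent are stand-ins for illustration, and a real run would call your actual agent instead.
+
+```typescript
+// Hypothetical shapes; adapt to whatever your agent actually returns.
+interface ToolCall { name: string; }
+interface AgentResponse { toolCalls: ToolCall[]; }
+interface Agent { chat(message: string): Promise<AgentResponse>; }
+
+const feedPhrasings: string[] = [
+  "Add this to my feed",
+  "Write something in my reading feed",
+  "Publish an insight about this",
+  "Put this in the feed",
+  "I want this in my feed",
+];
+
+// Return every phrasing that did NOT lead to the expected tool call.
+async function findFailingPhrasings(
+  agent: Agent,
+  phrasings: string[],
+  expectedTool: string,
+): Promise<string[]> {
+  const failures: string[] = [];
+  for (const phrase of phrasings) {
+    const { toolCalls } = await agent.chat(phrase);
+    if (!toolCalls.some((call) => call.name === expectedTool)) {
+      failures.push(phrase);
+    }
+  }
+  return failures;
+}
+
+// Stand-in agent for demonstration: it only reacts to the literal word "feed".
+const keywordStubAgent: Agent = {
+  async chat(message: string): Promise<AgentResponse> {
+    return { toolCalls: /feed/i.test(message) ? [{ name: "publish_to_feed" }] : [] };
+  },
+};
+
+findFailingPhrasings(keywordStubAgent, feedPhrasings, "publish_to_feed")
+  .then((failures) => console.log("Failing phrasings:", failures));
+```
+
+The stub deliberately misses "Publish an insight about this", which is the kind of vocabulary gap this test is meant to surface.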
+
+### Edge Case Prompts
+
+```
+"What can you do?"
+→ Agent should describe capabilities
+
+"Help me with my books"
+→ Agent should engage with library, not ask what "books" means
+
+"Write something"
+→ Agent should ask WHERE (feed, file, etc.) if not clear
+
+"Delete everything"
+→ Agent should confirm before destructive actions
+```
+
+### Confusion Test
+
+Ask about things that should exist but might not be properly connected:
+
+```
+"What's in my research folder?"
+→ Should list files, not ask "what research folder?"
+
+"Show me my recent reading"
+→ Should show activity, not ask "what do you mean?"
+
+"Continue where I left off"
+→ Should reference recent activity if available
+```
+
+
+
+## CI/CD Integration
+
+Add agent-native tests to your CI pipeline:
+
+```yaml
+# .github/workflows/test.yml
+name: Agent-Native Tests
+
+on: [push, pull_request]
+
+jobs:
+  agent-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Setup
+        run: npm install
+
+      - name: Run Parity Tests
+        run: npm run test:parity
+
+      - name: Run Capability Tests
+        run: npm run test:capabilities
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+
+      - name: Check System Prompt Completeness
+        run: npm run test:system-prompt
+
+      - name: Verify Capability Map
+        run: |
+          # Ensure capability map is up to date
+          npm run generate:capability-map
+          git diff --exit-code capability-map.ts
+```
+
+### Cost-Aware Testing
+
+Agent tests consume API tokens on every run. Some strategies to keep the cost manageable:
+
+```typescript
+// Use smaller models for basic tests
+const testConfig = {
+ model: process.env.CI ? "claude-3-haiku" : "claude-3-opus",
+ maxTokens: 500, // Limit output length
+};
+
+// Cache responses for deterministic tests
+const cachedAgent = new CachedAgent({
+ cacheDir: ".test-cache",
+ ttl: 24 * 60 * 60 * 1000, // 24 hours
+});
+
+// Run expensive tests only on main branch
+if (process.env.GITHUB_REF === 'refs/heads/main') {
+ describe('Full Integration Tests', () => { ... });
+}
+```
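+
+The `CachedAgent` above is not defined in this guide. As a rough idea of how response caching could work, here is a minimal in-memory sketch. Everything in it is an assumption: the `Agent` interface, the `InMemoryCachedAgent` name, and the injectable clock are hypothetical, and unlike the `cacheDir` option shown above it does not persist anything to disk.
+
+```typescript
+// Hypothetical agent interface; a real agent would return richer responses.
+interface Agent {
+  chat(message: string): Promise<string>;
+}
+
+// Wraps another agent and replays cached responses until they expire.
+class InMemoryCachedAgent implements Agent {
+  private cache = new Map<string, { response: string; expiresAt: number }>();
+  private inner: Agent;
+  private ttlMs: number;
+  private now: () => number;
+
+  // `now` is injectable so tests can control time without waiting.
+  constructor(inner: Agent, ttlMs: number, now: () => number = () => Date.now()) {
+    this.inner = inner;
+    this.ttlMs = ttlMs;
+    this.now = now;
+  }
+
+  async chat(message: string): Promise<string> {
+    const hit = this.cache.get(message);
+    if (hit !== undefined && hit.expiresAt > this.now()) {
+      return hit.response; // Cache hit: no API tokens spent
+    }
+    const response = await this.inner.chat(message);
+    this.cache.set(message, { response, expiresAt: this.now() + this.ttlMs });
+    return response;
+  }
+}
+```
+
+Caching keyed on the exact message only helps when prompts are stable, so it suits parity and smoke tests better than open-ended creativity tests.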
+
+
+
+## Test Utilities
+
+### Agent Test Harness
+
+```typescript
+class AgentTestHarness {
+ private agent: Agent;
+ private mockServices: MockServices;
+
+ async setup() {
+ this.mockServices = createMockServices();
+ this.agent = await createAgent({
+ services: this.mockServices,
+ model: "claude-3-haiku", // Cheaper for tests
+ });
+ }
+
+ async chat(message: string) {
+ return this.agent.chat(message);
+ }
+
+ async expectToolCall(toolName: string) {
+ const lastResponse = this.agent.getLastResponse();
+ expect(lastResponse.toolCalls.map(t => t.name)).toContain(toolName);
+ }
+
+ async expectOutcome(check: () => Promise<boolean>) {
+ const result = await check();
+ expect(result).toBe(true);
+ }
+
+ getState() {
+ return {
+ library: this.mockServices.library.getBooks(),
+ feed: this.mockServices.feed.getItems(),
+ files: this.mockServices.files.listAll(),
+ };
+ }
+}
+
+// Usage
+test('full flow', async () => {
+ const harness = new AgentTestHarness();
+ await harness.setup();
+
+ await harness.chat("Add 'Moby Dick' to my library");
+ await harness.expectToolCall("add_book");
+ await harness.expectOutcome(async () => {
+ const state = harness.getState();
+ return state.library.some(b => b.title.includes("Moby"));
+ });
+});
+```
+
+
+
+## Testing Checklist
+
+Automated Tests:
+- [ ] "Can Agent Do It?" tests for each UI action
+- [ ] Location awareness tests ("write to my feed")
+- [ ] Parity tests (tool exists, documented in prompt)
+- [ ] Context parity tests (agent sees what UI shows)
+- [ ] End-to-end flow tests
+- [ ] Failure recovery tests
+
+Manual Tests:
+- [ ] Natural language variation (multiple phrasings work)
+- [ ] Edge case prompts (open-ended requests)
+- [ ] Confusion test (agent knows app vocabulary)
+- [ ] Surprise test (agent can be creative)
+
+CI Integration:
+- [ ] Parity tests run on every PR
+- [ ] Capability tests run with API key
+- [ ] System prompt completeness check
+- [ ] Capability map drift detection
+
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/architecture-patterns.md b/plugins/compound-engineering/skills/agent-native-architecture/references/architecture-patterns.md
index a76a019..68d9c4e 100644
--- a/plugins/compound-engineering/skills/agent-native-architecture/references/architecture-patterns.md
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/architecture-patterns.md
@@ -203,6 +203,259 @@ tool("apply_pending", async () => {
- docs/* (documentation)
+
+## Unified Agent Architecture
+
+One execution engine, many agent types. All agents use the same orchestrator but with different configurations.
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                      AgentOrchestrator                      │
+├─────────────────────────────────────────────────────────────┤
+│  - Lifecycle management (start, pause, resume, stop)        │
+│  - Checkpoint/restore (for background execution)            │
+│  - Tool execution                                           │
+│  - Chat integration                                         │
+└─────────────────────────────────────────────────────────────┘
+          │                    │                    │
+    ┌─────┴─────┐        ┌─────┴─────┐        ┌─────┴─────┐
+    │ Research  │        │   Chat    │        │  Profile  │
+    │   Agent   │        │   Agent   │        │   Agent   │
+    └───────────┘        └───────────┘        └───────────┘
+    - web_search         - read_library       - read_photos
+    - write_file         - publish_to_feed    - write_file
+    - read_file          - web_search         - analyze_image
+```
+
+**Implementation:**
+
+```swift
+// All agents use the same orchestrator
+let session = try await AgentOrchestrator.shared.startAgent(
+ config: ResearchAgent.create(book: book), // Config varies
+ tools: ResearchAgent.tools, // Tools vary
+ context: ResearchAgent.context(for: book) // Context varies
+)
+
+// Agent types define their own configuration
+struct ResearchAgent {
+ static var tools: [AgentTool] {
+ [
+ FileTools.readFile(),
+ FileTools.writeFile(),
+ WebTools.webSearch(),
+ WebTools.webFetch(),
+ ]
+ }
+
+ static func context(for book: Book) -> String {
+ """
+ You are researching "\(book.title)" by \(book.author).
+ Save findings to Documents/Research/\(book.id)/
+ """
+ }
+}
+
+struct ChatAgent {
+ static var tools: [AgentTool] {
+ [
+ FileTools.readFile(),
+ FileTools.writeFile(),
+ BookTools.readLibrary(),
+ BookTools.publishToFeed(), // Chat can publish directly
+ WebTools.webSearch(),
+ ]
+ }
+
+ static func context(library: [Book]) -> String {
+ """
+ You help the user with their reading.
+ Available books: \(library.map { $0.title }.joined(separator: ", "))
+ """
+ }
+}
+```
+
+**Benefits:**
+- Consistent lifecycle management across all agent types
+- Automatic checkpoint/resume (critical for mobile)
+- Shared tool protocol
+- Easy to add new agent types
+- Centralized error handling and logging
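+
+The same idea can be sketched outside Swift. The TypeScript below is a simplified illustration, not the implementation described above: the `AgentConfig` fields, the session states, and the method names are assumptions, and checkpointing and tool execution are left out to keep the focus on one engine serving several agent types.
+
+```typescript
+// Hypothetical configuration: agent types differ only in data, not in engine.
+interface AgentConfig {
+  name: string;
+  tools: string[];
+  context: string;
+}
+
+type SessionState = "running" | "paused" | "stopped";
+
+class AgentOrchestrator {
+  private sessions = new Map<string, { config: AgentConfig; state: SessionState }>();
+  private nextId = 1;
+
+  // Every agent type starts through the same entry point.
+  startAgent(config: AgentConfig): string {
+    const id = `session_${this.nextId++}`;
+    this.sessions.set(id, { config, state: "running" });
+    return id;
+  }
+
+  pause(id: string): void { this.transition(id, "running", "paused"); }
+  resume(id: string): void { this.transition(id, "paused", "running"); }
+  stop(id: string): void { this.get(id).state = "stopped"; }
+  stateOf(id: string): SessionState { return this.get(id).state; }
+  toolsOf(id: string): string[] { return this.get(id).config.tools; }
+
+  private get(id: string): { config: AgentConfig; state: SessionState } {
+    const session = this.sessions.get(id);
+    if (session === undefined) throw new Error(`Unknown session: ${id}`);
+    return session;
+  }
+
+  // Centralized lifecycle rules apply to every agent type.
+  private transition(id: string, from: SessionState, to: SessionState): void {
+    const session = this.get(id);
+    if (session.state !== from) {
+      throw new Error(`Cannot move ${id} from ${session.state} to ${to}`);
+    }
+    session.state = to;
+  }
+}
+
+const researchAgent: AgentConfig = {
+  name: "research",
+  tools: ["web_search", "write_file", "read_file"],
+  context: "You are researching a book.",
+};
+const chatAgent: AgentConfig = {
+  name: "chat",
+  tools: ["read_library", "publish_to_feed", "web_search"],
+  context: "You help the user with their reading.",
+};
+
+const orchestrator = new AgentOrchestrator();
+const researchSession = orchestrator.startAgent(researchAgent);
+const chatSession = orchestrator.startAgent(chatAgent);
+orchestrator.pause(researchSession);
+console.log(orchestrator.stateOf(researchSession), orchestrator.stateOf(chatSession));
+```
+
+Adding a new agent type in this sketch means writing another `AgentConfig` value; the lifecycle code does not change.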
+
+
+
+## Agent-to-UI Communication
+
+When agents take actions, the UI should reflect them immediately. The user should see what the agent did.
+
+**Pattern 1: Shared Data Store (Recommended)**
+
+Agent writes through the same service the UI observes:
+
+```swift
+// Shared service
+class BookLibraryService: ObservableObject {
+ static let shared = BookLibraryService()
+ @Published var books: [Book] = []
+ @Published var feedItems: [FeedItem] = []
+
+ func addFeedItem(_ item: FeedItem) {
+ feedItems.append(item)
+ persist()
+ }
+}
+
+// Agent tool writes through shared service
+tool("publish_to_feed", async ({ bookId, content, headline }) => {
+ let item = FeedItem(bookId: bookId, content: content, headline: headline)
+ BookLibraryService.shared.addFeedItem(item) // Same service UI uses
+ return { text: "Published to feed" }
+})
+
+// UI observes the same service
+struct FeedView: View {
+ @ObservedObject var library = BookLibraryService.shared // Shared singleton: observe it, don't own it
+
+ var body: some View {
+ List(library.feedItems) { item in
+ FeedItemRow(item: item)
+ // Automatically updates when agent adds items
+ }
+ }
+}
+```
+
+**Pattern 2: File System Observation**
+
+For file-based data, watch the file system:
+
+```swift
+class ResearchWatcher: ObservableObject {
+ @Published var files: [URL] = []
+ private var watcher: DirectoryWatcher?
+
+ func watch(bookId: String) {
+ let path = documentsURL.appendingPathComponent("Research/\(bookId)")
+
+ watcher = DirectoryWatcher(path: path) { [weak self] in
+ self?.reload(from: path)
+ }
+
+ reload(from: path)
+ }
+}
+
+// Agent writes files
+tool("write_file", { path, content }) -> {
+ writeFile(documentsURL.appendingPathComponent(path), content)
+ // DirectoryWatcher triggers UI update automatically
+}
+```
+
+**Pattern 3: Event Bus (Cross-Component)**
+
+For complex apps with multiple independent components:
+
+```typescript
+// Shared event bus
+const agentEvents = new EventEmitter();
+
+// Agent tool emits events
+tool("publish_to_feed", async ({ content }) => {
+ const item = await feedService.add(content);
+ agentEvents.emit('feed:new-item', item);
+ return { text: "Published" };
+});
+
+// UI components subscribe
+function FeedView() {
+ const [items, setItems] = useState([]);
+
+ useEffect(() => {
+ const handler = (item) => setItems(prev => [...prev, item]);
+ agentEvents.on('feed:new-item', handler);
+ return () => agentEvents.off('feed:new-item', handler);
+ }, []);
+
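+ // Placeholder rendering; swap in your own list component
+ return <ul>{items.map((item, i) => <li key={i}>{JSON.stringify(item)}</li>)}</ul>;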
+}
+```
+
+**What to avoid:**
+
+```swift
+// BAD: UI doesn't observe agent changes
+// Agent writes to database directly
+tool("publish_to_feed", { content }) {
+ database.insert("feed", content) // UI doesn't see this
+}
+
+// UI loads once at startup, never refreshes
+struct FeedView: View {
+ let items = database.query("feed") // Stale!
+}
+```
+
+
+
+## Model Tier Selection
+
+Different agents need different intelligence levels. Use the cheapest model that achieves the outcome.
+
+| Agent Type | Recommended Tier | Reasoning |
+|------------|-----------------|-----------|
+| Chat/Conversation | Balanced | Fast responses, good reasoning |
+| Research | Balanced | Tool loops, not ultra-complex synthesis |
+| Content Generation | Balanced | Creative but not synthesis-heavy |
+| Complex Analysis | Powerful | Multi-document synthesis, nuanced judgment |
+| Profile/Onboarding | Powerful | Photo analysis, complex pattern recognition |
+| Simple Queries | Fast | Quick lookups, simple transformations |
+
+**Implementation:**
+
+```swift
+enum ModelTier {
+ case fast // claude-3-haiku: Quick, cheap, simple tasks
+ case balanced // claude-3-sonnet: Good balance for most tasks
+ case powerful // claude-3-opus: Complex reasoning, synthesis
+}
+
+struct AgentConfig {
+ let modelTier: ModelTier
+ let tools: [AgentTool]
+ let systemPrompt: String
+}
+
+// Research agent: balanced tier
+let researchConfig = AgentConfig(
+ modelTier: .balanced,
+ tools: researchTools,
+ systemPrompt: researchPrompt
+)
+
+// Profile analysis: powerful tier (complex photo interpretation)
+let profileConfig = AgentConfig(
+ modelTier: .powerful,
+ tools: profileTools,
+ systemPrompt: profilePrompt
+)
+
+// Quick lookup: fast tier
+let lookupConfig = AgentConfig(
+ modelTier: .fast,
+ tools: [readLibrary],
+ systemPrompt: "Answer quick questions about the user's library."
+)
+```
+
+**Cost optimization strategies:**
+- Start with the balanced tier; upgrade only if quality is insufficient
+- Use fast tier for tool-heavy loops where each turn is simple
+- Reserve powerful tier for synthesis tasks (comparing multiple sources)
+- Consider token limits per turn to control costs
+
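The "start balanced, upgrade only if needed" strategy can be expressed as an escalation loop. The TypeScript sketch below is illustrative only: `runWithEscalation`, `TierAttempt`, and the `isGoodEnough` callback are hypothetical names, and the model call is passed in as a plain function rather than a real API client.

```typescript
// Hypothetical names throughout; the model call is injected as `run`.
type ModelTier = "fast" | "balanced" | "powerful";

const TIER_ORDER: ModelTier[] = ["fast", "balanced", "powerful"];

interface TierAttempt {
  tier: ModelTier; // tier that produced the accepted (or final) output
  output: string;
}

// Runs the task at startTier and moves up one tier at a time until
// isGoodEnough accepts the output or the top tier has been tried.
function runWithEscalation(
  startTier: ModelTier,
  run: (tier: ModelTier) => string,
  isGoodEnough: (output: string) => boolean
): TierAttempt {
  let index = TIER_ORDER.indexOf(startTier);
  let output = run(TIER_ORDER[index]);
  while (!isGoodEnough(output) && index < TIER_ORDER.length - 1) {
    index += 1;
    output = run(TIER_ORDER[index]);
  }
  return { tier: TIER_ORDER[index], output };
}
```

Each escalation costs an extra call, so this approach likely pays off only when most requests succeed at the lower tier.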
+
## Questions to Ask When Designing
@@ -212,4 +465,7 @@ tool("apply_pending", async () => {
4. **What decisions should be hardcoded?** (security boundaries, approval requirements)
5. **How does the agent verify its work?** (health checks, build verification)
6. **How does the agent recover from mistakes?** (git rollback, approval gates)
+7. **How does the UI know when agent changes state?** (shared store, file watching, events)
+8. **What model tier does each agent type need?** (fast, balanced, powerful)
+9. **How do agents share infrastructure?** (unified orchestrator, shared tools)
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/dynamic-context-injection.md b/plugins/compound-engineering/skills/agent-native-architecture/references/dynamic-context-injection.md
new file mode 100644
index 0000000..b801f3b
--- /dev/null
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/dynamic-context-injection.md
@@ -0,0 +1,338 @@
+
+How to inject dynamic runtime context into agent system prompts. The agent needs to know what exists in the app to know what it can work with. Static prompts aren't enough—the agent needs to see the same context the user sees.
+
+**Core principle:** The user's context IS the agent's context.
+
+
+
+## Why Dynamic Context Injection?
+
+A static system prompt tells the agent what it CAN do. Dynamic context tells it what it can do RIGHT NOW with the user's actual data.
+
+**The failure case:**
+```
+User: "Write a little thing about Catherine the Great in my reading feed"
+Agent: "What system are you referring to? I'm not sure what reading feed means."
+```
+
+The agent failed because it didn't know:
+- What books exist in the user's library
+- What the "reading feed" is
+- What tools it has to publish there
+
+**The fix:** Inject runtime context about app state into the system prompt.
+
+
+
+## The Context Injection Pattern
+
+Build your system prompt dynamically, including current app state:
+
+```swift
+func buildSystemPrompt() -> String {
+ // Gather current state
+ let availableBooks = libraryService.books
+ let recentActivity = analysisService.recentRecords(limit: 10)
+ let userProfile = profileService.currentProfile
+
+ return """
+ # Your Identity
+
+ You are a reading assistant for \(userProfile.name)'s library.
+
+ ## Available Books in User's Library
+
+ \(availableBooks.map { "- \"\($0.title)\" by \($0.author) (id: \($0.id))" }.joined(separator: "\n"))
+
+ ## Recent Reading Activity
+
+ \(recentActivity.map { "- Analyzed \"\($0.bookTitle)\": \($0.excerptPreview)" }.joined(separator: "\n"))
+
+ ## Your Capabilities
+
+ - **publish_to_feed**: Create insights that appear in the Feed tab
+ - **read_library**: View books, highlights, and analyses
+ - **web_search**: Search the internet for research
+ - **write_file**: Save research to Documents/Research/{bookId}/
+
+ When the user mentions "the feed" or "reading feed", they mean the Feed tab
+ where insights appear. Use `publish_to_feed` to create content there.
+ """
+}
+```
+
+
+
+## What Context to Inject
+
+### 1. Available Resources
+What data/files exist that the agent can access?
+
+```markdown
+## Available in User's Library
+
+Books:
+- "Moby Dick" by Herman Melville (id: book_123)
+- "1984" by George Orwell (id: book_456)
+
+Research folders:
+- Documents/Research/book_123/ (3 files)
+- Documents/Research/book_456/ (1 file)
+```
+
+### 2. Current State
+What has the user done recently? What's the current context?
+
+```markdown
+## Recent Activity
+
+- 2 hours ago: Highlighted passage in "1984" about surveillance
+- Yesterday: Completed research on "Moby Dick" whale symbolism
+- This week: Added 3 new books to library
+```
+
+### 3. Capabilities Mapping
+What tool maps to what UI feature? Use the user's language.
+
+```markdown
+## What You Can Do
+
+| User Says | You Should Use | Result |
+|-----------|----------------|--------|
+| "my feed" / "reading feed" | `publish_to_feed` | Creates insight in Feed tab |
+| "my library" / "my books" | `read_library` | Shows their book collection |
+| "research this" | `web_search` + `write_file` | Saves to Research folder |
+| "my profile" | `read_file("profile.md")` | Shows reading profile |
+```
+
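One way to keep the capability table in sync with the actual tools is to render it from data at prompt-build time. The sketch below assumes a hypothetical `CapabilityMapping` shape and a `renderCapabilityTable` helper; neither is part of the original skill.

```typescript
// Hypothetical shape describing how user language maps to tools.
interface CapabilityMapping {
  userPhrases: string[]; // what the user is likely to say
  tools: string[];       // tool names the agent should reach for
  result: string;        // what the user will see afterwards
}

// Renders the mappings as the markdown table injected into the system prompt.
function renderCapabilityTable(mappings: CapabilityMapping[]): string {
  const header = [
    "| User Says | You Should Use | Result |",
    "|-----------|----------------|--------|",
  ];
  const rows = mappings.map((m) => {
    const phrases = m.userPhrases.map((p) => '"' + p + '"').join(" / ");
    const tools = m.tools.map((t) => "`" + t + "`").join(" + ");
    return "| " + phrases + " | " + tools + " | " + m.result + " |";
  });
  return header.concat(rows).join("\n");
}
```

Adding a new feature then means adding one entry to the list rather than editing prompt text by hand.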
+### 4. Domain Vocabulary
+Explain app-specific terms the user might use.
+
+```markdown
+## Vocabulary
+
+- **Feed**: The Feed tab showing reading insights and analyses
+- **Research folder**: Documents/Research/{bookId}/ where research is stored
+- **Reading profile**: A markdown file describing user's reading preferences
+- **Highlight**: A passage the user marked in a book
+```
+
+
+
+## Implementation Patterns
+
+### Pattern 1: Service-Based Injection (Swift/iOS)
+
+```swift
+struct AgentContextBuilder { // struct: provides the memberwise initializer used below
+ let libraryService: BookLibraryService
+ let profileService: ReadingProfileService
+ let activityService: ActivityService
+
+ func buildContext() -> String {
+ let books = libraryService.books
+ let profile = profileService.currentProfile
+ let activity = activityService.recent(limit: 10)
+
+ return """
+ ## Library (\(books.count) books)
+ \(formatBooks(books))
+
+ ## Profile
+ \(profile.summary)
+
+ ## Recent Activity
+ \(formatActivity(activity))
+ """
+ }
+
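+ private func formatBooks(_ books: [Book]) -> String {
+ books.map { "- \"\($0.title)\" (id: \($0.id))" }.joined(separator: "\n")
+ }
+
+ // Assumes each activity record (hypothetical `Activity` type) exposes a `summary` string
+ private func formatActivity(_ activity: [Activity]) -> String {
+ activity.map { "- \($0.summary)" }.joined(separator: "\n")
+ }
+}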
+
+// Usage in agent initialization
+let context = AgentContextBuilder(
+ libraryService: .shared,
+ profileService: .shared,
+ activityService: .shared
+).buildContext()
+
+let systemPrompt = basePrompt + "\n\n" + context
+```
+
+### Pattern 2: Hook-Based Injection (TypeScript)
+
+```typescript
+interface ContextProvider {
+ getContext(): Promise<string>;
+}
+
+class LibraryContextProvider implements ContextProvider {
+ async getContext(): Promise<string> {
+ const books = await db.books.list();
+ const recent = await db.activity.recent(10);
+
+ return `
+## Library
+${books.map(b => `- "${b.title}" (${b.id})`).join('\n')}
+
+## Recent
+${recent.map(r => `- ${r.description}`).join('\n')}
+ `.trim();
+ }
+}
+
+// Compose multiple providers
+async function buildSystemPrompt(providers: ContextProvider[]): Promise<string> {
+ const contexts = await Promise.all(providers.map(p => p.getContext()));
+ return [BASE_PROMPT, ...contexts].join('\n\n');
+}
+```
+
+### Pattern 3: Template-Based Injection
+
+```markdown
+# System Prompt Template (system-prompt.template.md)
+
+You are a reading assistant.
+
+## Available Books
+
+{{#each books}}
+- "{{title}}" by {{author}} (id: {{id}})
+{{/each}}
+
+## Capabilities
+
+{{#each capabilities}}
+- **{{name}}**: {{description}}
+{{/each}}
+
+## Recent Activity
+
+{{#each recentActivity}}
+- {{timestamp}}: {{description}}
+{{/each}}
+```
+
+```typescript
+// Render at runtime
+const prompt = Handlebars.compile(template)({
+ books: await libraryService.getBooks(),
+ capabilities: getCapabilities(),
+ recentActivity: await activityService.getRecent(10),
+});
+```
+
+
+
+## Context Freshness
+
+Context should be injected at agent initialization, and optionally refreshed during long sessions.
+
+**At initialization:**
+```swift
+// Always inject fresh context when starting an agent
+func startChatAgent() async -> AgentSession {
+ let context = await buildCurrentContext() // Fresh context
+ return await AgentOrchestrator.shared.startAgent(
+ config: ChatAgent.config,
+ systemPrompt: basePrompt + context
+ )
+}
+```
+
+**During long sessions (optional):**
+```swift
+// For long-running agents, provide a refresh tool
+tool("refresh_context", "Get current app state") { _ in
+ let books = libraryService.books
+ let recent = activityService.recent(limit: 10)
+ return """
+ Current library: \(books.count) books
+ Recent: \(recent.map { $0.summary }.joined(separator: ", "))
+ """
+}
+```
+
+**What NOT to do:**
+```swift
+// DON'T: Use stale context from app launch
+let cachedContext = appLaunchContext // Stale!
+// Books may have been added, activity may have changed
+```
+
+
+
+## Real-World Example: Every Reader
+
+The Every Reader app injects context for its chat agent:
+
+```swift
+func getChatAgentSystemPrompt() -> String {
+ // Get current library state
+ let books = BookLibraryService.shared.books
+ let analyses = BookLibraryService.shared.analysisRecords.prefix(10)
+ let profile = ReadingProfileService.shared.getProfileForSystemPrompt()
+
+ let bookList = books.map { book in
+ "- \"\(book.title)\" by \(book.author) (id: \(book.id))"
+ }.joined(separator: "\n")
+
+ let recentList = analyses.map { record in
+ let title = books.first { $0.id == record.bookId }?.title ?? "Unknown"
+ return "- From \"\(title)\": \"\(record.excerptPreview)\""
+ }.joined(separator: "\n")
+
+ return """
+ # Reading Assistant
+
+ You help the user with their reading and book research.
+
+ ## Available Books in User's Library
+
+ \(bookList.isEmpty ? "No books yet." : bookList)
+
+ ## Recent Reading Journal (Latest Analyses)
+
+ \(recentList.isEmpty ? "No analyses yet." : recentList)
+
+ ## Reading Profile
+
+ \(profile)
+
+ ## Your Capabilities
+
+ - **Publish to Feed**: Create insights using `publish_to_feed` that appear in the Feed tab
+ - **Library Access**: View books and highlights using `read_library`
+ - **Research**: Search web and save to Documents/Research/{bookId}/
+ - **Profile**: Read/update the user's reading profile
+
+ When the user asks you to "write something for their feed" or "add to my reading feed",
+ use the `publish_to_feed` tool with the relevant book_id.
+ """
+}
+```
+
+**Result:** When the user says "write a little thing about Catherine the Great in my reading feed", the agent:
+1. Sees "reading feed" → knows to use `publish_to_feed`
+2. Sees available books → finds the relevant book ID
+3. Creates appropriate content for the Feed tab
+
+
+
+## Context Injection Checklist
+
+Before launching an agent:
+- [ ] System prompt includes current resources (books, files, data)
+- [ ] Recent activity is visible to the agent
+- [ ] Capabilities are mapped to user vocabulary
+- [ ] Domain-specific terms are explained
+- [ ] Context is fresh (gathered at agent start, not cached)
+
+When adding new features:
+- [ ] New resources are included in context injection
+- [ ] New capabilities are documented in system prompt
+- [ ] User vocabulary for the feature is mapped
+
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/mcp-tool-design.md b/plugins/compound-engineering/skills/agent-native-architecture/references/mcp-tool-design.md
index f7133da..d1afe83 100644
--- a/plugins/compound-engineering/skills/agent-native-architecture/references/mcp-tool-design.md
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/mcp-tool-design.md
@@ -303,9 +303,187 @@ Use your judgment about importance ratings.
```
+
+## Dynamic Capability Discovery vs Static Tool Mapping
+
+**This pattern is specifically for agent-native apps** where you want the agent to have full access to an external API—the same access a user would have. It follows the core agent-native principle: "Whatever the user can do, the agent can do."
+
+If you're building a constrained agent with limited capabilities, static tool mapping may be intentional. But for agent-native apps integrating with HealthKit, HomeKit, GraphQL, or similar APIs:
+
+**Static Tool Mapping (Anti-pattern for Agent-Native):**
+Building an individual tool for each API capability means the tool set is always out of date and limits the agent to what you anticipated.
+
+```typescript
+// ❌ Static: Every API type needs a hardcoded tool
+tool("read_steps", async ({ startDate, endDate }) => {
+ return healthKit.query(HKQuantityType.stepCount, startDate, endDate);
+});
+
+tool("read_heart_rate", async ({ startDate, endDate }) => {
+ return healthKit.query(HKQuantityType.heartRate, startDate, endDate);
+});
+
+tool("read_sleep", async ({ startDate, endDate }) => {
+ return healthKit.query(HKCategoryType.sleepAnalysis, startDate, endDate);
+});
+
+// When HealthKit adds glucose tracking... you need a code change
+```
+
+**Dynamic Capability Discovery (Preferred):**
+Build a meta-tool that discovers what's available, and a generic tool that can access anything.
+
+```typescript
+// ✅ Dynamic: Agent discovers and uses any capability
+
+// Discovery tool - returns what's available at runtime
+tool("list_available_capabilities", async () => {
+ const quantityTypes = await healthKit.availableQuantityTypes();
+ const categoryTypes = await healthKit.availableCategoryTypes();
+
+ return {
+ text: `Available health metrics:\n` +
+ `Quantity types: ${quantityTypes.join(", ")}\n` +
+ `Category types: ${categoryTypes.join(", ")}\n` +
+ `\nUse read_health_data with any of these types.`
+ };
+});
+
+// Generic access tool - type is a string, API validates
+tool("read_health_data", {
+ dataType: z.string(), // NOT z.enum - let HealthKit validate
+ startDate: z.string(),
+ endDate: z.string(),
+ aggregation: z.enum(["sum", "average", "samples"]).optional()
+}, async ({ dataType, startDate, endDate, aggregation }) => {
+ // HealthKit validates the type, returns helpful error if invalid
+ const result = await healthKit.query(dataType, startDate, endDate, aggregation);
+ return { text: JSON.stringify(result, null, 2) };
+});
+```
+
+**When to Use Each Approach:**
+
+| Dynamic (Agent-Native) | Static (Constrained Agent) |
+|------------------------|---------------------------|
+| Agent should access anything user can | Agent has intentionally limited scope |
+| External API with many endpoints (HealthKit, HomeKit, GraphQL) | Internal domain with fixed operations |
+| API evolves independently of your code | Tightly coupled domain logic |
+| You want full action parity | You want strict guardrails |
+
+**The agent-native default is Dynamic.** Only use Static when you're intentionally limiting the agent's capabilities.
+
+**Complete Dynamic Pattern:**
+
+```swift
+// 1. Discovery tool: What can I access?
+tool("list_health_types", "Get available health data types") { _ in
+ let store = HKHealthStore()
+
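+ // Note: HealthKit identifier types are not CaseIterable; `allCases` here
+ // assumes an app-defined extension that enumerates the identifiers.
+ let quantityTypes = HKQuantityTypeIdentifier.allCases.map { $0.rawValue }
+ let categoryTypes = HKCategoryTypeIdentifier.allCases.map { $0.rawValue }
+ let characteristicTypes = HKCharacteristicTypeIdentifier.allCases.map { $0.rawValue }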
+
+ return ToolResult(text: """
+ Available HealthKit types:
+
+ ## Quantity Types (numeric values)
+ \(quantityTypes.joined(separator: ", "))
+
+ ## Category Types (categorical data)
+ \(categoryTypes.joined(separator: ", "))
+
+ ## Characteristic Types (user info)
+ \(characteristicTypes.joined(separator: ", "))
+
+ Use read_health_data or write_health_data with any of these.
+ """)
+}
+
+// 2. Generic read: Access any type by name
+tool("read_health_data", "Read any health metric", {
+ dataType: z.string().describe("Type name from list_health_types"),
+ startDate: z.string(),
+ endDate: z.string()
+}) { request in
+ // Let HealthKit validate the type name (nil means the identifier is unknown)
+ let sampleType: HKSampleType? =
+ HKObjectType.quantityType(forIdentifier: HKQuantityTypeIdentifier(rawValue: request.dataType))
+ ?? HKObjectType.categoryType(forIdentifier: HKCategoryTypeIdentifier(rawValue: request.dataType))
+ guard let type = sampleType else {
+ return ToolResult(
+ text: "Unknown type: \(request.dataType). Use list_health_types to see available types.",
+ isError: true
+ )
+ }
+
+ let samples = try await healthStore.querySamples(type: type, start: request.startDate, end: request.endDate)
+ return ToolResult(text: samples.formatted())
+}
+
+// 3. Context injection: Tell agent what's available in system prompt
+func buildSystemPrompt() -> String {
+ let availableTypes = healthService.getAuthorizedTypes()
+
+ return """
+ ## Available Health Data
+
+ You have access to these health metrics:
+ \(availableTypes.map { "- \($0)" }.joined(separator: "\n"))
+
+ Use read_health_data with any type above. For new types not listed,
+ use list_health_types to discover what's available.
+ """
+}
+```
+
+**Benefits:**
+- Agent can use any API capability, including ones added after your code shipped
+- API is the validator, not your enum definition
+- Smaller tool surface (2-3 tools vs N tools)
+- Agent naturally discovers capabilities by asking
+- Works with any API that has introspection (HealthKit, GraphQL, OpenAPI)
+
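The discovery-plus-generic-access shape is not specific to HealthKit. The following self-contained TypeScript sketch shows the same two tools over any API that can list its own types; `IntrospectableApi`, `makeDiscoveryTools`, and `makeFakeApi` are hypothetical names, and the in-memory fake stands in for a real external API.

```typescript
// Stand-in for an external API with introspection; not a real HealthKit binding.
interface IntrospectableApi {
  listTypes(): string[];
  query(type: string): number[]; // throws on unknown types
}

interface ToolOutput {
  text: string;
  isError?: boolean;
}

function makeDiscoveryTools(api: IntrospectableApi) {
  return {
    // Discovery tool: reports whatever the API exposes right now.
    list_available_capabilities(): ToolOutput {
      return { text: "Available types: " + api.listTypes().join(", ") };
    },
    // Generic access tool: the type is a plain string and the API validates it.
    read_data(input: { dataType: string }): ToolOutput {
      try {
        return { text: JSON.stringify(api.query(input.dataType)) };
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        return {
          text: reason + ". Use list_available_capabilities to see valid types.",
          isError: true,
        };
      }
    },
  };
}

// In-memory fake used only to exercise the sketch.
function makeFakeApi(data: Record<string, number[] | undefined>): IntrospectableApi {
  return {
    listTypes: () => Object.keys(data),
    query: (type) => {
      const samples = data[type];
      if (samples === undefined) throw new Error("Unknown type: " + type);
      return samples;
    },
  };
}
```

When the underlying API gains a new type, both tools pick it up without a code change, which is the property the static mapping lacks.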
+
+
+## CRUD Completeness
+
+Every data type the agent can create, it should be able to read, update, and delete. Incomplete CRUD = broken action parity.
+
+**Anti-pattern: Create-only tools**
+```typescript
+// ❌ Can create but not modify or delete
+tool("create_experiment", { hypothesis, variable, metric })
+tool("write_journal_entry", { content, author, tags })
+// User: "Delete that experiment" → Agent: "I can't do that"
+```
+
+**Correct: Full CRUD for each entity**
+```typescript
+// ✅ Complete CRUD
+tool("create_experiment", { hypothesis, variable, metric })
+tool("read_experiment", { id })
+tool("update_experiment", { id, updates: { hypothesis?, status?, endDate? } })
+tool("delete_experiment", { id })
+
+tool("create_journal_entry", { content, author, tags })
+tool("read_journal", { query?, dateRange?, author? })
+tool("update_journal_entry", { id, content, tags? })
+tool("delete_journal_entry", { id })
+```
+
+**The CRUD Audit:**
+For each entity type in your app, verify:
+- [ ] Create: Agent can create new instances
+- [ ] Read: Agent can query/search/list instances
+- [ ] Update: Agent can modify existing instances
+- [ ] Delete: Agent can remove instances
+
+If any operation is missing, users will eventually ask for it and the agent will fail.
+
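The audit can be partly automated when tools follow a consistent naming scheme. This TypeScript sketch assumes every tool is named `<operation>_<entity>`; `findMissingCrud` is a hypothetical helper. A tool named differently, such as `read_journal` for the `journal_entry` entity, would be reported as missing and would need an alias list.

```typescript
const CRUD_OPERATIONS = ["create", "read", "update", "delete"] as const;
type CrudOperation = (typeof CRUD_OPERATIONS)[number];

// Returns, per entity, the CRUD operations that have no matching tool.
// Assumes tools follow the `<operation>_<entity>` naming convention.
function findMissingCrud(
  entities: string[],
  toolNames: string[]
): Record<string, CrudOperation[] | undefined> {
  const missing: Record<string, CrudOperation[] | undefined> = {};
  for (const entity of entities) {
    const gaps = CRUD_OPERATIONS.filter(
      (op) => toolNames.indexOf(op + "_" + entity) === -1
    );
    if (gaps.length > 0) missing[entity] = gaps;
  }
  return missing;
}
```

Running a check like this in CI could catch a create-only entity before users do.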
+
## MCP Tool Design Checklist
+**Fundamentals:**
- [ ] Tool names describe capability, not use case
- [ ] Inputs are data, not decisions
- [ ] Outputs are rich (enough for agent to verify)
@@ -313,4 +491,16 @@ Use your judgment about importance ratings.
- [ ] No business logic in tool implementations
- [ ] Error states clearly communicated via `isError`
- [ ] Descriptions explain what the tool does, not when to use it
+
+**Dynamic Capability Discovery (for agent-native apps):**
+- [ ] For external APIs where agent should have full access, use dynamic discovery
+- [ ] Include a `list_*` or `discover_*` tool for each API surface
+- [ ] Use string inputs (not enums) when the API validates
+- [ ] Inject available capabilities into system prompt at runtime
+- [ ] Only use static tool mapping if intentionally limiting agent scope
+
+**CRUD Completeness:**
+- [ ] Every entity has create, read, update, delete operations
+- [ ] Every UI action has a corresponding agent tool
+- [ ] Test: "Can the agent undo what it just did?"
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/mobile-patterns.md b/plugins/compound-engineering/skills/agent-native-architecture/references/mobile-patterns.md
new file mode 100644
index 0000000..663f7d5
--- /dev/null
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/mobile-patterns.md
@@ -0,0 +1,658 @@
+
+Mobile agent-native apps face unique challenges: background execution limits, system permissions, network constraints, and cost sensitivity. This guide covers patterns for building robust agent experiences on iOS and Android.
+
+
+
+## Background Execution & Resumption
+
+Mobile apps can be suspended or terminated at any time. Agents must handle this gracefully.
+
+### The Challenge
+
+```
+User starts research agent
+ ↓
+Agent begins web search
+ ↓
+User switches to another app
+ ↓
+iOS suspends your app
+ ↓
+Agent is mid-execution... what happens?
+```
+
+### Checkpoint/Resume Pattern
+
+Save agent state before backgrounding, restore on foreground:
+
+```swift
+class AgentOrchestrator: ObservableObject {
+ @Published var activeSessions: [AgentSession] = []
+
+ // Called when app is about to background
+ func handleAppWillBackground() {
+ for session in activeSessions {
+ saveCheckpoint(session)
+ session.transition(to: .backgrounded)
+ }
+ }
+
+ // Called when app returns to foreground
+ func handleAppDidForeground() {
+ for session in activeSessions where session.state == .backgrounded {
+ if let checkpoint = loadCheckpoint(session.id) {
+ resumeFromCheckpoint(session, checkpoint)
+ }
+ }
+ }
+
+ private func saveCheckpoint(_ session: AgentSession) {
+ let checkpoint = AgentCheckpoint(
+ sessionId: session.id,
+ conversationHistory: session.messages,
+ pendingToolCalls: session.pendingToolCalls,
+ partialResults: session.partialResults,
+ timestamp: Date()
+ )
+ storage.save(checkpoint, for: session.id)
+ }
+
+ private func resumeFromCheckpoint(_ session: AgentSession, _ checkpoint: AgentCheckpoint) {
+ session.messages = checkpoint.conversationHistory
+ session.pendingToolCalls = checkpoint.pendingToolCalls
+
+ // Resume execution if there were pending tool calls
+ if !checkpoint.pendingToolCalls.isEmpty {
+ session.transition(to: .running)
+ Task { await executeNextTool(session) }
+ }
+ }
+}
+```
+
+### State Machine for Agent Lifecycle
+
+```swift
+enum AgentState {
+ case idle // Not running
+ case running // Actively executing
+ case waitingForUser // Paused, waiting for user input
+ case backgrounded // App backgrounded, state saved
+ case completed // Finished successfully
+ case failed(Error) // Finished with error
+}
+
+class AgentSession: ObservableObject {
+ @Published var state: AgentState = .idle
+
+ func transition(to newState: AgentState) {
+ // Pattern-match on (from, to); a dictionary keyed by AgentState would not
+ // compile because `.failed(Error)` prevents Hashable synthesis.
+ switch (state, newState) {
+ case (.idle, .running),
+ (.running, .waitingForUser), (.running, .backgrounded),
+ (.running, .completed), (.running, .failed),
+ (.waitingForUser, .running), (.waitingForUser, .backgrounded),
+ (.backgrounded, .running), (.backgrounded, .completed):
+ state = newState
+ default:
+ logger.warning("Invalid transition: \(state) → \(newState)")
+ }
+ }
+}
+```
+
+### Background Task Extension (iOS)
+
+Request extra time when backgrounded during critical operations:
+
+```swift
+class AgentOrchestrator {
+ private var backgroundTask: UIBackgroundTaskIdentifier = .invalid
+
+ func handleAppWillBackground() {
+ // Request extra time for saving state
+ backgroundTask = UIApplication.shared.beginBackgroundTask { [weak self] in
+ self?.endBackgroundTask()
+ }
+
+ // Save all checkpoints
+ Task {
+ for session in activeSessions {
+ await saveCheckpoint(session)
+ }
+ endBackgroundTask()
+ }
+ }
+
+ private func endBackgroundTask() {
+ if backgroundTask != .invalid {
+ UIApplication.shared.endBackgroundTask(backgroundTask)
+ backgroundTask = .invalid
+ }
+ }
+}
+```
+
+### User Communication
+
+Let users know what's happening:
+
+```swift
+struct AgentStatusView: View {
+ @ObservedObject var session: AgentSession
+
+ var body: some View {
+ switch session.state {
+ case .backgrounded:
+ Label("Paused (app in background)", systemImage: "pause.circle")
+ .foregroundColor(.orange)
+ case .running:
+ Label("Working...", systemImage: "ellipsis.circle")
+ .foregroundColor(.blue)
+ case .waitingForUser:
+ Label("Waiting for your input", systemImage: "person.circle")
+ .foregroundColor(.green)
+ // ...
+ }
+ }
+}
+```
+
+
+
+## Permission Handling
+
+Mobile agents may need access to system resources. Handle permission requests gracefully.
+
+### Common Permissions
+
+| Resource | iOS Permission | Use Case |
+|----------|---------------|----------|
+| Photo Library | PHPhotoLibrary | Profile generation from photos |
+| Files | Document picker | Reading user documents |
+| Camera | AVCaptureDevice | Scanning book covers |
+| Location | CLLocationManager | Location-aware recommendations |
+| Network | (automatic) | Web search, API calls |
+
+### Permission-Aware Tools
+
+Check permissions before executing:
+
+```swift
+struct PhotoTools {
+ static func readPhotos() -> AgentTool {
+ tool(
+ name: "read_photos",
+ description: "Read photos from the user's photo library",
+ parameters: [
+ "limit": .number("Maximum photos to read"),
+ "dateRange": .string("Date range filter").optional()
+ ],
+ execute: { params, context in
+ // Check permission first
+ let status = await PHPhotoLibrary.requestAuthorization(for: .readWrite)
+
+ switch status {
+ case .authorized, .limited:
+ // Proceed with reading photos
+ let photos = await fetchPhotos(params)
+ return ToolResult(text: "Found \(photos.count) photos", images: photos)
+
+ case .denied, .restricted:
+ return ToolResult(
+ text: "Photo access needed. Please grant permission in Settings → Privacy → Photos.",
+ isError: true
+ )
+
+ case .notDetermined:
+ return ToolResult(
+ text: "Photo permission required. Please try again.",
+ isError: true
+ )
+
+ @unknown default:
+ return ToolResult(text: "Unknown permission status", isError: true)
+ }
+ }
+ )
+ }
+}
+```
+
+### Graceful Degradation
+
+When permissions aren't granted, offer alternatives:
+
+```swift
+func readPhotos() async -> ToolResult {
+ let status = PHPhotoLibrary.authorizationStatus(for: .readWrite)
+
+ switch status {
+ case .denied, .restricted:
+ // Suggest alternative
+ return ToolResult(
+ text: """
+ I don't have access to your photos. You can either:
+ 1. Grant access in Settings → Privacy → Photos
+ 2. Share specific photos directly in our chat
+
+ Would you like me to help with something else instead?
+ """,
+ isError: false // Not a hard error, just a limitation
+ )
+ // ...
+ }
+}
+```
+
+### Permission Request Timing
+
+Don't request permissions until needed:
+
+```swift
+// BAD: Request all permissions at launch
+func applicationDidFinishLaunching() {
+ requestPhotoAccess()
+ requestCameraAccess()
+ requestLocationAccess()
+ // User is overwhelmed with permission dialogs
+}
+
+// GOOD: Request when the feature is used
+tool("analyze_book_cover", "Scan a book cover with the camera") { request in
+ // Only request camera access when user tries to scan a cover
+ let granted = await AVCaptureDevice.requestAccess(for: .video)
+ if granted {
+ return await scanCover(request.image)
+ } else {
+ return ToolResult(text: "Camera access needed for book scanning")
+ }
+}
+```
+
+
+
+## Cost-Aware Design
+
+Mobile users may be on cellular data or concerned about API costs. Design agents to be efficient.
+
+### Model Tier Selection
+
+Use the cheapest model that achieves the outcome:
+
+```swift
+enum ModelTier {
+ case fast // claude-3-haiku: ~$0.25/1M input tokens
+ case balanced // claude-3-sonnet: ~$3/1M input tokens
+ case powerful // claude-3-opus: ~$15/1M input tokens
+
+ var modelId: String {
+ switch self {
+ case .fast: return "claude-3-haiku-20240307"
+ case .balanced: return "claude-3-sonnet-20240229"
+ case .powerful: return "claude-3-opus-20240229"
+ }
+ }
+}
+
+// Match model to task complexity
+let agentConfigs: [AgentType: ModelTier] = [
+ .quickLookup: .fast, // "What's in my library?"
+ .chatAssistant: .balanced, // General conversation
+ .researchAgent: .balanced, // Web search + synthesis
+ .profileGenerator: .powerful, // Complex photo analysis
+ .introductionWriter: .balanced,
+]
+```
+
+### Token Budgets
+
+Limit tokens per agent session:
+
+```swift
+struct AgentConfig {
+ let modelTier: ModelTier
+ let maxInputTokens: Int
+ let maxOutputTokens: Int
+ let maxTurns: Int
+
+ static let research = AgentConfig(
+ modelTier: .balanced,
+ maxInputTokens: 50_000,
+ maxOutputTokens: 4_000,
+ maxTurns: 20
+ )
+
+ static let quickChat = AgentConfig(
+ modelTier: .fast,
+ maxInputTokens: 10_000,
+ maxOutputTokens: 1_000,
+ maxTurns: 5
+ )
+}
+
+class AgentSession {
+ var totalTokensUsed: Int = 0
+
+ func checkBudget() -> Bool {
+ if totalTokensUsed > config.maxInputTokens {
+ transition(to: .failed(AgentError.budgetExceeded))
+ return false
+ }
+ return true
+ }
+}
+```
+
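The `checkBudget` method above compares a single running total against the input limit. One possible refinement, sketched here in TypeScript under assumed names (`BudgetTracker`, `TurnUsage`), tracks input tokens, output tokens, and turns separately and reports which limit was hit.

```typescript
interface TokenBudget {
  maxInputTokens: number;
  maxOutputTokens: number;
  maxTurns: number;
}

interface TurnUsage {
  inputTokens: number;
  outputTokens: number;
}

type ExceededLimit = "input" | "output" | "turns" | null;

// Hypothetical tracker: accumulates usage per turn and checks each limit independently.
class BudgetTracker {
  private inputUsed = 0;
  private outputUsed = 0;
  private turns = 0;

  constructor(private readonly budget: TokenBudget) {}

  // Call once after every model turn with the usage reported for that turn.
  record(usage: TurnUsage): void {
    this.inputUsed += usage.inputTokens;
    this.outputUsed += usage.outputTokens;
    this.turns += 1;
  }

  // Returns the first limit that has been exceeded, or null if within budget.
  exceededLimit(): ExceededLimit {
    if (this.inputUsed > this.budget.maxInputTokens) return "input";
    if (this.outputUsed > this.budget.maxOutputTokens) return "output";
    if (this.turns > this.budget.maxTurns) return "turns";
    return null;
  }
}
```

Knowing which limit tripped lets the session produce a more useful message than a generic budget failure.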
+### Network-Aware Execution
+
+Defer heavy operations to WiFi:
+
+```swift
+import Network
+
+class NetworkMonitor: ObservableObject {
+ @Published var isConnected: Bool = false
+ @Published var isOnWiFi: Bool = false
+ @Published var isExpensive: Bool = false // Cellular or hotspot
+
+ private let monitor = NWPathMonitor()
+
+ func startMonitoring() {
+ monitor.pathUpdateHandler = { [weak self] path in
+ DispatchQueue.main.async {
+ self?.isConnected = path.status == .satisfied
+ self?.isOnWiFi = path.usesInterfaceType(.wifi)
+ self?.isExpensive = path.isExpensive
+ }
+ }
+ monitor.start(queue: .global())
+ }
+}
+
+class AgentOrchestrator {
+ let network = NetworkMonitor() // plain property: @ObservedObject is meant for SwiftUI views
+
+ func startResearchAgent(for book: Book) async {
+ if network.isExpensive {
+ // Warn user or defer
+ let proceed = await showAlert(
+ "Research uses data",
+ message: "This will use approximately 1-2 MB of cellular data. Continue?"
+ )
+ if !proceed { return }
+ }
+
+ // Proceed with research
+ await runAgent(ResearchAgent.create(book: book))
+ }
+}
+```
+
+### Batch API Calls
+
+Combine multiple small requests:
+
+```swift
+// BAD: Many small API calls
+for book in books {
+ await agent.chat("Summarize \(book.title)")
+}
+
+// GOOD: Batch into one request
+let bookList = books.map { $0.title }.joined(separator: ", ")
+await agent.chat("Summarize each of these books briefly: \(bookList)")
+```
+
+### Caching
+
+Cache expensive operations:
+
+```swift
+class ResearchCache {
+ private var cache: [String: CachedResearch] = [:]
+
+ func getCachedResearch(for bookId: String) -> CachedResearch? {
+ guard let cached = cache[bookId] else { return nil }
+
+ // Expire after 24 hours
+ if Date().timeIntervalSince(cached.timestamp) > 86400 {
+ cache.removeValue(forKey: bookId)
+ return nil
+ }
+
+ return cached
+ }
+
+ func cacheResearch(_ research: Research, for bookId: String) {
+ cache[bookId] = CachedResearch(
+ research: research,
+ timestamp: Date()
+ )
+ }
+}
+
+// In research tool
+tool("web_search", "Search the web for research on a book") { request in
+ // Check cache first
+ if let cached = cache.getCachedResearch(for: request.bookId) {
+ return ToolResult(text: cached.research.summary, cached: true)
+ }
+
+ // Otherwise, perform search
+ let results = await webSearch(request.query)
+ cache.cacheResearch(results, for: request.bookId)
+ return ToolResult(text: results.summary)
+}
+```
+
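The cache above keys only on `bookId`, so two different queries about the same book would share one entry. A minimal TypeScript sketch of a time-limited cache keyed on both values follows; `TtlCache` and `researchKey` are hypothetical names, and the clock is injectable so expiry can be tested without waiting.

```typescript
// Hypothetical TTL cache; `now` is injectable so expiry can be tested deterministically.
class TtlCache<T> {
  private entries: Record<string, { value: T; storedAt: number } | undefined> = {};

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = () => Date.now()
  ) {}

  // Returns the cached value, or undefined if it is missing or older than ttlMs.
  get(key: string): T | undefined {
    const entry = this.entries[key];
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      delete this.entries[key];
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries[key] = { value, storedAt: this.now() };
  }
}

// Builds a key from book and query so different searches do not collide.
function researchKey(bookId: string, query: string): string {
  return bookId + "::" + query.trim().toLowerCase();
}
```

An in-memory cache like this is lost when the app is terminated; persisting entries to disk would be a separate decision.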
+### Cost Visibility
+
+Show users what they're spending:
+
+```swift
+struct AgentCostView: View {
+ @ObservedObject var session: AgentSession
+
+ var body: some View {
+ VStack(alignment: .leading) {
+ Text("Session Stats")
+ .font(.headline)
+
+ HStack {
+ Label("\(session.turnCount) turns", systemImage: "arrow.2.squarepath")
+ Spacer()
+ Label(formatTokens(session.totalTokensUsed), systemImage: "text.word.spacing")
+ }
+
+ if let estimatedCost = session.estimatedCost {
+ Text("Est. cost: \(estimatedCost, format: .currency(code: "USD"))")
+ .font(.caption)
+ .foregroundColor(.secondary)
+ }
+ }
+ }
+}
+```
+
+
+
+## Offline Graceful Degradation
+
+Handle offline scenarios gracefully:
+
+```swift
+class ConnectivityAwareAgent {
+ let network = NetworkMonitor() // plain property: @ObservedObject is meant for SwiftUI views
+
+ func executeToolCall(_ toolCall: ToolCall) async -> ToolResult {
+ // Check if tool requires network
+ let requiresNetwork = ["web_search", "web_fetch", "call_api"]
+ .contains(toolCall.name)
+
+ if requiresNetwork && !network.isConnected {
+ return ToolResult(
+ text: """
+ I can't access the internet right now. Here's what I can do offline:
+ - Read your library and existing research
+ - Answer questions from cached data
+ - Write notes and drafts for later
+
+ Would you like me to try something that works offline?
+ """,
+ isError: false
+ )
+ }
+
+ return await executeOnline(toolCall)
+ }
+}
+```
+
+### Offline-First Tools
+
+Some tools should work entirely offline:
+
+```swift
+let offlineTools: Set = [
+ "read_file",
+ "write_file",
+ "list_files",
+ "read_library", // Local database
+ "search_local", // Local search
+]
+
+let onlineTools: Set = [
+ "web_search",
+ "web_fetch",
+ "publish_to_cloud",
+]
+
+let hybridTools: Set = [
+ "publish_to_feed", // Works offline, syncs later
+]
+```
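+
+One way to act on these categories is a small dispatch function that decides, per tool call, whether to run, run and queue a sync, or report the tool as unavailable. The sketch below is illustrative only: `ToolAvailability`, `Dispatch`, and `dispatch(tool:isConnected:)` are hypothetical names, and treating unknown tools as online-only is an assumption chosen as the conservative default.
+
+```swift
+import Foundation
+
+enum ToolAvailability { case offline, online, hybrid }
+
+// Hypothetical registry that mirrors the three sets above.
+let toolAvailability: [String: ToolAvailability] = [
+    "read_file": .offline, "write_file": .offline, "list_files": .offline,
+    "read_library": .offline, "search_local": .offline,
+    "web_search": .online, "web_fetch": .online, "publish_to_cloud": .online,
+    "publish_to_feed": .hybrid,
+]
+
+enum Dispatch: Equatable { case run, runAndQueueSync, unavailable }
+
+/// Decides how to handle a tool call for the current connectivity state.
+/// Tools missing from the registry are treated as online-only (an assumption).
+func dispatch(tool: String, isConnected: Bool) -> Dispatch {
+    switch toolAvailability[tool] ?? .online {
+    case .offline: return .run
+    case .hybrid:  return isConnected ? .run : .runAndQueueSync
+    case .online:  return isConnected ? .run : .unavailable
+    }
+}
+```
+
+Keeping the decision in one pure function makes it easy to unit test without a real network monitor.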
+
+### Queued Actions
+
+Queue actions that require connectivity:
+
+```swift
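+class OfflineQueue: ObservableObject {
+    @Published var pendingActions: [QueuedAction] = []
+
+    private let network: NetworkMonitor
+    private var cancellables = Set<AnyCancellable>()  // Requires `import Combine`
+
+    init(network: NetworkMonitor) {
+        self.network = network
+    }
+
+    func queue(_ action: QueuedAction) {
+        pendingActions.append(action)
+        persist()
+    }
+
+    func processWhenOnline() {
+        network.$isConnected
+            .filter { $0 }
+            .sink { [weak self] _ in
+                self?.processPendingActions()
+            }
+            .store(in: &cancellables)  // Keep the subscription alive; an unstored sink is cancelled immediately
+    }
+
+    private func processPendingActions() {
+        Task { @MainActor in
+            // Run in order; an action stays queued if it fails
+            for action in pendingActions {
+                do {
+                    try await execute(action)
+                    remove(action)
+                } catch {
+                    break  // Probably offline again; retry on the next reconnect
+                }
+            }
+        }
+    }
+}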
+```
+
+
+
+## Battery-Aware Execution
+
+Respect device battery state:
+
+```swift
+class BatteryMonitor: ObservableObject {
+    @Published var batteryLevel: Float = 1.0
+    @Published var isCharging: Bool = false
+    @Published var isLowPowerMode: Bool = false
+
+    var shouldDeferHeavyWork: Bool {
+        return batteryLevel < 0.2 && !isCharging
+    }
+
+    func startMonitoring() {
+        UIDevice.current.isBatteryMonitoringEnabled = true
+        refresh()  // Read initial values; notifications only fire on change
+
+        let names: [Notification.Name] = [
+            UIDevice.batteryLevelDidChangeNotification,
+            UIDevice.batteryStateDidChangeNotification,
+            .NSProcessInfoPowerStateDidChange
+        ]
+        for name in names {
+            NotificationCenter.default.addObserver(
+                forName: name,
+                object: nil,
+                queue: .main
+            ) { [weak self] _ in
+                self?.refresh()
+            }
+        }
+    }
+
+    private func refresh() {
+        let device = UIDevice.current
+        batteryLevel = device.batteryLevel
+        // Plugged in, whether still charging or already full
+        isCharging = device.batteryState == .charging || device.batteryState == .full
+        isLowPowerMode = ProcessInfo.processInfo.isLowPowerModeEnabled
+    }
+}
+
+class AgentOrchestrator {
+ let battery = BatteryMonitor()  // Plain property: @ObservedObject only takes effect inside SwiftUI views
+
+ func startAgent(_ config: AgentConfig) async {
+ if battery.shouldDeferHeavyWork && config.isHeavy {
+ let proceed = await showAlert(
+ "Low Battery",
+ message: "This task uses significant battery. Continue or defer until charging?"
+ )
+ if !proceed { return }
+ }
+
+ // Adjust model tier based on battery
+ let adjustedConfig = battery.isLowPowerMode
+ ? config.withModelTier(.fast)
+ : config
+
+ await runAgent(adjustedConfig)
+ }
+}
+```
+
+
+
+## Mobile Agent-Native Checklist
+
+**Background Execution:**
+- [ ] Checkpoint/resume implemented for all agent sessions
+- [ ] State machine for agent lifecycle (idle, running, backgrounded, etc.)
+- [ ] Background task extension for critical saves
+- [ ] User-visible status for backgrounded agents
+
+**Permissions:**
+- [ ] Permissions requested only when needed, not at launch
+- [ ] Graceful degradation when permissions denied
+- [ ] Clear error messages with Settings deep links
+- [ ] Alternative paths when permissions unavailable
+
+**Cost Awareness:**
+- [ ] Model tier matched to task complexity
+- [ ] Token budgets per session
+- [ ] Network-aware (defer heavy work to WiFi)
+- [ ] Caching for expensive operations
+- [ ] Cost visibility to users
+
+**Offline Handling:**
+- [ ] Offline-capable tools identified
+- [ ] Graceful degradation for online-only features
+- [ ] Action queue for sync when online
+- [ ] Clear user communication about offline state
+
+**Battery Awareness:**
+- [ ] Battery monitoring for heavy operations
+- [ ] Low power mode detection
+- [ ] Defer or downgrade based on battery state
+
diff --git a/plugins/compound-engineering/skills/agent-native-architecture/references/shared-workspace-architecture.md b/plugins/compound-engineering/skills/agent-native-architecture/references/shared-workspace-architecture.md
new file mode 100644
index 0000000..1434733
--- /dev/null
+++ b/plugins/compound-engineering/skills/agent-native-architecture/references/shared-workspace-architecture.md
@@ -0,0 +1,680 @@
+
+Agents and users should work in the same data space, not separate sandboxes. When the agent writes a file, the user can see it. When the user edits something, the agent can read the changes. This creates transparency, enables collaboration, and eliminates the need for sync layers.
+
+**Core principle:** The agent operates in the same filesystem as the user, not a walled garden.
+
+
+
+## Why Shared Workspace?
+
+### The Sandbox Anti-Pattern
+
+Many agent implementations isolate the agent:
+
+```
+┌─────────────────┐      ┌─────────────────┐
+│   User Space    │      │   Agent Space   │
+├─────────────────┤      ├─────────────────┤
+│ Documents/      │      │ agent_output/   │
+│ user_files/     │  ←→  │ temp_files/     │
+│ settings.json   │ sync │ cache/          │
+└─────────────────┘      └─────────────────┘
+```
+
+Problems:
+- Need a sync layer to move data between spaces
+- User can't easily inspect agent work
+- Agent can't build on user contributions
+- Duplication of state
+- Complexity in keeping spaces consistent
+
+### The Shared Workspace Pattern
+
+```
+┌─────────────────────────────────────────┐
+│ Shared Workspace │
+├─────────────────────────────────────────┤
+│ Documents/ │
+│ ├── Research/ │
+│ │ └── {bookId}/ ← Agent writes │
+│ │ ├── full_text.txt │
+│ │ ├── introduction.md ← User can edit │
+│ │ └── sources/ │
+│ ├── Chats/ ← Both read/write │
+│ └── profile.md ← Agent generates, user refines │
+└─────────────────────────────────────────┘
+ ↑ ↑
+ User Agent
+ (UI) (Tools)
+```
+
+Benefits:
+- Users can inspect, edit, and extend agent work
+- Agents can build on user contributions
+- No synchronization layer needed
+- Complete transparency
+- Single source of truth
+
+
+
+## Designing Your Shared Workspace
+
+### Structure by Domain
+
+Organize by what the data represents, not who created it:
+
+```
+Documents/
+├── Research/
+│   └── {bookId}/
+│       ├── full_text.txt          # Agent downloads
+│       ├── introduction.md        # Agent generates, user can edit
+│       ├── notes.md               # User adds, agent can read
+│       └── sources/
+│           └── {source}.md        # Agent gathers
+├── Chats/
+│   └── {conversationId}.json      # Both read/write
+├── Exports/
+│   └── {date}/                    # Agent generates for user
+└── profile.md                     # Agent generates from photos
+```
+
+### Don't Structure by Actor
+
+```
+# BAD - Separates by who created it
+Documents/
+├── user_created/
+│   └── notes.md
+├── agent_created/
+│   └── research.md
+└── system/
+    └── config.json
+```
+
+This creates artificial boundaries and makes collaboration harder.
+
+### Use Conventions for Metadata
+
+If you need to track who created/modified something:
+
+```markdown
+
+---
+created_by: agent
+created_at: 2024-01-15
+last_modified_by: user
+last_modified_at: 2024-01-16
+---
+
+# Introduction to Moby Dick
+
+This personalized introduction was generated by your reading assistant
+and refined by you on January 16th.
+```
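+
+To make this convention usable from tools, the agent side needs a way to read the metadata back. The following is a minimal sketch, assuming the block is delimited by `---` lines and holds flat `key: value` pairs as in the example above; `parseFrontMatter` is a hypothetical helper, not a full YAML parser.
+
+```swift
+import Foundation
+
+/// Parses a leading `---` front-matter block into key/value pairs.
+/// Returns an empty dictionary when the document has no (or an unterminated) block.
+func parseFrontMatter(_ document: String) -> [String: String] {
+    let lines = document.components(separatedBy: "\n")
+    // Skip leading blank lines, then require an opening `---`.
+    guard let start = lines.firstIndex(where: { !$0.trimmingCharacters(in: .whitespaces).isEmpty }),
+          lines[start].trimmingCharacters(in: .whitespaces) == "---" else {
+        return [:]
+    }
+
+    var metadata: [String: String] = [:]
+    for line in lines[(start + 1)...] {
+        let trimmed = line.trimmingCharacters(in: .whitespaces)
+        if trimmed == "---" { return metadata }  // Closing delimiter reached
+        guard let colon = trimmed.firstIndex(of: ":") else { continue }
+        let key = String(trimmed[..<colon]).trimmingCharacters(in: .whitespaces)
+        let value = String(trimmed[trimmed.index(after: colon)...]).trimmingCharacters(in: .whitespaces)
+        metadata[key] = value
+    }
+    return [:]  // No closing `---`: treat the document as having no metadata
+}
+```
+
+A write tool could then check `parseFrontMatter(content)["last_modified_by"] == "user"` before deciding whether to overwrite a file.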
+
+
+
+## File Tools for Shared Workspace
+
+Give the agent the same file primitives the app uses:
+
+```swift
+// iOS/Swift implementation
+struct FileTools {
+ static func readFile() -> AgentTool {
+ tool(
+ name: "read_file",
+ description: "Read a file from the user's documents",
+ parameters: ["path": .string("File path relative to Documents/")],
+ execute: { params in
+ let path = params["path"] as! String
+ let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
+ let fileURL = documentsURL.appendingPathComponent(path)
+ let content = try String(contentsOf: fileURL)
+ return ToolResult(text: content)
+ }
+ )
+ }
+
+ static func writeFile() -> AgentTool {
+ tool(
+ name: "write_file",
+ description: "Write a file to the user's documents",
+ parameters: [
+ "path": .string("File path relative to Documents/"),
+ "content": .string("File content")
+ ],
+ execute: { params in
+ let path = params["path"] as! String
+ let content = params["content"] as! String
+ let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
+ let fileURL = documentsURL.appendingPathComponent(path)
+
+ // Create parent directories if needed
+ try FileManager.default.createDirectory(
+ at: fileURL.deletingLastPathComponent(),
+ withIntermediateDirectories: true
+ )
+
+ try content.write(to: fileURL, atomically: true, encoding: .utf8)
+ return ToolResult(text: "Wrote \(path)")
+ }
+ )
+ }
+
+ static func listFiles() -> AgentTool {
+ tool(
+ name: "list_files",
+ description: "List files in a directory",
+ parameters: ["path": .string("Directory path relative to Documents/")],
+ execute: { params in
+ let path = params["path"] as! String
+ let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
+ let dirURL = documentsURL.appendingPathComponent(path)
+ let contents = try FileManager.default.contentsOfDirectory(atPath: dirURL.path)
+ return ToolResult(text: contents.joined(separator: "\n"))
+ }
+ )
+ }
+
+ static func searchText() -> AgentTool {
+ tool(
+ name: "search_text",
+ description: "Search for text across files",
+ parameters: [
+ "query": .string("Text to search for"),
+ "path": .string("Directory to search in").optional()
+ ],
+ execute: { params in
+ // Implement text search across documents
+ // Return matching files and snippets
+ }
+ )
+ }
+}
+```
+
+### TypeScript/Node.js Implementation
+
+```typescript
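+import { promises as fs } from 'fs';
+import { dirname } from 'path';
+import { z } from 'zod';
+// `tool` is assumed to come from your agent SDK
+
+const fileTools = [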
+ tool(
+ "read_file",
+ "Read a file from the workspace",
+ { path: z.string().describe("File path") },
+ async ({ path }) => {
+ const content = await fs.readFile(path, 'utf-8');
+ return { text: content };
+ }
+ ),
+
+ tool(
+ "write_file",
+ "Write a file to the workspace",
+ {
+ path: z.string().describe("File path"),
+ content: z.string().describe("File content")
+ },
+ async ({ path, content }) => {
+ await fs.mkdir(dirname(path), { recursive: true });
+ await fs.writeFile(path, content, 'utf-8');
+ return { text: `Wrote ${path}` };
+ }
+ ),
+
+ tool(
+ "list_files",
+ "List files in a directory",
+ { path: z.string().describe("Directory path") },
+ async ({ path }) => {
+ const files = await fs.readdir(path);
+ return { text: files.join('\n') };
+ }
+ ),
+
+ tool(
+ "append_file",
+ "Append content to a file",
+ {
+ path: z.string().describe("File path"),
+ content: z.string().describe("Content to append")
+ },
+ async ({ path, content }) => {
+ await fs.appendFile(path, content, 'utf-8');
+ return { text: `Appended to ${path}` };
+ }
+ ),
+];
+```
+
+
+
+## UI Integration with Shared Workspace
+
+The UI should observe the same files the agent writes to:
+
+### Pattern 1: File-Based Reactivity (iOS)
+
+```swift
+class ResearchViewModel: ObservableObject {
+ @Published var researchFiles: [ResearchFile] = []
+
+ private var watcher: DirectoryWatcher?
+
+ func startWatching(bookId: String) {
+ let researchPath = documentsURL
+ .appendingPathComponent("Research")
+ .appendingPathComponent(bookId)
+
+ watcher = DirectoryWatcher(url: researchPath) { [weak self] in
+ // Reload when agent writes new files
+ self?.loadResearchFiles(from: researchPath)
+ }
+
+ loadResearchFiles(from: researchPath)
+ }
+}
+
+// SwiftUI re-renders whenever the view model republishes researchFiles
+struct ResearchView: View {
+ @StateObject var viewModel = ResearchViewModel()
+
+ var body: some View {
+ List(viewModel.researchFiles) { file in
+ ResearchFileRow(file: file)
+ }
+ }
+}
+```
+
+### Pattern 2: Shared Data Store
+
+When file-watching isn't practical, use a shared data store:
+
+```swift
+// Shared service that both UI and agent tools use
+class BookLibraryService: ObservableObject {
+ static let shared = BookLibraryService()
+
+ @Published var books: [Book] = []
+ @Published var analysisRecords: [AnalysisRecord] = []
+
+ func addAnalysisRecord(_ record: AnalysisRecord) {
+ analysisRecords.append(record)
+ // Persists to shared storage
+ saveToStorage()
+ }
+}
+
+// Agent tool writes through the same service (pseudocode: parameter parsing omitted)
+tool("publish_to_feed") { bookId, content, headline in
+    let record = AnalysisRecord(bookId: bookId, content: content, headline: headline)
+    BookLibraryService.shared.addAnalysisRecord(record)
+    return ToolResult(text: "Published to feed")
+}
+
+// UI observes the same service
+struct FeedView: View {
+ @ObservedObject var library = BookLibraryService.shared  // The singleton owns the object, so observe it rather than own it
+
+ var body: some View {
+ List(library.analysisRecords) { record in
+ FeedItemRow(record: record)
+ }
+ }
+}
+```
+
+### Pattern 3: Hybrid (Files + Index)
+
+Use files for content, database for indexing:
+
+```
+Documents/
+├── Research/
+│   └── book_123/
+│       └── introduction.md   # Actual content (file)
+
+Database:
+├── research_index
+│   └── { bookId: "book_123", path: "Research/book_123/introduction.md", ... }
+```
+
+```swift
+// Agent writes file
+await writeFile("Research/\(bookId)/introduction.md", content)
+
+// And updates index
+await database.insert("research_index", {
+ bookId: bookId,
+ path: "Research/\(bookId)/introduction.md",
+ title: extractTitle(content),
+ createdAt: Date()
+})
+
+// UI queries index, then reads files
+let items = database.query("research_index", where: bookId == "book_123")
+for item in items {
+ let content = readFile(item.path)
+ // Display...
+}
+```
+
+
+
+## Agent-User Collaboration Patterns
+
+### Pattern: Agent Drafts, User Refines
+
+```
+1. Agent generates introduction.md
+2. User opens in Files app or in-app editor
+3. User makes refinements
+4. Agent can see changes via read_file
+5. Future agent work builds on user refinements
+```
+
+The agent's system prompt should acknowledge this:
+
+```markdown
+## Working with User Content
+
+When you create content (introductions, research notes, etc.), the user may
+edit it afterward. Always read existing files before modifying them—the user
+may have made improvements you should preserve.
+
+If a file exists and has been modified by the user (check the metadata or
+compare to your last known version), ask before overwriting.
+```
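+
+The "compare to your last known version" check can also be enforced in the tool layer rather than left to the prompt alone. Below is a minimal sketch under stated assumptions: `AgentWriteLedger` and `OverwriteDecision` are hypothetical names, the ledger lives in memory only, and files the agent never wrote are treated as user-owned.
+
+```swift
+import Foundation
+
+enum OverwriteDecision: Equatable {
+    case writeFreely   // File is missing, or unchanged since the agent last wrote it
+    case askUserFirst  // Content differs from the agent's last known version
+}
+
+/// Remembers the last content the agent wrote to each path (hypothetical in-memory store).
+final class AgentWriteLedger {
+    private var lastWritten: [String: String] = [:]
+
+    func recordWrite(path: String, content: String) {
+        lastWritten[path] = content
+    }
+
+    /// `currentContent` is what is on disk right now, or nil if the file does not exist.
+    func decision(forPath path: String, currentContent: String?) -> OverwriteDecision {
+        guard let current = currentContent else { return .writeFreely }
+        // A file the agent never wrote is assumed to belong to the user.
+        guard let known = lastWritten[path] else { return .askUserFirst }
+        return current == known ? .writeFreely : .askUserFirst
+    }
+}
+```
+
+A `write_file` tool could call `decision(forPath:currentContent:)` before writing and return a message asking for confirmation instead of overwriting. A production version would likely persist the ledger and store hashes rather than full content.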
+
+### Pattern: User Seeds, Agent Expands
+
+```
+1. User creates notes.md with initial thoughts
+2. User asks: "Research more about this"
+3. Agent reads notes.md to understand context
+4. Agent adds to notes.md or creates related files
+5. User continues building on agent additions
+```
+
+### Pattern: Append-Only Collaboration
+
+For chat logs or activity streams:
+
+```markdown
+
+
+## 2024-01-15
+
+**User:** Started reading "Moby Dick"
+
+**Agent:** Downloaded full text and created research folder
+
+**User:** Added highlight about whale symbolism
+
+**Agent:** Found 3 academic sources on whale symbolism in Melville's work
+```
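+
+A log like this can be produced by a helper that only ever adds text to the end. The sketch below assumes the format shown above (a `## <day>` heading followed by bold author labels); `Author` and `appendEntry` are hypothetical names.
+
+```swift
+import Foundation
+
+enum Author: String {
+    case user = "User"
+    case agent = "Agent"
+}
+
+/// Returns `log` with one entry appended. Existing text is never rewritten.
+func appendEntry(to log: String, author: Author, message: String, day: String) -> String {
+    var result = log
+    let heading = "## \(day)"
+    // Add a dated heading only if this day does not have one yet.
+    if !log.components(separatedBy: "\n").contains(heading) {
+        if !result.isEmpty { result += "\n" }
+        result += "\(heading)\n"
+    }
+    result += "\n**\(author.rawValue):** \(message)\n"
+    return result
+}
+```
+
+Because the function never edits earlier lines, two writers can at worst interleave entries; neither can destroy the other's history.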
+
+
+
+## Security in Shared Workspace
+
+### Scope the Workspace
+
+Don't give agents access to the entire filesystem:
+
+```swift
+// GOOD: Scoped to app's documents
+let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
+
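+tool("read_file", { path }) {
+    // Normalize before checking: appendingPathComponent keeps ".." segments,
+    // so an unnormalized prefix check would let "../../etc/passwd" through
+    let root = documentsURL.standardizedFileURL
+    let fileURL = root.appendingPathComponent(path).standardizedFileURL
+    guard fileURL.path.hasPrefix(root.path + "/") else {
+        throw ToolError("Invalid path")
+    }
+    return try String(contentsOf: fileURL, encoding: .utf8)
+}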
+
+// BAD: Absolute paths allow escape
+tool("read_file", { path }) {
+ return try String(contentsOf: URL(fileURLWithPath: path)) // Can read /etc/passwd!
+}
+```
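+
+The scoping check is easy to get subtly wrong, so it can help to isolate it in one function that every file tool calls. This is a sketch using only Foundation; `WorkspaceError` and `resolve(_:inside:)` are hypothetical names, and it assumes symlinks inside the workspace are trusted (call `resolvingSymlinksInPath()` as well if they are not).
+
+```swift
+import Foundation
+
+struct WorkspaceError: Error {
+    let message: String
+}
+
+/// Resolves a tool-supplied relative path inside `root`,
+/// throwing if the normalized result falls outside the workspace.
+func resolve(_ relativePath: String, inside root: URL) throws -> URL {
+    let base = root.standardizedFileURL
+    // Standardizing collapses "." and ".." so the comparison sees the real target.
+    let candidate = base.appendingPathComponent(relativePath).standardizedFileURL
+
+    // Compare against "<root>/" so a sibling such as "workspace-evil"
+    // is not mistaken for a child of "workspace".
+    let basePath = base.path
+    let prefix = basePath.hasSuffix("/") ? basePath : basePath + "/"
+    guard candidate.path == basePath || candidate.path.hasPrefix(prefix) else {
+        throw WorkspaceError(message: "Path escapes the workspace: \(relativePath)")
+    }
+    return candidate
+}
+```
+
+With a root of `/srv/workspace`, `resolve("Research/notes.md", inside: root)` succeeds, while `"../../etc/passwd"` and `"../workspace-evil/x.md"` both throw.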
+
+### Protect Sensitive Files
+
+```swift
+let protectedPaths = [".env", "credentials.json", "secrets/"]
+
+tool("read_file", { path }) {
+ if protectedPaths.contains(where: { path.contains($0) }) {
+ throw ToolError("Cannot access protected file")
+ }
+ // ...
+}
+```
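+
+Substring matching is simple but blunt: it also blocks harmless files whose names merely contain a protected string. A stricter variant, sketched here as an assumption rather than the approach the app uses, matches whole path components instead. `protectedNames` and `isProtected` are hypothetical names.
+
+```swift
+import Foundation
+
+let protectedNames: Set<String> = [".env", "credentials.json", "secrets"]
+
+/// True when any component of the relative path is a protected file or directory name.
+func isProtected(_ relativePath: String) -> Bool {
+    relativePath
+        .split(separator: "/")
+        .contains { protectedNames.contains(String($0)) }
+}
+```
+
+Here `secrets/api.key` and `config/.env` are rejected, while `Research/secrets-of-the-sea.md` remains readable.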
+
+### Audit Agent Actions
+
+Log what the agent reads/writes:
+
+```swift
+func logFileAccess(action: String, path: String, agentId: String) {
+ logger.info("[\(agentId)] \(action): \(path)")
+}
+
+tool("write_file", { path, content }) {
+ logFileAccess(action: "WRITE", path: path, agentId: context.agentId)
+ // ...
+}
+```
+
+
+
+## Real-World Example: Every Reader
+
+The Every Reader app uses shared workspace for research:
+
+```
+Documents/
+├── Research/
+│   └── book_moby_dick/
+│       ├── full_text.txt            # Agent downloads from Gutenberg
+│       ├── introduction.md          # Agent generates, personalized
+│       ├── sources/
+│       │   ├── whale_symbolism.md   # Agent researches
+│       │   └── melville_bio.md      # Agent researches
+│       └── user_notes.md            # User can add their own notes
+├── Chats/
+│   └── 2024-01-15.json              # Chat history
+└── profile.md                       # Agent generated from photos
+```
+
+**How it works:**
+
+1. User adds "Moby Dick" to library
+2. User starts research agent
+3. Agent downloads full text to `Research/book_moby_dick/full_text.txt`
+4. Agent researches and writes to `sources/`
+5. Agent generates `introduction.md` based on user's reading profile
+6. User can view all files in the app or Files.app
+7. User can edit `introduction.md` to refine it
+8. Chat agent can read all of this context when answering questions
+
+
+
+## iCloud File Storage for Multi-Device Sync (iOS)
+
+For agent-native iOS apps, use iCloud Drive's Documents folder for your shared workspace. This gives you **free, automatic multi-device sync** without building a sync layer or running a server.
+
+### Why iCloud Documents?
+
+| Approach | Cost | Complexity | Offline | Multi-Device |
+|----------|------|------------|---------|--------------|
+| Custom backend + sync | $$$ | High | Manual | Yes |
+| CloudKit database | Free tier limits | Medium | Manual | Yes |
+| **iCloud Documents** | Free (user's storage) | Low | Automatic | Automatic |
+
+iCloud Documents:
+- Uses the user's existing iCloud storage (5 GB on the free tier, more on paid plans)
+- Automatic sync across all user's devices
+- Works offline, syncs when online
+- Files visible in Files.app for transparency
+- No server costs, no sync code to maintain
+
+### Implementation Pattern
+
+```swift
+// Get the iCloud Documents container (call this off the main thread; the lookup can block)
+func iCloudDocumentsURL() -> URL? {
+ FileManager.default.url(forUbiquityContainerIdentifier: nil)?
+ .appendingPathComponent("Documents")
+}
+
+// Your shared workspace lives in iCloud
+class SharedWorkspace {
+ let rootURL: URL
+
+ init() {
+ // Use iCloud if available, fall back to local
+ if let iCloudURL = iCloudDocumentsURL() {
+ self.rootURL = iCloudURL
+ } else {
+ // Fallback to local Documents (user not signed into iCloud)
+ self.rootURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
+ }
+ }
+
+ // All file operations go through this root
+ func researchPath(for bookId: String) -> URL {
+ rootURL.appendingPathComponent("Research/\(bookId)")
+ }
+
+ func journalPath() -> URL {
+ rootURL.appendingPathComponent("Journal")
+ }
+}
+```
+
+### Directory Structure in iCloud
+
+```
+iCloud Drive/
+└── YourApp/                          # Your app's container
+    └── Documents/                    # Visible in Files.app
+        ├── Journal/
+        │   ├── user/
+        │   │   └── 2025-01-15.md     # Syncs across devices
+        │   └── agent/
+        │       └── 2025-01-15.md     # Agent observations sync too
+        ├── Experiments/
+        │   └── magnesium-sleep/
+        │       ├── config.json
+        │       └── log.json
+        └── Research/
+            └── {topic}/
+                └── sources.md
+```
+
+### Handling Sync Conflicts
+
+iCloud syncs files automatically, but your code still has to handle files that have not downloaded yet and coordinate its writes with the sync process:
+
+```swift
+// Check for not-yet-downloaded placeholders when reading
+func readJournalEntry(at url: URL) throws -> JournalEntry {
+ // iCloud may create .icloud placeholder files for not-yet-downloaded content
+ if url.pathExtension == "icloud" {
+ // Trigger download
+ try FileManager.default.startDownloadingUbiquitousItem(at: url)
+ throw FileNotYetAvailableError()
+ }
+
+ let data = try Data(contentsOf: url)
+ return try JSONDecoder().decode(JournalEntry.self, from: data)
+}
+
+// For writes, use coordinated file access
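+func writeJournalEntry(_ entry: JournalEntry, to url: URL) throws {
+    let data = try JSONEncoder().encode(entry)
+    let coordinator = NSFileCoordinator()
+    var coordinationError: NSError?
+    var writeError: Error?
+
+    coordinator.coordinate(writingItemAt: url, options: .forReplacing, error: &coordinationError) { newURL in
+        do {
+            try data.write(to: newURL, options: .atomic)
+        } catch {
+            writeError = error  // Surface write failures to the caller
+        }
+    }
+
+    if let error = coordinationError {
+        throw error
+    }
+    if let error = writeError {
+        throw error
+    }
+}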
+```
+
+### What This Enables
+
+1. **User starts experiment on iPhone** → Agent creates `Experiments/sleep-tracking/config.json`
+2. **User opens app on iPad** → Same experiment visible, no sync code needed
+3. **Agent logs observation on iPhone** → Syncs to iPad automatically
+4. **User edits journal on iPad** → iPhone sees the edit
+
+### Entitlements Required
+
+Add to your app's entitlements:
+
+```xml
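+<key>com.apple.developer.icloud-container-identifiers</key>
+<array>
+    <string>iCloud.com.yourcompany.yourapp</string>
+</array>
+<key>com.apple.developer.icloud-services</key>
+<array>
+    <string>CloudDocuments</string>
+</array>
+<key>com.apple.developer.ubiquity-container-identifiers</key>
+<array>
+    <string>iCloud.com.yourcompany.yourapp</string>
+</array>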
+```
+
+### When NOT to Use iCloud Documents
+
+- **Sensitive data** - Use Keychain or encrypted local storage instead
+- **High-frequency writes** - iCloud sync has latency; use local + periodic sync
+- **Large media files** - Consider CloudKit Assets or on-demand resources
+- **Shared between users** - iCloud Documents is single-user; use CloudKit for sharing
+
+
+
+## Shared Workspace Checklist
+
+Architecture:
+- [ ] Single shared directory for agent and user data
+- [ ] Organized by domain, not by actor
+- [ ] File tools scoped to workspace (no escape)
+- [ ] Protected paths for sensitive files
+
+Tools:
+- [ ] `read_file` - Read any file in workspace
+- [ ] `write_file` - Write any file in workspace
+- [ ] `list_files` - Browse directory structure
+- [ ] `search_text` - Find content across files (optional)
+
+UI Integration:
+- [ ] UI observes same files agent writes
+- [ ] Changes reflect immediately (file watching or shared store)
+- [ ] User can edit agent-created files
+- [ ] Agent reads user modifications before overwriting
+
+Collaboration:
+- [ ] System prompt acknowledges user may edit files
+- [ ] Agent checks for user modifications before overwriting
+- [ ] Metadata tracks who created/modified (optional)
+
+Multi-Device (iOS):
+- [ ] Use iCloud Documents for shared workspace (free sync)
+- [ ] Fallback to local Documents if iCloud unavailable
+- [ ] Handle `.icloud` placeholder files (trigger download)
+- [ ] Use NSFileCoordinator for conflict-safe writes
+