[2.18.0] Add Dynamic Capability Discovery and iCloud sync patterns (#62)

* [2.17.0] Expand agent-native skill with mobile app learnings

  Major expansion of agent-native-architecture skill based on real-world learnings from building the Every Reader iOS app.

  New reference documents:
  - dynamic-context-injection.md: Runtime app state in system prompts
  - action-parity-discipline.md: Ensuring agents can do what users can
  - shared-workspace-architecture.md: Agents and users in same data space
  - agent-native-testing.md: Testing patterns for agent-native apps
  - mobile-patterns.md: Background execution, permissions, cost awareness

  Updated references:
  - architecture-patterns.md: Added Unified Agent Architecture, Agent-to-UI Communication, and Model Tier Selection patterns

  Enhanced agent-native-reviewer with comprehensive review process covering all new patterns, including mobile-specific verification.

  Key insight: "The agent should be able to do anything the user can do, through tools that mirror UI capabilities, with full context about the app state."

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* [2.18.0] Add Dynamic Capability Discovery and iCloud sync patterns

  New patterns in agent-native-architecture skill:
  - **Dynamic Capability Discovery** - For agent-native apps integrating with external APIs (HealthKit, HomeKit, GraphQL), use a discovery tool (list_*) plus a generic access tool instead of individual tools per endpoint. (Note: Static mapping is fine for constrained agents with limited scope.)
  - **CRUD Completeness** - Every entity needs create, read, update, AND delete.
  - **iCloud File Storage** - Use iCloud Documents for shared workspace to get free, automatic multi-device sync without building a sync layer.
  - **Architecture Review Checklist** - Pushes reviewer findings earlier into design phase. Covers tool design, action parity, UI integration, context.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-25 13:03:07 -05:00
parent 5a79f97374
commit 1bc6bd9164
12 changed files with 3523 additions and 60 deletions
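
The Dynamic Capability Discovery pattern named in the commit message (a `list_*` discovery tool plus one generic access tool) might look roughly like the sketch below. Everything in it is a hypothetical stand-in: `externalApi`, `listDataTypes`, `readData`, and the sample values are illustrative, not the skill's actual tools or any real HealthKit API.

```typescript
// Hypothetical stand-in for an external API surface such as HealthKit.
// Type names and sample values are illustrative assumptions.
type HealthSample = { value: number; unit: string };

const externalApi = new Map<string, HealthSample[]>([
  ["stepCount", [{ value: 8421, unit: "count" }]],
  ["heartRate", [{ value: 62, unit: "bpm" }]],
]);

// Discovery tool: reports whatever the API currently exposes.
function listDataTypes(): string[] {
  return Array.from(externalApi.keys()).sort();
}

// Generic access tool: accepts a plain string and lets the API layer
// reject unknown types, instead of hard-coding an enum of allowed values.
function readData(dataType: string): HealthSample[] {
  const samples = externalApi.get(dataType);
  if (samples === undefined) {
    throw new Error(
      `Unknown data type "${dataType}". Available: ${listDataTypes().join(", ")}`
    );
  }
  return samples;
}

// A type added to the API later is reachable without defining a new tool.
externalApi.set("bloodGlucose", [{ value: 5.4, unit: "mmol/L" }]);
```

With this shape, the agent first calls the discovery function to learn what exists, then passes any returned name to the generic reader. The error message doubles as a hint listing the valid names.
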
--- a/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md
+++ b/plugins/compound-engineering/skills/agent-native-architecture/SKILL.md
@@ -65,6 +65,12 @@ What aspect of agent native architecture do you need help with?
 3. **Write system prompts** - Define agent behavior in prompts
 4. **Self-modification** - Enable agents to safely evolve themselves
 5. **Review/refactor** - Make existing code more prompt-native
+6. **Context injection** - Inject runtime app state into agent prompts
+7. **Action parity** - Ensure agents can do everything users can do
+8. **Shared workspace** - Set up agents and users in the same data space
+9. **Testing** - Test agent-native apps for capability and parity
+10. **Mobile patterns** - Handle background execution, permissions, cost
+11. **API integration** - Connect to external APIs (HealthKit, HomeKit, GraphQL)

 **Wait for response before proceeding.**
 </intake>
@@ -72,15 +78,55 @@ What aspect of agent native architecture do you need help with?
 <routing>
 | Response | Action |
 |----------|--------|
-| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md) |
+| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md), then apply Architecture Checklist below |
 | 2, "tool", "mcp", "primitive" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) |
 | 3, "prompt", "system prompt", "behavior" | Read [system-prompt-design.md](./references/system-prompt-design.md) |
 | 4, "self-modify", "evolve", "git" | Read [self-modification.md](./references/self-modification.md) |
 | 5, "review", "refactor", "existing" | Read [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md) |
+| 6, "context", "inject", "runtime", "dynamic" | Read [dynamic-context-injection.md](./references/dynamic-context-injection.md) |
+| 7, "parity", "ui action", "capability map" | Read [action-parity-discipline.md](./references/action-parity-discipline.md) |
+| 8, "workspace", "shared", "files", "filesystem" | Read [shared-workspace-architecture.md](./references/shared-workspace-architecture.md) |
+| 9, "test", "testing", "verify", "validate" | Read [agent-native-testing.md](./references/agent-native-testing.md) |
+| 10, "mobile", "ios", "android", "background" | Read [mobile-patterns.md](./references/mobile-patterns.md) |
+| 11, "api", "healthkit", "homekit", "graphql", "external" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) (Dynamic Capability Discovery section) |

 **After reading the reference, apply those patterns to the user's specific context.**
 </routing>

+<architecture_checklist>
+## Architecture Review Checklist (Apply During Design)
+
+When designing an agent-native system, verify these **before implementation**:
+
+### Tool Design
+- [ ] **Dynamic vs Static:** For external APIs where the agent should have full user-level access (HealthKit, HomeKit, GraphQL), use Dynamic Capability Discovery. Use static mapping only when intentionally limiting agent scope.
+- [ ] **CRUD Completeness:** Every entity has create, read, update, AND delete tools
+- [ ] **Primitives not Workflows:** Tools enable capability, they don't encode business logic
+- [ ] **API as Validator:** Use `z.string()` inputs when the API validates, not `z.enum()`
+
+### Action Parity
+- [ ] **Capability Map:** Every UI action has a corresponding agent tool
+- [ ] **Edit/Delete:** If the UI can edit or delete, the agent must be able to as well
+- [ ] **The Write Test:** "Write something to [app location]" must work for all locations
+
+### UI Integration
+- [ ] **Agent → UI:** Define how agent changes reflect in UI (shared service, file watching, or event bus)
+- [ ] **No Silent Actions:** Agent writes should trigger UI updates immediately
+- [ ] **Capability Discovery:** Users can learn what agent can do (onboarding, hints)
+
+### Context Injection
+- [ ] **Available Resources:** System prompt includes what exists (files, data, types)
+- [ ] **Available Capabilities:** System prompt documents what the agent can do, using the user's vocabulary
+- [ ] **Dynamic Context:** Context refreshes for long sessions (or provide `refresh_context` tool)
+
+### Mobile (if applicable)
+- [ ] **Background Execution:** Checkpoint/resume pattern for iOS app suspension
+- [ ] **Permissions:** Just-in-time permission requests in tools
+- [ ] **Cost Awareness:** Model tier selection (Haiku/Sonnet/Opus)
+
+**When designing architecture, explicitly address each checkbox in your plan.**
+</architecture_checklist>
+
 <quick_start>
 Build a prompt-native agent in three steps:

@@ -123,11 +169,19 @@ query({

 All references in `references/`:

+**Core Patterns:**
 - **Architecture:** [architecture-patterns.md](./references/architecture-patterns.md)
-- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md)
+- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md) - includes Dynamic Capability Discovery, CRUD Completeness
 - **Prompts:** [system-prompt-design.md](./references/system-prompt-design.md)
 - **Self-Modification:** [self-modification.md](./references/self-modification.md)
 - **Refactoring:** [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md)
+
+**Agent-Native Disciplines:**
+- **Context Injection:** [dynamic-context-injection.md](./references/dynamic-context-injection.md)
+- **Action Parity:** [action-parity-discipline.md](./references/action-parity-discipline.md)
+- **Shared Workspace:** [shared-workspace-architecture.md](./references/shared-workspace-architecture.md)
+- **Testing:** [agent-native-testing.md](./references/agent-native-testing.md)
+- **Mobile Patterns:** [mobile-patterns.md](./references/mobile-patterns.md)
 </reference_index>

 <anti_patterns>
@@ -186,11 +240,80 @@ each under 20 words, formatted with em-dashes...
 // Right - define outcome, trust intelligence
 Create clear, useful summaries. Use your judgment.
 ```
+
+### Agent-Native Anti-Patterns
+
+**Context Starvation**
+Agent doesn't know what resources exist in the app.
+```
+User: "Write something about Catherine the Great in my feed"
+Agent: "What feed? I don't understand what system you're referring to."
+```
+Fix: Inject available resources, capabilities, and vocabulary into the system prompt at runtime.
+
+**Orphan Features**
+UI action with no agent equivalent.
+```swift
+// UI has a "Publish to Feed" button
+Button("Publish") { publishToFeed(insight) }
+// But no agent tool exists to do the same thing
+```
+Fix: Add corresponding tool and document in system prompt for every UI action.
+
+**Sandbox Isolation**
+Agent works in separate data space from user.
+```
+Documents/
+├── user_files/        ← User's space
+└── agent_output/      ← Agent's space (isolated)
+```
+Fix: Use shared workspace where both agent and user operate on the same files.
+
+**Silent Actions**
+Agent changes state but UI doesn't update.
+```typescript
+// Agent writes to database
+await db.insert("feed", content);
+// But UI doesn't observe this table - user sees nothing
+```
+Fix: Use shared data stores with reactive binding, or file system observation.
+
+**Capability Hiding**
+Users can't discover what agents can do.
+```
+User: "Help me with my reading"
+Agent: "What would you like help with?"
+// Agent doesn't mention it can publish to feed, research books, etc.
+```
+Fix: Include capability hints in agent responses or provide onboarding.
+
+**Static Tool Mapping (for agent-native apps)**
+Building individual tools for each API endpoint when you want the agent to have full access.
+```typescript
+// You built 50 tools for 50 HealthKit types
+tool("read_steps", ...)
+tool("read_heart_rate", ...)
+tool("read_sleep", ...)
+// When glucose tracking is added... code change required
+// Agent can only access what you anticipated
+```
+Fix: Use Dynamic Capability Discovery - one `list_*` tool to discover what's available, one generic tool to access any type. See [mcp-tool-design.md](./references/mcp-tool-design.md). (Note: Static mapping is fine for constrained agents with intentionally limited scope.)
+
+**Incomplete CRUD**
+Agent can create but not update or delete.
+```typescript
+// ❌ User: "Delete that journal entry"
+// Agent: "I don't have a tool for that"
+tool("create_journal_entry", ...)
+// Missing: update_journal_entry, delete_journal_entry
+```
+Fix: Every entity needs full CRUD (Create, Read, Update, Delete). The CRUD Audit: for each entity, verify all four operations exist.
 </anti_patterns>

 <success_criteria>
 You've built a prompt-native agent when:

+**Core Prompt-Native Criteria:**
 - [ ] The agent figures out HOW to achieve outcomes, not just calls your functions
 - [ ] Whatever a user could do, the agent can do (no artificial limits)
 - [ ] Features are prompts that define outcomes, not code that defines workflows
@@ -198,4 +321,25 @@ You've built a prompt-native agent when:
 - [ ] Changing behavior means editing prose, not refactoring code
 - [ ] The agent can surprise you with clever approaches you didn't anticipate
 - [ ] You could add a new feature by writing a new prompt section, not new code
+
+**Tool Design Criteria:**
+- [ ] External APIs (where agent should have full access) use Dynamic Capability Discovery
+- [ ] Every entity has full CRUD (Create, Read, Update, Delete)
+- [ ] API validates inputs, not your enum definitions
+- [ ] Discovery tools exist for each API surface (`list_*`, `discover_*`)
+
+**Agent-Native Criteria:**
+- [ ] System prompt includes dynamic context about app state (available resources, recent activity)
+- [ ] Every UI action has a corresponding agent tool (action parity)
+- [ ] Agent tools are documented in the system prompt with user vocabulary
+- [ ] Agent and user work in the same data space (shared workspace)
+- [ ] Agent actions are immediately reflected in the UI (shared service, file watching, or event bus)
+- [ ] The "write something to [app location]" test passes for all locations
+- [ ] Users can discover what the agent can do (capability hints, onboarding)
+- [ ] Context refreshes for long sessions (or `refresh_context` tool exists)
+
+**Mobile-Specific Criteria (if applicable):**
+- [ ] Background execution handling implemented (checkpoint/resume)
+- [ ] Permission requests handled gracefully in tools
+- [ ] Cost-aware design (appropriate model tiers, batching)
 </success_criteria>
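
The CRUD Audit described in the anti-patterns section ("for each entity, verify all four operations exist") could be automated along these lines. This is a minimal sketch that assumes tools follow a hypothetical `<operation>_<entity>` naming convention; `auditCrud` and the sample tool names are illustrative and not part of the skill.

```typescript
// Assumed naming convention (hypothetical): tools are named `<operation>_<entity>`.
const CRUD_OPERATIONS = ["create", "read", "update", "delete"] as const;

// Returns, for each entity, the CRUD tools missing from the registered tool names.
// Entities with full coverage are omitted from the result.
function auditCrud(entities: string[], toolNames: string[]): Map<string, string[]> {
  const registered = new Set(toolNames);
  const gaps = new Map<string, string[]>();
  for (const entity of entities) {
    const missing = CRUD_OPERATIONS
      .map((op) => `${op}_${entity}`)
      .filter((tool) => !registered.has(tool));
    if (missing.length > 0) {
      gaps.set(entity, missing);
    }
  }
  return gaps;
}

// Example: only create and read exist for journal entries.
const crudGaps = auditCrud(
  ["journal_entry"],
  ["create_journal_entry", "read_journal_entry"]
);
// crudGaps maps "journal_entry" to ["update_journal_entry", "delete_journal_entry"]
```

A check like this could run in a test suite so that adding a new entity without its update or delete tool fails early, rather than surfacing as "I don't have a tool for that" in front of a user.
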