[2.18.0] Add Dynamic Capability Discovery and iCloud sync patterns (#62)

* [2.17.0] Expand agent-native skill with mobile app learnings

Major expansion of agent-native-architecture skill based on real-world
learnings from building the Every Reader iOS app.

New reference documents:
- dynamic-context-injection.md: Runtime app state in system prompts
- action-parity-discipline.md: Ensuring agents can do what users can
- shared-workspace-architecture.md: Agents and users in same data space
- agent-native-testing.md: Testing patterns for agent-native apps
- mobile-patterns.md: Background execution, permissions, cost awareness

Updated references:
- architecture-patterns.md: Added Unified Agent Architecture, Agent-to-UI
  Communication, and Model Tier Selection patterns

Enhanced agent-native-reviewer with comprehensive review process covering
all new patterns, including mobile-specific verification.

Key insight: "The agent should be able to do anything the user can do,
through tools that mirror UI capabilities, with full context about the
app state."

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* [2.18.0] Add Dynamic Capability Discovery and iCloud sync patterns

New patterns in agent-native-architecture skill:

- **Dynamic Capability Discovery** - For agent-native apps integrating with
  external APIs (HealthKit, HomeKit, GraphQL), use a discovery tool (list_*)
  plus a generic access tool instead of individual tools per endpoint.
  (Note: Static mapping is fine for constrained agents with limited scope.)

- **CRUD Completeness** - Every entity needs create, read, update, AND delete.

- **iCloud File Storage** - Use iCloud Documents for shared workspace to get
  free, automatic multi-device sync without building a sync layer.

- **Architecture Review Checklist** - Pushes reviewer findings earlier into
  design phase. Covers tool design, action parity, UI integration, context.


---------

Co-authored-by: Claude <noreply@anthropic.com>
Dan Shipper
2025-12-25 13:03:07 -05:00
committed by GitHub
parent 5a79f97374
commit 1bc6bd9164
12 changed files with 3523 additions and 60 deletions


@@ -65,6 +65,12 @@ What aspect of agent native architecture do you need help with?
3. **Write system prompts** - Define agent behavior in prompts
4. **Self-modification** - Enable agents to safely evolve themselves
5. **Review/refactor** - Make existing code more prompt-native
6. **Context injection** - Inject runtime app state into agent prompts
7. **Action parity** - Ensure agents can do everything users can do
8. **Shared workspace** - Set up agents and users in the same data space
9. **Testing** - Test agent-native apps for capability and parity
10. **Mobile patterns** - Handle background execution, permissions, cost
11. **API integration** - Connect to external APIs (HealthKit, HomeKit, GraphQL)
**Wait for response before proceeding.**
</intake>
@@ -72,15 +78,55 @@ What aspect of agent native architecture do you need help with?
<routing>
| Response | Action |
|----------|--------|
| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md) |
| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md), then apply Architecture Checklist below |
| 2, "tool", "mcp", "primitive" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) |
| 3, "prompt", "system prompt", "behavior" | Read [system-prompt-design.md](./references/system-prompt-design.md) |
| 4, "self-modify", "evolve", "git" | Read [self-modification.md](./references/self-modification.md) |
| 5, "review", "refactor", "existing" | Read [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md) |
| 6, "context", "inject", "runtime", "dynamic" | Read [dynamic-context-injection.md](./references/dynamic-context-injection.md) |
| 7, "parity", "ui action", "capability map" | Read [action-parity-discipline.md](./references/action-parity-discipline.md) |
| 8, "workspace", "shared", "files", "filesystem" | Read [shared-workspace-architecture.md](./references/shared-workspace-architecture.md) |
| 9, "test", "testing", "verify", "validate" | Read [agent-native-testing.md](./references/agent-native-testing.md) |
| 10, "mobile", "ios", "android", "background" | Read [mobile-patterns.md](./references/mobile-patterns.md) |
| 11, "api", "healthkit", "homekit", "graphql", "external" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) (Dynamic Capability Discovery section) |
**After reading the reference, apply those patterns to the user's specific context.**
</routing>
<architecture_checklist>
## Architecture Review Checklist (Apply During Design)
When designing an agent-native system, verify these **before implementation**:
### Tool Design
- [ ] **Dynamic vs Static:** For external APIs where the agent should have full user-level access (HealthKit, HomeKit, GraphQL), use Dynamic Capability Discovery. Use static mapping only when you are intentionally limiting the agent's scope.
- [ ] **CRUD Completeness:** Every entity has create, read, update, AND delete tools
- [ ] **Primitives not Workflows:** Tools enable capability, they don't encode business logic
- [ ] **API as Validator:** Use `z.string()` inputs when the API validates, not `z.enum()`
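
The "API as Validator" item can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual code: `FakeHealthApi`, `readHealthData`, and the type names are hypothetical, and the block avoids zod so that it stands alone. The tool accepts a plain string (the `z.string()` approach) and lets the API reject unknown values.

```typescript
// Hypothetical stand-in for an external API that knows its own valid types.
class FakeHealthApi {
  private readonly data: Record<string, number[]> = {
    stepCount: [4200, 8100],
    heartRate: [62, 71],
  };

  query(type: string): number[] {
    if (!Object.prototype.hasOwnProperty.call(this.data, type)) {
      // The API, not the tool schema, decides what is valid.
      const valid = Object.keys(this.data).join(", ");
      throw new Error(`Unknown type "${type}". Valid types: ${valid}`);
    }
    return this.data[type];
  }
}

type ToolResult =
  | { ok: true; samples: number[] }
  | { ok: false; error: string };

// The tool takes any string and forwards it; no enum of allowed types.
function readHealthData(api: FakeHealthApi, type: string): ToolResult {
  try {
    return { ok: true, samples: api.query(type) };
  } catch (err) {
    // Hand the API's message back so the agent can correct itself and retry.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

Because the error text lists the valid types, an agent that guesses wrong can recover without any change to the tool definition.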
### Action Parity
- [ ] **Capability Map:** Every UI action has a corresponding agent tool
- [ ] **Edit/Delete:** If the UI can edit or delete, the agent must be able to as well
- [ ] **The Write Test:** "Write something to [app location]" must work for all locations
### UI Integration
- [ ] **Agent → UI:** Define how agent changes reflect in UI (shared service, file watching, or event bus)
- [ ] **No Silent Actions:** Agent writes should trigger UI updates immediately
- [ ] **Capability Discovery:** Users can learn what agent can do (onboarding, hints)
### Context Injection
- [ ] **Available Resources:** System prompt includes what exists (files, data, types)
- [ ] **Available Capabilities:** System prompt documents what agent can do with user vocabulary
- [ ] **Dynamic Context:** Context refreshes for long sessions (or provide `refresh_context` tool)
### Mobile (if applicable)
- [ ] **Background Execution:** Checkpoint/resume pattern for iOS app suspension
- [ ] **Permissions:** Just-in-time permission requests in tools
- [ ] **Cost Awareness:** Model tier selection (Haiku/Sonnet/Opus)
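
One possible shape for the checkpoint/resume item is sketched below. Everything here is an assumption made for illustration: the `Checkpoint` fields, the in-memory `MemoryCheckpointStore` (a real app would persist to disk), and the `shouldPause` callback that simulates the OS suspending the app.

```typescript
interface Checkpoint {
  taskId: string;
  nextStep: number;
  results: string[];
}

// Hypothetical persistence layer; serializes to JSON as a disk-backed store would.
class MemoryCheckpointStore {
  private saved: Record<string, string> = {};

  load(taskId: string): Checkpoint | undefined {
    if (!Object.prototype.hasOwnProperty.call(this.saved, taskId)) return undefined;
    return JSON.parse(this.saved[taskId]) as Checkpoint;
  }

  save(checkpoint: Checkpoint): void {
    this.saved[checkpoint.taskId] = JSON.stringify(checkpoint);
  }
}

// Runs steps starting from the last saved checkpoint, saving after each step.
function runTask(
  taskId: string,
  steps: Array<() => string>,
  store: MemoryCheckpointStore,
  shouldPause: () => boolean
): Checkpoint {
  const state: Checkpoint =
    store.load(taskId) ?? { taskId, nextStep: 0, results: [] };
  while (state.nextStep < steps.length) {
    if (shouldPause()) break; // stop cleanly; progress so far is already saved
    state.results.push(steps[state.nextStep]());
    state.nextStep += 1;
    store.save(state);
  }
  return state;
}
```

Calling `runTask` again with the same `taskId` picks up at `nextStep` instead of repeating completed work.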
**When designing architecture, explicitly address each checkbox in your plan.**
</architecture_checklist>
<quick_start>
Build a prompt-native agent in three steps:
@@ -123,11 +169,19 @@ query({
All references in `references/`:
**Core Patterns:**
- **Architecture:** [architecture-patterns.md](./references/architecture-patterns.md)
- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md)
- **Tool Design:** [mcp-tool-design.md](./references/mcp-tool-design.md) - includes Dynamic Capability Discovery, CRUD Completeness
- **Prompts:** [system-prompt-design.md](./references/system-prompt-design.md)
- **Self-Modification:** [self-modification.md](./references/self-modification.md)
- **Refactoring:** [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md)
**Agent-Native Disciplines:**
- **Context Injection:** [dynamic-context-injection.md](./references/dynamic-context-injection.md)
- **Action Parity:** [action-parity-discipline.md](./references/action-parity-discipline.md)
- **Shared Workspace:** [shared-workspace-architecture.md](./references/shared-workspace-architecture.md)
- **Testing:** [agent-native-testing.md](./references/agent-native-testing.md)
- **Mobile Patterns:** [mobile-patterns.md](./references/mobile-patterns.md)
</reference_index>
<anti_patterns>
@@ -186,11 +240,80 @@ each under 20 words, formatted with em-dashes...
// Right - define outcome, trust intelligence
Create clear, useful summaries. Use your judgment.
```
### Agent-Native Anti-Patterns
**Context Starvation**
Agent doesn't know what resources exist in the app.
```
User: "Write something about Catherine the Great in my feed"
Agent: "What feed? I don't understand what system you're referring to."
```
Fix: Inject available resources, capabilities, and vocabulary into the system prompt at runtime.
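
A minimal sketch of that fix, assuming a hypothetical `AppState` snapshot and a hypothetical `publish_to_feed` tool name (neither comes from the skill itself). The prompt is rebuilt from current app state each time a session starts:

```typescript
// Hypothetical snapshot of app state; field names are illustrative only.
interface AppState {
  feeds: string[];
  recentFiles: string[];
  capabilities: Record<string, string>; // user-facing phrase -> tool name
}

// Appends resources and capabilities to a static base prompt.
function buildSystemPrompt(basePrompt: string, state: AppState): string {
  const capabilityLines = Object.keys(state.capabilities).map(
    (phrase) => `- "${phrase}" -> ${state.capabilities[phrase]}`
  );
  return [
    basePrompt,
    "## Available Resources",
    `Feeds: ${state.feeds.join(", ") || "(none)"}`,
    `Recent files: ${state.recentFiles.join(", ") || "(none)"}`,
    "## What You Can Do",
    ...capabilityLines,
  ].join("\n");
}
```

With the feed named in the prompt, a request like "write something in my feed" maps to a concrete resource instead of prompting a clarifying question.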
**Orphan Features**
UI action with no agent equivalent.
```swift
// UI has a "Publish to Feed" button
Button("Publish") { publishToFeed(insight) }
// But no agent tool exists to do the same thing
```
Fix: For every UI action, add a corresponding agent tool and document it in the system prompt.
**Sandbox Isolation**
Agent works in separate data space from user.
```
Documents/
├── user_files/ ← User's space
└── agent_output/ ← Agent's space (isolated)
```
Fix: Use a shared workspace where both agent and user operate on the same files.
**Silent Actions**
Agent changes state but UI doesn't update.
```typescript
// Agent writes to database
await db.insert("feed", content);
// But UI doesn't observe this table - user sees nothing
```
Fix: Use shared data stores with reactive binding, or file system observation.
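
The reactive-binding option might look roughly like this. `SharedStore` is a hypothetical name, and the sketch assumes the UI and the agent's tools hold the same store instance; it is not tied to any particular database or UI framework.

```typescript
type Listener<T> = (items: readonly T[]) => void;

// A tiny observable store shared by the UI and the agent's tools.
class SharedStore<T> {
  private items: T[] = [];
  private listeners: Listener<T>[] = [];

  // Registers a listener, delivers current state, and returns an unsubscribe function.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    listener(this.items);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Every write notifies all listeners, so agent writes cannot be silent.
  insert(item: T): void {
    this.items = this.items.concat([item]);
    for (const listener of this.listeners) listener(this.items);
  }
}
```

If the agent's write tool calls `insert` on the same instance the UI subscribed to, the UI re-renders on every agent write.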
**Capability Hiding**
Users can't discover what agents can do.
```
User: "Help me with my reading"
Agent: "What would you like help with?"
// Agent doesn't mention it can publish to feed, research books, etc.
```
Fix: Include capability hints in agent responses or provide onboarding.
**Static Tool Mapping (for agent-native apps)**
Building individual tools for each API endpoint when you want the agent to have full access.
```typescript
// You built 50 tools for 50 HealthKit types
tool("read_steps", ...)
tool("read_heart_rate", ...)
tool("read_sleep", ...)
// When glucose tracking is added... code change required
// Agent can only access what you anticipated
```
Fix: Use Dynamic Capability Discovery - one `list_*` tool to discover what's available, one generic tool to access any type. See [mcp-tool-design.md](./references/mcp-tool-design.md). (Note: Static mapping is fine for constrained agents with intentionally limited scope.)
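
As a rough sketch of the two-tool shape, assuming a hypothetical `DataSource` interface in place of HealthKit and hypothetical tool names `list_available_types` and `read_data`:

```typescript
// Hypothetical data source standing in for HealthKit or a similar API.
interface DataSource {
  availableTypes(): string[];
  read(type: string): number[];
}

class InMemorySource implements DataSource {
  private readonly store: Record<string, number[]> = {
    steps: [8100],
    heartRate: [62],
  };

  // Simulates the underlying API gaining a new data type.
  register(type: string, samples: number[]): void {
    this.store[type] = samples;
  }

  availableTypes(): string[] {
    return Object.keys(this.store);
  }

  read(type: string): number[] {
    if (!Object.prototype.hasOwnProperty.call(this.store, type)) {
      throw new Error(`Unknown type: ${type}`);
    }
    return this.store[type];
  }
}

interface Tool {
  name: string;
  description: string;
  run(input: { type?: string }): unknown;
}

// Two tools cover the whole API surface: one to discover, one to access.
function makeTools(source: DataSource): Tool[] {
  return [
    {
      name: "list_available_types",
      description: "List every data type the source currently exposes.",
      run: () => source.availableTypes(),
    },
    {
      name: "read_data",
      description: "Read samples for any type returned by list_available_types.",
      run: (input) => {
        if (input.type === undefined) {
          throw new Error("read_data requires a type; call list_available_types first");
        }
        return source.read(input.type);
      },
    },
  ];
}
```

When the source gains a new type such as glucose, the tool list stays at two entries and the agent finds the new type through discovery.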
**Incomplete CRUD**
Agent can create but not update or delete.
```typescript
// ❌ User: "Delete that journal entry"
// Agent: "I don't have a tool for that"
tool("create_journal_entry", ...)
// Missing: update_journal_entry, delete_journal_entry
```
Fix: Every entity needs full CRUD (Create, Read, Update, Delete). The CRUD Audit: for each entity, verify all four operations exist.
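
The CRUD Audit could be automated along these lines. The sketch assumes a hypothetical `<operation>_<entity>` naming convention for tools; adjust the name construction if your tools are named differently.

```typescript
const OPERATIONS = ["create", "read", "update", "delete"] as const;

// Returns, per entity, the tool names that are missing from the registered set.
// Assumes the hypothetical naming convention `<operation>_<entity>`.
function auditCrud(
  entities: string[],
  toolNames: string[]
): Record<string, string[]> {
  const missing: Record<string, string[]> = {};
  for (const entity of entities) {
    const gaps = OPERATIONS
      .map((op) => `${op}_${entity}`)
      .filter((name) => toolNames.indexOf(name) === -1);
    if (gaps.length > 0) missing[entity] = gaps;
  }
  return missing;
}
```

Run as a test over the registered tool names, this turns a missing `delete_*` tool into a failing check rather than a surprise at runtime.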
</anti_patterns>
<success_criteria>
You've built a prompt-native agent when:
**Core Prompt-Native Criteria:**
- [ ] The agent figures out HOW to achieve outcomes, rather than just calling your functions
- [ ] Whatever a user could do, the agent can do (no artificial limits)
- [ ] Features are prompts that define outcomes, not code that defines workflows
@@ -198,4 +321,25 @@ You've built a prompt-native agent when:
- [ ] Changing behavior means editing prose, not refactoring code
- [ ] The agent can surprise you with clever approaches you didn't anticipate
- [ ] You could add a new feature by writing a new prompt section, not new code
**Tool Design Criteria:**
- [ ] External APIs (where agent should have full access) use Dynamic Capability Discovery
- [ ] Every entity has full CRUD (Create, Read, Update, Delete)
- [ ] The API validates inputs, rather than your enum definitions
- [ ] Discovery tools exist for each API surface (`list_*`, `discover_*`)
**Agent-Native Criteria:**
- [ ] System prompt includes dynamic context about app state (available resources, recent activity)
- [ ] Every UI action has a corresponding agent tool (action parity)
- [ ] Agent tools are documented in the system prompt with user vocabulary
- [ ] Agent and user work in the same data space (shared workspace)
- [ ] Agent actions are immediately reflected in the UI (shared service, file watching, or event bus)
- [ ] The "write something to [app location]" test passes for all locations
- [ ] Users can discover what the agent can do (capability hints, onboarding)
- [ ] Context refreshes for long sessions (or `refresh_context` tool exists)
**Mobile-Specific Criteria (if applicable):**
- [ ] Background execution handling implemented (checkpoint/resume)
- [ ] Permission requests handled gracefully in tools
- [ ] Cost-aware design (appropriate model tiers, batching)
</success_criteria>