<background_execution>
Background Execution & Resumption
Mobile apps can be suspended or terminated at any time. Agents must handle this gracefully.
The Challenge
User starts research agent
↓
Agent begins web search
↓
User switches to another app
↓
iOS suspends your app
↓
Agent is mid-execution... what happens?
Checkpoint/Resume Pattern
Save agent state before backgrounding, restore on foreground:
class AgentOrchestrator: ObservableObject {
@Published var activeSessions: [AgentSession] = []
// Called when app is about to background
func handleAppWillBackground() {
for session in activeSessions {
saveCheckpoint(session)
session.transition(to: .backgrounded)
}
}
// Called when app returns to foreground
func handleAppDidForeground() {
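for session in activeSessions {
// AgentState carries an associated Error, so it is not Equatable; match with `if case` instead of ==
if case .backgrounded = session.state, let checkpoint = loadCheckpoint(session.id) {
resumeFromCheckpoint(session, checkpoint)
}
}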
}
private func saveCheckpoint(_ session: AgentSession) {
let checkpoint = AgentCheckpoint(
sessionId: session.id,
conversationHistory: session.messages,
pendingToolCalls: session.pendingToolCalls,
partialResults: session.partialResults,
timestamp: Date()
)
storage.save(checkpoint, for: session.id)
}
private func resumeFromCheckpoint(_ session: AgentSession, _ checkpoint: AgentCheckpoint) {
session.messages = checkpoint.conversationHistory
session.pendingToolCalls = checkpoint.pendingToolCalls
session.partialResults = checkpoint.partialResults // saved in the checkpoint, so restore it too
// Resume execution if there were pending tool calls
if !checkpoint.pendingToolCalls.isEmpty {
session.transition(to: .running)
Task { await executeNextTool(session) }
}
}
}
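The `storage.save` and `loadCheckpoint` calls above are left undefined. One way to back them is a small file-based store, sketched below. The simplified `AgentCheckpoint` fields (plain strings instead of real message and tool-call types), the `CheckpointStore` name, and the file naming scheme are all assumptions made for illustration, not part of the original design.

```swift
import Foundation

// Hypothetical, simplified checkpoint: a real session would store richer
// message and tool-call types than plain strings.
struct AgentCheckpoint: Codable, Equatable {
    let sessionId: String
    let conversationHistory: [String]
    let pendingToolCalls: [String]
    let timestamp: Date
}

// Hypothetical store that keeps one JSON file per session.
struct CheckpointStore {
    let directory: URL

    private func fileURL(for sessionId: String) -> URL {
        directory.appendingPathComponent("\(sessionId).checkpoint.json")
    }

    // Atomic write: the file is replaced in one step, so a suspension
    // mid-save does not leave a half-written checkpoint behind.
    func save(_ checkpoint: AgentCheckpoint) throws {
        let data = try JSONEncoder().encode(checkpoint)
        try data.write(to: fileURL(for: checkpoint.sessionId), options: .atomic)
    }

    // Returns nil when no checkpoint exists or it cannot be decoded.
    func load(sessionId: String) -> AgentCheckpoint? {
        guard let data = try? Data(contentsOf: fileURL(for: sessionId)) else { return nil }
        return try? JSONDecoder().decode(AgentCheckpoint.self, from: data)
    }
}
```

Writing synchronously keeps the save inside the short window iOS grants before suspension; a store that writes asynchronously would need the background task extension described later in this section.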
State Machine for Agent Lifecycle
enum AgentState {
case idle // Not running
case running // Actively executing
case waitingForUser // Paused, waiting for user input
case backgrounded // App backgrounded, state saved
case completed // Finished successfully
case failed(Error) // Finished with error
}
class AgentSession: ObservableObject {
@Published var state: AgentState = .idle
func transition(to newState: AgentState) {
// AgentState carries an associated Error, so it is not Hashable and cannot key a dictionary.
// Match on the (current, next) pair instead; `.failed` matches any associated error.
switch (state, newState) {
case (.idle, .running),
(.running, .waitingForUser), (.running, .backgrounded), (.running, .completed), (.running, .failed),
(.waitingForUser, .running), (.waitingForUser, .backgrounded),
(.backgrounded, .running), (.backgrounded, .completed):
state = newState
default:
logger.warning("Invalid transition: \(state) → \(newState)")
}
}
}
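If a table-driven transition map is preferred, one option is to give the state a payload-free mirror that can serve as a dictionary key. The sketch below assumes a nested `Kind` enum and a free function `canTransition`; both names are hypothetical and only illustrate the idea.

```swift
enum AgentState {
    case idle, running, waitingForUser, backgrounded, completed
    case failed(Error)

    // Payload-free mirror of the state. Enums without associated values
    // are Hashable automatically, so Kind can key a dictionary.
    enum Kind: Hashable {
        case idle, running, waitingForUser, backgrounded, completed, failed
    }

    var kind: Kind {
        switch self {
        case .idle: return .idle
        case .running: return .running
        case .waitingForUser: return .waitingForUser
        case .backgrounded: return .backgrounded
        case .completed: return .completed
        case .failed: return .failed
        }
    }
}

// Same transition table as in the text, keyed by Kind.
let validTransitions: [AgentState.Kind: Set<AgentState.Kind>] = [
    .idle: [.running],
    .running: [.waitingForUser, .backgrounded, .completed, .failed],
    .waitingForUser: [.running, .backgrounded],
    .backgrounded: [.running, .completed],
]

// Terminal states (completed, failed) have no entry, so nothing follows them.
func canTransition(from current: AgentState, to next: AgentState) -> Bool {
    validTransitions[current.kind]?.contains(next.kind) ?? false
}
```

Keeping the check in a pure function makes the table easy to unit test without constructing a full session.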
Background Task Extension (iOS)
Request extra time when backgrounded during critical operations:
class AgentOrchestrator {
private var backgroundTask: UIBackgroundTaskIdentifier = .invalid
func handleAppWillBackground() {
// Request extra time for saving state
backgroundTask = UIApplication.shared.beginBackgroundTask { [weak self] in
self?.endBackgroundTask()
}
// Save all checkpoints
Task {
for session in activeSessions {
await saveCheckpoint(session)
}
endBackgroundTask()
}
}
private func endBackgroundTask() {
if backgroundTask != .invalid {
UIApplication.shared.endBackgroundTask(backgroundTask)
backgroundTask = .invalid
}
}
}
User Communication
Let users know what's happening:
struct AgentStatusView: View {
@ObservedObject var session: AgentSession
var body: some View {
switch session.state {
case .backgrounded:
Label("Paused (app in background)", systemImage: "pause.circle")
.foregroundColor(.orange)
case .running:
Label("Working...", systemImage: "ellipsis.circle")
.foregroundColor(.blue)
case .waitingForUser:
Label("Waiting for your input", systemImage: "person.circle")
.foregroundColor(.green)
// ...
}
}
}
</background_execution>
## Permission Handling
Mobile agents may need access to system resources. Handle permission requests gracefully.
Common Permissions
| Resource | iOS Permission | Use Case |
|---|---|---|
| Photo Library | PHPhotoLibrary | Profile generation from photos |
| Files | Document picker | Reading user documents |
| Camera | AVCaptureDevice | Scanning book covers |
| Location | CLLocationManager | Location-aware recommendations |
| Network | (automatic) | Web search, API calls |
Permission-Aware Tools
Check permissions before executing:
struct PhotoTools {
static func readPhotos() -> AgentTool {
tool(
name: "read_photos",
description: "Read photos from the user's photo library",
parameters: [
"limit": .number("Maximum photos to read"),
"dateRange": .string("Date range filter").optional()
],
execute: { params, context in
// Check permission first
let status = await PHPhotoLibrary.requestAuthorization(for: .readWrite)
switch status {
case .authorized, .limited:
// Proceed with reading photos
let photos = await fetchPhotos(params)
return ToolResult(text: "Found \(photos.count) photos", images: photos)
case .denied, .restricted:
return ToolResult(
text: "Photo access needed. Please grant permission in Settings → Privacy → Photos.",
isError: true
)
case .notDetermined:
return ToolResult(
text: "Photo permission required. Please try again.",
isError: true
)
@unknown default:
return ToolResult(text: "Unknown permission status", isError: true)
}
}
)
}
}
Graceful Degradation
When permissions aren't granted, offer alternatives:
func readPhotos() async -> ToolResult {
let status = PHPhotoLibrary.authorizationStatus(for: .readWrite)
switch status {
case .denied, .restricted:
// Suggest alternative
return ToolResult(
text: """
I don't have access to your photos. You can either:
1. Grant access in Settings → Privacy → Photos
2. Share specific photos directly in our chat
Would you like me to help with something else instead?
""",
isError: false // Not a hard error, just a limitation
)
// ...
}
}
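The degradation logic is easier to test when the status-to-message mapping is a pure function. The sketch below assumes a stand-in `PhotoAccess` enum (mirroring the cases of `PHAuthorizationStatus` so it runs without the Photos framework) and a minimal `ToolResult`; both are illustrative simplifications.

```swift
// Hypothetical stand-in for PHAuthorizationStatus so the sketch runs
// without importing the Photos framework.
enum PhotoAccess {
    case notDetermined, restricted, denied, authorized, limited
}

// Minimal stand-in for the ToolResult type used in the text.
struct ToolResult: Equatable {
    let text: String
    let isError: Bool
}

// Returns nil when the tool may proceed, otherwise a message for the agent
// to relay. Denial is reported as a limitation, not a hard error.
func degradedResult(for access: PhotoAccess) -> ToolResult? {
    switch access {
    case .authorized, .limited:
        return nil
    case .notDetermined:
        return ToolResult(
            text: "I need to ask for photo access before I can continue.",
            isError: false
        )
    case .denied, .restricted:
        return ToolResult(
            text: "I don't have access to your photos. You can grant access in Settings or share specific photos directly in our chat.",
            isError: false
        )
    }
}
```

A tool would call `degradedResult(for:)` first and read photos only when it returns `nil`.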
Permission Request Timing
Don't request permissions until needed:
// BAD: Request all permissions at launch
func applicationDidFinishLaunching() {
requestPhotoAccess()
requestCameraAccess()
requestLocationAccess()
// User is overwhelmed with permission dialogs
}
// GOOD: Request when the feature is used
tool(
name: "analyze_book_cover",
description: "Scan a book cover with the camera",
parameters: ["image": .string("Captured cover image")],
execute: { params, context in
// Only request camera access when the user tries to scan a cover
let granted = await AVCaptureDevice.requestAccess(for: .video)
guard granted else {
return ToolResult(text: "Camera access needed for book scanning", isError: true)
}
return await scanCover(params)
}
)
<cost_awareness>
Cost-Aware Design
Mobile users may be on cellular data or concerned about API costs. Design agents to be efficient.
Model Tier Selection
Use the cheapest model that achieves the outcome:
enum ModelTier {
case fast // claude-3-haiku: ~$0.25 per 1M input tokens
case balanced // claude-3-sonnet: ~$3 per 1M input tokens
case powerful // claude-3-opus: ~$15 per 1M input tokens
var modelId: String {
switch self {
case .fast: return "claude-3-haiku-20240307"
case .balanced: return "claude-3-sonnet-20240229"
case .powerful: return "claude-3-opus-20240229"
}
}
}
// Match model to task complexity
let agentConfigs: [AgentType: ModelTier] = [
.quickLookup: .fast, // "What's in my library?"
.chatAssistant: .balanced, // General conversation
.researchAgent: .balanced, // Web search + synthesis
.profileGenerator: .powerful, // Complex photo analysis
.introductionWriter: .balanced,
]
Token Budgets
Limit tokens per agent session:
struct AgentConfig {
let modelTier: ModelTier
let maxInputTokens: Int
let maxOutputTokens: Int
let maxTurns: Int
static let research = AgentConfig(
modelTier: .balanced,
maxInputTokens: 50_000,
maxOutputTokens: 4_000,
maxTurns: 20
)
static let quickChat = AgentConfig(
modelTier: .fast,
maxInputTokens: 10_000,
maxOutputTokens: 1_000,
maxTurns: 5
)
}
class AgentSession {
var totalTokensUsed: Int = 0
func checkBudget() -> Bool {
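// totalTokensUsed counts input and output together, so compare against the combined budget
if totalTokensUsed > config.maxInputTokens + config.maxOutputTokens {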
transition(to: .failed(AgentError.budgetExceeded))
return false
}
return true
}
}
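A budget check is more precise when input tokens, output tokens, and turns are tracked separately, since `AgentConfig` sets a separate limit for each. The following is a minimal sketch; the `TokenBudget` type and its `record` method are hypothetical names introduced here.

```swift
// Hypothetical budget tracker that enforces each AgentConfig limit separately.
struct TokenBudget {
    let maxInputTokens: Int
    let maxOutputTokens: Int
    let maxTurns: Int

    private(set) var inputTokensUsed = 0
    private(set) var outputTokensUsed = 0
    private(set) var turns = 0

    init(maxInputTokens: Int, maxOutputTokens: Int, maxTurns: Int) {
        self.maxInputTokens = maxInputTokens
        self.maxOutputTokens = maxOutputTokens
        self.maxTurns = maxTurns
    }

    // Call once per model response with the usage reported for that turn.
    mutating func record(inputTokens: Int, outputTokens: Int) {
        inputTokensUsed += inputTokens
        outputTokensUsed += outputTokens
        turns += 1
    }

    // True as soon as any one of the three limits has been reached.
    var isExhausted: Bool {
        inputTokensUsed >= maxInputTokens
            || outputTokensUsed >= maxOutputTokens
            || turns >= maxTurns
    }
}
```

Usage with the `quickChat` limits from the text: create `TokenBudget(maxInputTokens: 10_000, maxOutputTokens: 1_000, maxTurns: 5)`, call `record` after each turn, and stop the session once `isExhausted` is true.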
Network-Aware Execution
Defer heavy operations to WiFi:
class NetworkMonitor: ObservableObject {
@Published var isConnected: Bool = false // Read by the offline-handling code
@Published var isOnWiFi: Bool = false
@Published var isExpensive: Bool = false // Cellular or hotspot
private let monitor = NWPathMonitor()
func startMonitoring() {
monitor.pathUpdateHandler = { [weak self] path in
DispatchQueue.main.async {
self?.isConnected = path.status == .satisfied
self?.isOnWiFi = path.usesInterfaceType(.wifi)
self?.isExpensive = path.isExpensive
}
}
monitor.start(queue: .global())
}
}
class AgentOrchestrator {
let network = NetworkMonitor() // @ObservedObject only works inside SwiftUI views
func startResearchAgent(for book: Book) async {
if network.isExpensive {
// Warn user or defer
let proceed = await showAlert(
"Research uses data",
message: "This will use approximately 1-2 MB of cellular data. Continue?"
)
if !proceed { return }
}
// Proceed with research
await runAgent(ResearchAgent.create(book: book))
}
}
Batch API Calls
Combine multiple small requests:
// BAD: Many small API calls
for book in books {
await agent.chat("Summarize \(book.title)")
}
// GOOD: Batch into one request
let bookList = books.map { $0.title }.joined(separator: ", ")
await agent.chat("Summarize each of these books briefly: \(bookList)")
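A single request for a very large library can itself exceed the token budget, so batching usually means fixed-size groups rather than one giant prompt. The sketch below assumes hypothetical helpers `batched` and `summaryPrompts`; the batch size is a tuning choice, not a value from the original.

```swift
// Splits items into consecutive groups of at most `size` elements.
func batched<T>(_ items: [T], size: Int) -> [[T]] {
    precondition(size > 0, "Batch size must be positive")
    return stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

// Builds one prompt per batch, using the same wording as the example above.
func summaryPrompts(for titles: [String], batchSize: Int) -> [String] {
    batched(titles, size: batchSize).map { batch in
        "Summarize each of these books briefly: " + batch.joined(separator: ", ")
    }
}
```

Each returned prompt can then be sent with one `agent.chat` call, trading a small number of requests for prompts that stay within the session's input budget.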
Caching
Cache expensive operations:
class ResearchCache {
private var cache: [String: CachedResearch] = [:]
func getCachedResearch(for bookId: String) -> CachedResearch? {
guard let cached = cache[bookId] else { return nil }
// Expire after 24 hours
if Date().timeIntervalSince(cached.timestamp) > 86400 {
cache.removeValue(forKey: bookId)
return nil
}
return cached
}
func cacheResearch(_ research: Research, for bookId: String) {
cache[bookId] = CachedResearch(
research: research,
timestamp: Date()
)
}
}
// In research tool
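tool(
name: "web_search",
description: "Search the web for research on a book",
parameters: [
"query": .string("Search query"),
"bookId": .string("Book the research belongs to")
],
execute: { params, context in
// Parameter access is illustrative; adapt it to your tool framework
guard let query = params["query"] as? String,
let bookId = params["bookId"] as? String else {
return ToolResult(text: "Missing query or bookId", isError: true)
}
// Check cache first
if let cached = cache.getCachedResearch(for: bookId) {
return ToolResult(text: cached.research.summary, cached: true)
}
// Otherwise, perform search
let results = await webSearch(query)
cache.cacheResearch(results, for: bookId)
return ToolResult(text: results.summary)
}
)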
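Expiry logic like the 24-hour rule above is hard to test when it reads the system clock directly. One option, sketched here, is a generic cache that takes its clock as a closure; `TTLCache` and its method names are hypothetical, and the in-memory dictionary means entries do not survive an app relaunch.

```swift
import Foundation

// Hypothetical in-memory cache with a time-to-live and an injectable clock.
struct TTLCache<Value> {
    private var entries: [String: (value: Value, storedAt: Date)] = [:]
    let timeToLive: TimeInterval
    let now: () -> Date

    init(timeToLive: TimeInterval, now: @escaping () -> Date = { Date() }) {
        self.timeToLive = timeToLive
        self.now = now
    }

    // Returns the cached value, or nil if it is missing or older than timeToLive.
    // Expired entries are evicted on read.
    mutating func value(for key: String) -> Value? {
        guard let entry = entries[key] else { return nil }
        if now().timeIntervalSince(entry.storedAt) > timeToLive {
            entries.removeValue(forKey: key)
            return nil
        }
        return entry.value
    }

    mutating func store(_ value: Value, for key: String) {
        entries[key] = (value: value, storedAt: now())
    }
}
```

In production the default clock is used; tests pass a closure that returns a controllable date so expiry can be checked without waiting.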
Cost Visibility
Show users what they're spending:
struct AgentCostView: View {
@ObservedObject var session: AgentSession
var body: some View {
VStack(alignment: .leading) {
Text("Session Stats")
.font(.headline)
HStack {
Label("\(session.turnCount) turns", systemImage: "arrow.2.squarepath")
Spacer()
Label(formatTokens(session.totalTokensUsed), systemImage: "text.word.spacing")
}
if let estimatedCost = session.estimatedCost {
Text("Est. cost: \(estimatedCost, format: .currency(code: "USD"))")
.font(.caption)
.foregroundColor(.secondary)
}
}
}
}
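The `estimatedCost` value shown in the view has to come from somewhere. A minimal sketch of the arithmetic follows, assuming per-million-token prices are supplied by the caller; the `ModelPricing` type is hypothetical and the prices used in any call are placeholders to be replaced with current published rates.

```swift
// Hypothetical pricing record: USD per one million tokens.
struct ModelPricing {
    let inputPerMillion: Double
    let outputPerMillion: Double
}

// Estimated session cost in USD. Input and output tokens are priced
// separately because providers typically charge more for output.
func estimatedCost(inputTokens: Int, outputTokens: Int, pricing: ModelPricing) -> Double {
    let inputCost = Double(inputTokens) / 1_000_000 * pricing.inputPerMillion
    let outputCost = Double(outputTokens) / 1_000_000 * pricing.outputPerMillion
    return inputCost + outputCost
}
```

For example, 50,000 input tokens and 4,000 output tokens at illustrative rates of 3 and 15 USD per million come to 0.15 + 0.06 = 0.21 USD.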
</cost_awareness>
<offline_handling>
Offline Graceful Degradation
Handle offline scenarios gracefully:
class ConnectivityAwareAgent {
let network = NetworkMonitor() // @ObservedObject only works inside SwiftUI views
func executeToolCall(_ toolCall: ToolCall) async -> ToolResult {
// Check if tool requires network
let requiresNetwork = ["web_search", "web_fetch", "call_api"]
.contains(toolCall.name)
if requiresNetwork && !network.isConnected {
return ToolResult(
text: """
I can't access the internet right now. Here's what I can do offline:
- Read your library and existing research
- Answer questions from cached data
- Write notes and drafts for later
Would you like me to try something that works offline?
""",
isError: false
)
}
return await executeOnline(toolCall)
}
}
Offline-First Tools
Some tools should work entirely offline:
let offlineTools: Set<String> = [
"read_file",
"write_file",
"list_files",
"read_library", // Local database
"search_local", // Local search
]
let onlineTools: Set<String> = [
"web_search",
"web_fetch",
"publish_to_cloud",
]
let hybridTools: Set<String> = [
"publish_to_feed", // Works offline, syncs later
]
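The three sets above can drive a single dispatch decision. The sketch below folds them into one lookup table and a pure function; the `ToolAvailability` and `Dispatch` enums and the `dispatch(tool:isConnected:)` function are hypothetical names, and treating unknown tools as online-only is a conservative assumption rather than something the original specifies.

```swift
enum ToolAvailability {
    case offline   // Works without a network
    case online    // Needs a network
    case hybrid    // Works offline, syncs later
}

// Same tool names as the sets in the text, folded into one table.
let toolAvailability: [String: ToolAvailability] = [
    "read_file": .offline,
    "write_file": .offline,
    "list_files": .offline,
    "read_library": .offline,
    "search_local": .offline,
    "web_search": .online,
    "web_fetch": .online,
    "publish_to_cloud": .online,
    "publish_to_feed": .hybrid,
]

enum Dispatch: Equatable {
    case runNow
    case queueForSync
    case unavailableOffline
}

// Unknown tools are treated as online-only (assumption: safest default).
func dispatch(tool name: String, isConnected: Bool) -> Dispatch {
    let availability = toolAvailability[name] ?? .online
    if isConnected { return .runNow }
    switch availability {
    case .offline: return .runNow
    case .hybrid: return .queueForSync
    case .online: return .unavailableOffline
    }
}
```

The agent loop can map `.unavailableOffline` to the friendly offline message shown earlier and `.queueForSync` to the action queue.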
Queued Actions
Queue actions that require connectivity:
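class OfflineQueue: ObservableObject {
@Published var pendingActions: [QueuedAction] = []
private let network: NetworkMonitor
private var cancellables = Set<AnyCancellable>()
init(network: NetworkMonitor) {
self.network = network
}
func queue(_ action: QueuedAction) {
pendingActions.append(action)
persist()
}
func processWhenOnline() {
network.$isConnected
.filter { $0 }
.sink { [weak self] _ in
self?.processPendingActions()
}
.store(in: &cancellables) // Keep the subscription alive; an unstored sink is cancelled immediately
}
private func processPendingActions() {
let actions = pendingActions // Snapshot so removals don't disturb the loop
Task {
for action in actions {
do {
try await execute(action)
remove(action)
} catch {
// Leave the action queued; it is retried on the next reconnect
logger.warning("Queued action failed: \(error)")
}
}
}
}
}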
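The retry behavior of the queue (failed actions stay queued, successful ones are removed) can be isolated in a small synchronous function, which makes it testable without Combine or a network. In this sketch the `QueuedAction` fields and the `drain` function are hypothetical simplifications.

```swift
// Hypothetical, simplified queued action.
struct QueuedAction: Equatable {
    let id: Int
    let toolName: String
}

// Runs every pending action in order and returns the ones that failed,
// so they can stay in the queue for the next attempt.
func drain(
    _ pending: [QueuedAction],
    execute: (QueuedAction) throws -> Void
) -> [QueuedAction] {
    var remaining: [QueuedAction] = []
    for action in pending {
        do {
            try execute(action)
        } catch {
            remaining.append(action)
        }
    }
    return remaining
}
```

Because failed actions are retried, `execute` should be idempotent: an action that succeeded on the server but failed locally may run a second time.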
</offline_handling>
<battery_awareness>
Battery-Aware Execution
Respect device battery state:
class BatteryMonitor: ObservableObject {
@Published var batteryLevel: Float = 1.0
@Published var isCharging: Bool = false
@Published var isLowPowerMode: Bool = false
var shouldDeferHeavyWork: Bool {
return batteryLevel < 0.2 && !isCharging
}
func startMonitoring() {
UIDevice.current.isBatteryMonitoringEnabled = true
NotificationCenter.default.addObserver(
forName: UIDevice.batteryLevelDidChangeNotification,
object: nil,
queue: .main
) { [weak self] _ in
self?.batteryLevel = UIDevice.current.batteryLevel
}
NotificationCenter.default.addObserver(
forName: NSNotification.Name.NSProcessInfoPowerStateDidChange,
object: nil,
queue: .main
) { [weak self] _ in
self?.isLowPowerMode = ProcessInfo.processInfo.isLowPowerModeEnabled
}
}
}
class AgentOrchestrator {
let battery = BatteryMonitor() // @ObservedObject only works inside SwiftUI views
func startAgent(_ config: AgentConfig) async {
if battery.shouldDeferHeavyWork && config.isHeavy {
let proceed = await showAlert(
"Low Battery",
message: "This task uses significant battery. Continue or defer until charging?"
)
if !proceed { return }
}
// Adjust model tier based on battery
let adjustedConfig = battery.isLowPowerMode
? config.withModelTier(.fast)
: config
await runAgent(adjustedConfig)
}
}
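The battery policy above mixes UI prompts with decision logic. Separating the decision into a pure function makes the thresholds explicit and testable. This is a sketch under assumptions: `PowerState`, `LaunchDecision`, and `decide` are hypothetical names, and the 20% threshold is taken from `shouldDeferHeavyWork` in the text.

```swift
enum ModelTier {
    case fast, balanced, powerful
}

// Snapshot of the values BatteryMonitor publishes.
struct PowerState {
    let batteryLevel: Float   // 0.0 ... 1.0
    let isCharging: Bool
    let isLowPowerMode: Bool
}

enum LaunchDecision: Equatable {
    case run(ModelTier)
    case askUserFirst
}

// Mirrors the policy in the text: heavy work on a low, unplugged battery
// needs confirmation; Low Power Mode downgrades the model tier to .fast.
func decide(power: PowerState, requestedTier: ModelTier, isHeavy: Bool) -> LaunchDecision {
    let lowBattery = power.batteryLevel < 0.2 && !power.isCharging
    if lowBattery && isHeavy {
        return .askUserFirst
    }
    return .run(power.isLowPowerMode ? .fast : requestedTier)
}
```

The orchestrator would show its alert only for `.askUserFirst` and pass the tier from `.run` to the agent configuration.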
</battery_awareness>
## Mobile Agent-Native Checklist
Background Execution:
- Checkpoint/resume implemented for all agent sessions
- State machine for agent lifecycle (idle, running, backgrounded, etc.)
- Background task extension for critical saves
- User-visible status for backgrounded agents
Permissions:
- Permissions requested only when needed, not at launch
- Graceful degradation when permissions denied
- Clear error messages with Settings deep links
- Alternative paths when permissions unavailable
Cost Awareness:
- Model tier matched to task complexity
- Token budgets per session
- Network-aware (defer heavy work to WiFi)
- Caching for expensive operations
- Cost visibility to users
Offline Handling:
- Offline-capable tools identified
- Graceful degradation for online-only features
- Action queue for sync when online
- Clear user communication about offline state
Battery Awareness:
- Battery monitoring for heavy operations
- Low power mode detection
- Defer or downgrade based on battery state