Execution patterns for building robust agent loops: how agents signal completion, track partial progress for resume, select an appropriate model tier, and handle context limits.
## Completion Signals
Agents need an explicit way to say "I'm done."
### Anti-Pattern: Heuristic Detection
Detecting completion through heuristics is fragile:
- Consecutive iterations without tool calls
- Checking for expected output files
- Tracking "no progress" states
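- Timeouts
These break in edge cases and make behavior unpredictable: an agent may pause between tool calls without being finished, and an expected file may exist before the work has been verified. Iteration caps and timeouts still belong in the loop as safety limits, but not as the completion signal.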
### Pattern: Explicit Completion Tool
Provide a `complete_task` tool that:
- Takes a summary of what was accomplished
- Returns a signal that stops the loop
- Works identically across all agent types
```typescript
tool("complete_task", {
  summary: z.string().describe("Summary of what was accomplished"),
  status: z.enum(["success", "partial", "blocked"]).optional(),
}, async ({ summary, status = "success" }) => {
  return {
    text: summary,
    status,                // Report how the task ended, not only that it ended
    shouldContinue: false, // Key: signals loop should stop
  };
});
```
### The ToolResult Pattern
Structure tool results to separate success from continuation:
```swift
struct ToolResult {
    let success: Bool         // Did tool succeed?
    let output: String        // What happened?
    let shouldContinue: Bool  // Should agent loop continue?
}

// Three common cases:
extension ToolResult {
    static func success(_ output: String) -> ToolResult {
        // Tool succeeded, keep going
        ToolResult(success: true, output: output, shouldContinue: true)
    }

    static func error(_ message: String) -> ToolResult {
        // Tool failed but recoverable, agent can try something else
        ToolResult(success: false, output: message, shouldContinue: true)
    }

    static func complete(_ summary: String) -> ToolResult {
        // Task done, stop the loop
        ToolResult(success: true, output: summary, shouldContinue: false)
    }
}
```
### Key Insight
**`shouldContinue` is independent of success/failure:**
- A tool can **succeed** AND signal **stop** (task complete)
- A tool can **fail** AND signal **continue** (recoverable error, try something else)
```typescript
// Examples:
read_file("/missing.txt")
// → { success: false, output: "File not found", shouldContinue: true }
// Agent can try a different file or ask for clarification
complete_task("Organized all downloads into folders")
// → { success: true, output: "...", shouldContinue: false }
// Agent is done
write_file("/output.md", content)
// → { success: true, output: "Wrote file", shouldContinue: true }
// Agent keeps working toward the goal
```
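To make the separation concrete, here is a minimal sketch of a loop that reads only `shouldContinue` when deciding whether to stop. The `executeTool` stubs, the `ToolCall` shape, and `runLoop` are hypothetical stand-ins, and the tool calls are scripted rather than produced by a model; a real loop would await both the model and the tools.

```typescript
// Mirrors the ToolResult struct above. All other names here are illustrative.
interface ToolResult {
  success: boolean;
  output: string;
  shouldContinue: boolean;
}

type ToolCall = { name: string; arg: string };

// Stub tools standing in for real implementations.
function executeTool(call: ToolCall): ToolResult {
  switch (call.name) {
    case "read_file":
      // Recoverable failure: report the error but let the agent keep going.
      return { success: false, output: "File not found", shouldContinue: true };
    case "write_file":
      return { success: true, output: "Wrote file", shouldContinue: true };
    case "complete_task":
      // Success that also ends the loop.
      return { success: true, output: call.arg, shouldContinue: false };
    default:
      return { success: false, output: `Unknown tool: ${call.name}`, shouldContinue: true };
  }
}

// Runs scripted tool calls until a result asks the loop to stop.
function runLoop(calls: ToolCall[]): { results: ToolResult[]; stoppedBy: string | null } {
  const results: ToolResult[] = [];
  for (const call of calls) {
    const result = executeTool(call);
    results.push(result);
    // Only shouldContinue decides whether to stop; success is reported separately.
    if (!result.shouldContinue) {
      return { results, stoppedBy: call.name };
    }
  }
  return { results, stoppedBy: null };
}

const run = runLoop([
  { name: "read_file", arg: "/missing.txt" },
  { name: "write_file", arg: "/output.md" },
  { name: "complete_task", arg: "Organized all downloads into folders" },
  { name: "write_file", arg: "/never-reached.md" },
]);
```

The failed `read_file` does not end the run, while the successful `complete_task` does, so the fourth call is never executed.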
### System Prompt Guidance
Tell the agent when to complete:
```markdown
## Completing Tasks
When you've accomplished the user's request:
1. Verify your work (read back files you created, check results)
2. Call `complete_task` with a summary of what you did
3. Don't keep working after the goal is achieved
If you're blocked and can't proceed:
- Call `complete_task` with status "blocked" and explain why
- Don't loop forever trying the same thing
```
## Partial Completion
For multi-step tasks, track progress at the task level for resume capability.
### Task State Tracking
```swift
enum TaskStatus {
    case pending     // Not yet started
    case inProgress  // Currently working on
    case completed   // Finished successfully
    case failed      // Couldn't complete (with reason)
    case skipped     // Intentionally not done
}

struct AgentTask {
    let id: String
    let description: String
    var status: TaskStatus
    var notes: String?  // Why it failed, what was done
}

struct AgentSession {
    var tasks: [AgentTask]

    var isComplete: Bool {
        tasks.allSatisfy { $0.status == .completed || $0.status == .skipped }
    }

    var progress: (completed: Int, total: Int) {
        let done = tasks.filter { $0.status == .completed }.count
        return (done, tasks.count)
    }
}
```
### UI Progress Display
Show users what's happening:
```
Progress: 3/5 tasks complete (60%)
✅ [1] Find source materials
✅ [2] Download full text
✅ [3] Extract key passages
❌ [4] Generate summary - Error: context limit exceeded
⏳ [5] Create outline - Pending
```
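One way to produce a display like the one above is a small pure function over the task list. This is a sketch only: `renderProgress` is a hypothetical name, the TypeScript types mirror the Swift ones, and the icons for in-progress and skipped tasks are assumptions, since the sample output does not show them.

```typescript
type TaskStatus = "pending" | "inProgress" | "completed" | "failed" | "skipped";

interface AgentTask {
  id: string;
  description: string;
  status: TaskStatus;
  notes?: string;
}

// Icons for inProgress and skipped are assumed; the others match the sample output.
const ICONS: Record<TaskStatus, string> = {
  pending: "⏳",
  inProgress: "🔄",
  completed: "✅",
  failed: "❌",
  skipped: "⏭",
};

function renderProgress(tasks: AgentTask[]): string {
  const done = tasks.filter((t) => t.status === "completed").length;
  // Guard against division by zero for an empty task list.
  const percent = tasks.length === 0 ? 0 : Math.round((done / tasks.length) * 100);
  const header = `Progress: ${done}/${tasks.length} tasks complete (${percent}%)`;
  const lines = tasks.map((t, i) => {
    let suffix = "";
    if (t.status === "failed") suffix = ` - Error: ${t.notes ?? "unknown"}`;
    if (t.status === "pending") suffix = " - Pending";
    return `${ICONS[t.status]} [${i + 1}] ${t.description}${suffix}`;
  });
  return [header, ...lines].join("\n");
}

const demoTasks: AgentTask[] = [
  { id: "1", description: "Find source materials", status: "completed" },
  { id: "2", description: "Download full text", status: "completed" },
  { id: "3", description: "Extract key passages", status: "completed" },
  { id: "4", description: "Generate summary", status: "failed", notes: "context limit exceeded" },
  { id: "5", description: "Create outline", status: "pending" },
];
const progressText = renderProgress(demoTasks);
```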
### Partial Completion Scenarios
**Agent hits max iterations before finishing:**
- Some tasks completed, some pending
- Checkpoint saved with current state
- Resume continues from where it left off, not from beginning
**Agent fails on one task:**
- Task marked `.failed` with error in notes
- Other tasks may continue (agent decides)
- Orchestrator doesn't automatically abort entire session
**Network error mid-task:**
- Current iteration throws
- Session marked `.failed`
- Checkpoint preserves messages up to that point
- Resume possible from checkpoint
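The first two scenarios can be sketched at the task level as follows. `runPendingTasks` and `TaskRunner` are hypothetical names, the runner is synchronous for brevity, and this sketch always moves on after a failure, whereas the text above leaves that decision to the agent.

```typescript
type TaskStatus = "pending" | "inProgress" | "completed" | "failed" | "skipped";

interface AgentTask {
  id: string;
  description: string;
  status: TaskStatus;
  notes?: string;
}

// Hypothetical runner: performs one task and throws on failure.
type TaskRunner = (task: AgentTask) => void;

// Works through unfinished tasks until the iteration budget runs out.
// Completed, failed, and skipped tasks are left untouched, so calling this
// again with the same array resumes where the last call stopped.
function runPendingTasks(tasks: AgentTask[], run: TaskRunner, maxIterations: number): number {
  let iterations = 0;
  for (const task of tasks) {
    // A task left inProgress by an interrupted run is retried.
    if (task.status !== "pending" && task.status !== "inProgress") continue;
    if (iterations >= maxIterations) break;
    iterations += 1;
    task.status = "inProgress";
    try {
      run(task);
      task.status = "completed";
    } catch (err) {
      // Record the failure on the task; the session itself is not aborted.
      task.status = "failed";
      task.notes = err instanceof Error ? err.message : String(err);
    }
  }
  return iterations;
}
```

Calling `runPendingTasks` a second time on the same array only touches tasks that are still unfinished, which is the behavior a resume needs.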
### Checkpoint Structure
```swift
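// Message and AgentTask (and TaskStatus) must also conform to Codable.
struct AgentCheckpoint: Codable {
    let sessionId: String
    let agentType: String
    let messages: [Message]            // Full conversation history
    let iterationCount: Int
    let tasks: [AgentTask]             // Task state
    // [String: Any] cannot conform to Codable, so agent-specific
    // state is stored as strings here (for example, JSON-encoded values).
    let customState: [String: String]  // Agent-specific state
    let timestamp: Date

    var isValid: Bool {
        // Checkpoints expire (default 1 hour)
        Date().timeIntervalSince(timestamp) < 3600
    }
}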
```
### Resume Flow
1. On app launch, scan for valid checkpoints
2. Show user: "You have an incomplete session. Resume?"
3. On resume:
- Restore messages to conversation
- Restore task states
- Continue agent loop from where it left off
4. On dismiss:
- Delete checkpoint
- Start fresh if user tries again
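The resume flow above could be sketched roughly as below, using an in-memory store in place of real persistence. `CheckpointStore`, `applyChoice`, and `LoopState` are hypothetical names; task state is omitted for brevity, messages are simplified to strings, and pruning expired checkpoints during the scan is an assumption rather than something the flow requires.

```typescript
interface Checkpoint {
  sessionId: string;
  messages: string[];     // Simplified: real messages would be structured
  iterationCount: number;
  timestampMs: number;    // When the checkpoint was written
}

const CHECKPOINT_TTL_MS = 60 * 60 * 1000; // 1 hour, matching the isValid check above

// In-memory stand-in for whatever storage the app actually uses.
class CheckpointStore {
  private items: Checkpoint[] = [];

  save(cp: Checkpoint): void {
    this.items = this.items.filter((c) => c.sessionId !== cp.sessionId);
    this.items.push(cp);
  }

  remove(sessionId: string): void {
    this.items = this.items.filter((c) => c.sessionId !== sessionId);
  }

  // Step 1: scan for valid checkpoints (expired ones are dropped here, by assumption).
  findResumable(nowMs: number): Checkpoint[] {
    this.items = this.items.filter((c) => nowMs - c.timestampMs < CHECKPOINT_TTL_MS);
    return this.items.slice();
  }
}

interface LoopState {
  messages: string[];
  iterationCount: number;
  resumed: boolean;
}

// Steps 3 and 4: restore state on resume, delete the checkpoint on dismiss.
function applyChoice(store: CheckpointStore, cp: Checkpoint, choice: "resume" | "dismiss"): LoopState {
  if (choice === "resume") {
    return { messages: cp.messages.slice(), iterationCount: cp.iterationCount, resumed: true };
  }
  store.remove(cp.sessionId);
  return { messages: [], iterationCount: 0, resumed: false };
}
```

Passing the current time in as `nowMs` keeps the expiry check easy to test without mocking the clock.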
## Model Tier Selection
Different agents need different intelligence levels. Use the cheapest model that achieves the outcome.
### Tier Guidelines
| Agent Type | Recommended Tier | Reasoning |
|------------|-----------------|-----------|
| Chat/Conversation | Balanced (Sonnet) | Fast responses, good reasoning |
| Research | Balanced (Sonnet) | Tool loops, not ultra-complex synthesis |
| Content Generation | Balanced (Sonnet) | Creative but not synthesis-heavy |
| Complex Analysis | Powerful (Opus) | Multi-document synthesis, nuanced judgment |
| Profile Generation | Powerful (Opus) | Photo analysis, complex pattern recognition |
| Quick Queries | Fast (Haiku) | Simple lookups, quick transformations |
| Simple Classification | Fast (Haiku) | High volume, simple decisions |
### Implementation
```swift
enum ModelTier {
    case fast      // claude-3-haiku: Quick, cheap, simple tasks
    case balanced  // claude-sonnet: Good balance for most tasks
    case powerful  // claude-opus: Complex reasoning, synthesis

    var modelId: String {
        switch self {
        case .fast: return "claude-3-haiku-20240307"
        case .balanced: return "claude-sonnet-4-20250514"
        case .powerful: return "claude-opus-4-20250514"
        }
    }
}

struct AgentConfig {
    let name: String
    let modelTier: ModelTier
    let tools: [AgentTool]
    let systemPrompt: String
    let maxIterations: Int
}

// Examples
let researchConfig = AgentConfig(
    name: "research",
    modelTier: .balanced,
    tools: researchTools,
    systemPrompt: researchPrompt,
    maxIterations: 20
)

let quickLookupConfig = AgentConfig(
    name: "lookup",
    modelTier: .fast,
    tools: [readLibrary],
    systemPrompt: "Answer quick questions about the user's library.",
    maxIterations: 3
)
```
### Cost Optimization Strategies
1. **Start with balanced, upgrade if quality insufficient**
2. **Use fast tier for tool-heavy loops** where each turn is simple
3. **Reserve powerful tier for synthesis tasks** (comparing multiple sources)
4. **Consider token limits per turn** to control costs
5. **Cache expensive operations** to avoid repeated calls
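Strategy 1 can be sketched as a small escalation helper: try the starting tier, and move up only when a quality check rejects the output. Everything here is hypothetical (`runWithEscalation`, the `runAtTier` callback, the `isGoodEnough` check), and the calls are synchronous for brevity; real model calls would be awaited, and the quality check is the hard part in practice.

```typescript
type ModelTier = "fast" | "balanced" | "powerful";

// Cheapest first. The order is the only thing this sketch relies on.
const TIER_ORDER: ModelTier[] = ["fast", "balanced", "powerful"];

interface EscalationResult {
  tier: ModelTier;     // Tier that produced the returned output
  output: string;
  tried: ModelTier[];  // Every tier attempted, in order
  accepted: boolean;   // Whether the quality check passed
}

function runWithEscalation(
  start: ModelTier,
  runAtTier: (tier: ModelTier) => string,
  isGoodEnough: (output: string) => boolean,
): EscalationResult {
  const tiers = TIER_ORDER.slice(TIER_ORDER.indexOf(start));
  const tried: ModelTier[] = [];
  let output = "";
  for (const tier of tiers) {
    tried.push(tier);
    output = runAtTier(tier);
    if (isGoodEnough(output)) {
      return { tier, output, tried, accepted: true };
    }
  }
  // Every tier fell short: return the last (most capable) attempt, flagged as not accepted.
  return { tier: tried[tried.length - 1] ?? start, output, tried, accepted: false };
}
```

Because each escalation repeats the work at a higher price, this only saves money when the cheaper tier succeeds most of the time.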
## Context Limits
Agent sessions can run indefinitely, but context windows are finite. Design for bounded context from the start.
### The Problem
```
Turn 1: User asks question → 500 tokens
Turn 2: Agent reads file → 10,000 tokens
Turn 3: Agent reads another file → 10,000 tokens
Turn 4: Agent researches → 20,000 tokens
...
Turn 10: Context window exceeded
```
### Design Principles
**1. Tools should support iterative refinement**
Instead of all-or-nothing, design for summary → detail → full:
```typescript
// Good: Supports iterative refinement
tool("read_file", {
  path: z.string(),
  preview: z.boolean().default(true),  // Return first 1000 chars by default
  full: z.boolean().default(false),    // Opt-in to full content
}, ...);

tool("search_files", {
  query: z.string(),
  summaryOnly: z.boolean().default(true),  // Return matches, not full files
}, ...);
```
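The handler behind a preview-first `read_file` might look something like this sketch. `viewFile` and `FileView` are hypothetical names, and the function works on content that has already been loaded so the example stays self-contained; the 1000-character default follows the comment in the schema above.

```typescript
interface FileView {
  text: string;
  truncated: boolean;
}

// Returns the whole content when asked to (or when it is short),
// otherwise a preview plus a note telling the agent how to get the rest.
function viewFile(content: string, full: boolean = false, previewChars: number = 1000): FileView {
  if (full || content.length <= previewChars) {
    return { text: content, truncated: false };
  }
  const note =
    `\n[Preview: first ${previewChars} of ${content.length} characters. ` +
    `Pass full: true to read everything.]`;
  return { text: content.slice(0, previewChars) + note, truncated: true };
}
```

The note matters as much as the truncation: without it the agent cannot tell that it is looking at a partial file or how to request the remainder.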
**2. Provide consolidation tools**
Give agents a way to consolidate learnings mid-session:
```typescript
tool("summarize_and_continue", {
  keyPoints: z.array(z.string()),
  nextSteps: z.array(z.string()),
}, async ({ keyPoints, nextSteps }) => {
  // Store summary, potentially truncate earlier messages
  await saveSessionSummary({ keyPoints, nextSteps });
  return { text: "Summary saved. Continuing with focus on: " + nextSteps.join(", ") };
});
```
**3. Design for truncation**
Assume the orchestrator may truncate early messages. Important context should be:
- In the system prompt (always present)
- In files (can be re-read)
- Summarized in `context.md`
### Implementation Strategies
```swift
class AgentOrchestrator {
    let maxContextTokens = 100_000
    let targetContextTokens = 80_000  // Leave headroom

    func shouldTruncate() -> Bool {
        estimateTokens(messages) > targetContextTokens
    }

    func truncateIfNeeded() {
        if shouldTruncate() {
            // Keep system prompt + recent messages
            // Summarize or drop older messages
            messages = [systemMessage] + summarizeOldMessages() + recentMessages
        }
    }
}
```
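The helpers left undefined above can be filled in with stand-ins to show the overall shape. In this sketch, the four-characters-per-token estimate is a rough assumption rather than a real tokenizer, the summary is a placeholder string where a real orchestrator would likely generate one, and `keepRecent` is a hypothetical parameter.

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Rough heuristic (an assumption): about 4 characters per token.
function estimateTokens(messages: Message[]): number {
  let chars = 0;
  for (const m of messages) chars += m.content.length;
  return Math.ceil(chars / 4);
}

// Keeps system messages and the most recent messages, replacing
// everything older with a single placeholder summary message.
function truncateIfNeeded(messages: Message[], targetTokens: number, keepRecent: number): Message[] {
  if (estimateTokens(messages) <= targetTokens) return messages;

  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  if (rest.length <= keepRecent) return messages; // Nothing older to drop

  const older = rest.slice(0, rest.length - keepRecent);
  const recent = rest.slice(rest.length - keepRecent);

  // Placeholder: a real orchestrator would summarize `older` here.
  const summary: Message = {
    role: "user",
    content: `[Summary placeholder: ${older.length} earlier messages omitted]`,
  };
  return [...system, summary, ...recent];
}
```

A production version would also need to choose the cut point carefully, for example so that a tool call is not separated from its result.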
### System Prompt Guidance
```markdown
## Managing Context
For long tasks, periodically consolidate what you've learned:
1. If you've gathered a lot of information, summarize key points
2. Save important findings to files (they persist beyond context)
3. Use `summarize_and_continue` if the conversation is getting long
Don't try to hold everything in memory. Write it down.
```
## Unified Agent Orchestrator
One execution engine, many agent types. All agents use the same orchestrator with different configurations.
```swift
class AgentOrchestrator {
    static let shared = AgentOrchestrator()

    func run(config: AgentConfig, userMessage: String) async -> AgentResult {
        var messages: [Message] = [
            .system(config.systemPrompt),
            .user(userMessage)
        ]
        var iteration = 0

        while iteration < config.maxIterations {
            // Count the turn up front so every exit path reports the same number
            iteration += 1

            // Get agent response
            let response = await claude.message(
                model: config.modelTier.modelId,
                messages: messages,
                tools: config.tools
            )
            messages.append(.assistant(response))

            // Process tool calls
            for toolCall in response.toolCalls {
                let result = await executeToolCall(toolCall, config: config)
                messages.append(.toolResult(result))

                // Check for completion signal
                if !result.shouldContinue {
                    return AgentResult(
                        status: .completed,
                        output: result.output,
                        iterations: iteration
                    )
                }
            }

            // No tool calls: the agent replied in text. It could be done, or waiting for the user
            if response.toolCalls.isEmpty {
                return AgentResult(
                    status: .responded,
                    output: messages.last?.content ?? "",
                    iterations: iteration
                )
            }
        }

        // Iteration budget exhausted without a completion signal
        return AgentResult(
            status: .maxIterations,
            output: messages.last?.content ?? "",
            iterations: iteration
        )
    }
}
```
### Benefits
- Consistent lifecycle management across all agent types
- Automatic checkpoint/resume (critical for mobile)
- Shared tool protocol
- Easy to add new agent types
- Centralized error handling and logging
## Agent Execution Checklist
### Completion Signals
- [ ] `complete_task` tool provided (explicit completion)
- [ ] No heuristic completion detection
- [ ] Tool results include `shouldContinue` flag
- [ ] System prompt guides when to complete
### Partial Completion
- [ ] Tasks tracked with status (pending, in progress, completed, failed, skipped)
- [ ] Checkpoints saved for resume
- [ ] Progress visible to user
- [ ] Resume continues from where left off
### Model Tiers
- [ ] Tier selected based on task complexity
- [ ] Cost optimization considered
- [ ] Fast tier for simple operations
- [ ] Powerful tier reserved for synthesis
### Context Limits
- [ ] Tools support iterative refinement (preview vs full)
- [ ] Consolidation mechanism available
- [ ] Important context persisted to files
- [ ] Truncation strategy defined