Start with pure primitives: bash, file operations, basic storage. This proves the architecture works and reveals what the agent actually needs. As patterns emerge, add domain-specific tools deliberately. This document covers when and how to evolve from primitives to domain tools, and when to graduate to optimized code.
## Start with Pure Primitives
Begin every agent-native system with the most atomic tools possible:
- `read_file` / `write_file` / `list_files`
- `bash` (for everything else)
- Basic storage (`store_item` / `get_item`)
- HTTP requests (`fetch_url`)
**Why start here:**
1. **Proves the architecture** - If it works with primitives, your prompts are doing their job
2. **Reveals actual needs** - You'll discover what domain concepts matter
3. **Maximum flexibility** - Agent can do anything, not just what you anticipated
4. **Forces good prompts** - You can't lean on tool logic as a crutch
### Example: Starting Primitive
```typescript
// Start with just these
const tools = [
tool("read_file", { path: z.string() }, ...),
tool("write_file", { path: z.string(), content: z.string() }, ...),
tool("list_files", { path: z.string() }, ...),
tool("bash", { command: z.string() }, ...),
];
// Prompt handles the domain logic
const prompt = `
When processing feedback:
1. Read existing feedback from data/feedback.json
2. Add the new feedback with your assessment of importance (1-5)
3. Write the updated file
4. If importance >= 4, create a notification file in data/alerts/
`;
```
## When to Add Domain Tools
As patterns emerge, you'll want to add domain-specific tools. This is good—but do it deliberately.
### Vocabulary Anchoring
**Add a domain tool when:** The agent needs to understand domain concepts.
A `create_note` tool teaches the agent what "note" means in your system better than "write a file to the notes directory with this format."
```typescript
// Without domain tool - agent must infer structure
await agent.chat("Create a note about the meeting");
// Agent: writes to... notes/? documents/? what format?
// With domain tool - vocabulary is anchored
tool("create_note", {
title: z.string(),
content: z.string(),
tags: z.array(z.string()).optional(),
}, async ({ title, content, tags }) => {
// Tool enforces structure, agent understands "note"
});
```
### Guardrails
**Add a domain tool when:** Some operations need validation or constraints that shouldn't be left to agent judgment.
```typescript
// publish_to_feed might enforce format requirements or content policies
tool("publish_to_feed", {
bookId: z.string(),
content: z.string(),
headline: z.string().max(100), // Enforce headline length
}, async ({ bookId, content, headline }) => {
// Validate content meets guidelines
if (containsProhibitedContent(content)) {
return { text: "Content doesn't meet guidelines", isError: true };
}
// Enforce proper structure
await feedService.publish({ bookId, content, headline, publishedAt: new Date() });
});
```
### Efficiency
**Add a domain tool when:** Common operations would take many primitive calls.
```typescript
// Primitive approach: multiple calls
await agent.chat("Get book details");
// Agent: read library.json, parse, find book, read full_text.txt, read introduction.md...
// Domain tool: one call for common operation
tool("get_book_with_content", { bookId: z.string() }, async ({ bookId }) => {
const book = await library.getBook(bookId);
const fullText = await readFile(`Research/${bookId}/full_text.txt`);
const intro = await readFile(`Research/${bookId}/introduction.md`);
return { text: JSON.stringify({ book, fullText, intro }) };
});
```
## The Rule for Domain Tools
**Domain tools should represent one conceptual action from the user's perspective.**
They can include mechanical validation, but **judgment about what to do or whether to do it belongs in the prompt**.
### Wrong: Bundles Judgment
```typescript
// WRONG - analyze_and_publish bundles judgment into the tool
tool("analyze_and_publish", async ({ input }) => {
const analysis = analyzeContent(input); // Tool decides how to analyze
const shouldPublish = analysis.score > 0.7; // Tool decides whether to publish
if (shouldPublish) {
await publish(analysis.summary); // Tool decides what to publish
}
});
```
### Right: One Action, Agent Decides
```typescript
// RIGHT - separate tools, agent decides
tool("analyze_content", { content: z.string() }, ...); // Returns analysis
tool("publish", { content: z.string() }, ...); // Publishes what agent provides
// Prompt: "Analyze the content. If it's high quality, publish a summary."
// Agent decides what "high quality" means and what summary to write.
```
### The Test
Ask: "Who is making the decision here?"
- If the answer is "the tool code" → you've encoded judgment, refactor
- If the answer is "the agent based on the prompt" → good
## Keep Primitives Available
**Domain tools are shortcuts, not gates.**
Unless there's a specific reason to restrict access (security, data integrity), the agent should still be able to use underlying primitives for edge cases.
```typescript
// Domain tool for common case
tool("create_note", { title, content }, ...);
// But primitives still available for edge cases
tool("read_file", { path }, ...);
tool("write_file", { path, content }, ...);
// Agent can use create_note normally, but for weird edge case:
// "Create a note in a non-standard location with custom metadata"
// → Agent uses write_file directly
```
### When to Gate
Gating (making domain tool the only way) is appropriate for:
- **Security:** User authentication, payment processing
- **Data integrity:** Operations that must maintain invariants
- **Audit requirements:** Actions that must be logged in specific ways
**The default is open.** When you do gate something, make it a conscious decision with a clear reason.
## Graduating to Code
Some operations will need to move from agent-orchestrated to optimized code for performance or reliability.
### The Progression
```
Stage 1: Agent uses primitives in a loop
→ Flexible, proves the concept
→ Slow, potentially expensive
Stage 2: Add domain tools for common operations
→ Faster, still agent-orchestrated
→ Agent still decides when/whether to use
Stage 3: For hot paths, implement in optimized code
→ Fast, deterministic
→ Agent can still trigger, but execution is code
```
### Example Progression
**Stage 1: Pure primitives**
```markdown
Prompt: "When user asks for a summary, read all notes in /notes,
analyze them, and write a summary to /summaries/{date}.md"
Agent: Calls read_file 20 times, reasons about content, writes summary
Time: 30 seconds, 50k tokens
```
**Stage 2: Domain tool**
```typescript
tool("get_all_notes", {}, async () => {
const notes = await readAllNotesFromDirectory();
return { text: JSON.stringify(notes) };
});
// Agent still decides how to summarize, but retrieval is faster
// Time: 10 seconds, 30k tokens
```
**Stage 3: Optimized code**
```typescript
tool("generate_weekly_summary", {}, async () => {
// Entire operation in code for hot path
const notes = await getNotes({ since: oneWeekAgo });
const summary = await generateSummary(notes); // Could use cheaper model
await writeSummary(summary);
return { text: "Summary generated" };
});
// Agent just triggers it
// Time: 2 seconds, 5k tokens
```
### The Caveat
**Even when an operation graduates to code, the agent should be able to:**
1. Trigger the optimized operation itself
2. Fall back to primitives for edge cases the optimized path doesn't handle
Graduation is about efficiency. **Parity still holds.** The agent doesn't lose capability when you optimize.
## Decision Framework
### Should I Add a Domain Tool?
| Question | If Yes |
|----------|--------|
| Is the agent confused about what this concept means? | Add for vocabulary anchoring |
| Does this operation need validation the agent shouldn't decide? | Add with guardrails |
| Is this a common multi-step operation? | Add for efficiency |
| Would changing behavior require code changes? | Keep as prompt instead |
### Should I Graduate to Code?
| Question | If Yes |
|----------|--------|
| Is this operation called very frequently? | Consider graduating |
| Does latency matter significantly? | Consider graduating |
| Are token costs problematic? | Consider graduating |
| Do you need deterministic behavior? | Graduate to code |
| Does the operation need complex state management? | Graduate to code |
### Should I Gate Access?
| Question | If Yes |
|----------|--------|
| Is there a security requirement? | Gate appropriately |
| Must this operation maintain data integrity? | Gate appropriately |
| Is there an audit/compliance requirement? | Gate appropriately |
| Is it just "safer" with no specific risk? | Keep primitives available |
## Examples
### Feedback Processing Evolution
**Stage 1: Primitives only**
```typescript
tools: [read_file, write_file, bash]
prompt: "Store feedback in data/feedback.json, notify if important"
// Agent figures out JSON structure, importance criteria, notification method
```
**Stage 2: Domain tools for vocabulary**
```typescript
tools: [
store_feedback, // Anchors "feedback" concept with proper structure
send_notification, // Anchors "notify" with correct channels
read_file, // Still available for edge cases
write_file,
]
prompt: "Store feedback using store_feedback. Notify if importance >= 4."
// Agent still decides importance, but vocabulary is anchored
```
**Stage 3: Graduated hot path**
```typescript
tools: [
process_feedback_batch, // Optimized for high-volume processing
store_feedback, // For individual items
send_notification,
read_file,
write_file,
]
// Batch processing is code, but agent can still use store_feedback for special cases
```
### When NOT to Add Domain Tools
**Don't add a domain tool just to make things "cleaner":**
```typescript
// Unnecessary - agent can compose primitives
tool("organize_files_by_date", ...) // Just use move_file + judgment
// Unnecessary - puts decision in wrong place
tool("decide_file_importance", ...) // This is prompt territory
```
**Don't add a domain tool if behavior might change:**
```typescript
// Bad - locked into code
tool("generate_standard_report", ...) // What if report format evolves?
// Better - keep in prompt
prompt: "Generate a report covering X, Y, Z. Format for readability."
// Can adjust format by editing prompt
```
## Checklist: Primitives to Domain Tools
### Starting Out
- [ ] Begin with pure primitives (read, write, list, bash)
- [ ] Write behavior in prompts, not tool logic
- [ ] Let patterns emerge from actual usage
### Adding Domain Tools
- [ ] Clear reason: vocabulary anchoring, guardrails, or efficiency
- [ ] Tool represents one conceptual action
- [ ] Judgment stays in prompts, not tool code
- [ ] Primitives remain available alongside domain tools
### Graduating to Code
- [ ] Hot path identified (frequent, latency-sensitive, or expensive)
- [ ] Optimized version doesn't remove agent capability
- [ ] Fallback to primitives for edge cases still works
### Gating Decisions
- [ ] Specific reason for each gate (security, integrity, audit)
- [ ] Default is open access
- [ ] Gates are conscious decisions, not defaults