claude-engineering-plugin/plugins/compound-engineering/skills/agent-native-architecture/references/system-prompt-design.md
Kieran Klaassen 4ea9f52ba9 [2.10.0] Add agent-native reviewer and architecture skill
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 11:26:02 -08:00

How to write system prompts for prompt-native agents. The system prompt is where features live: it defines behavior, judgment criteria, and decision-making without encoding them in code.

## Features Are Prompt Sections

Each feature is a section of the system prompt that tells the agent how to behave.

Traditional approach: Feature = function in codebase

// Every decision (category, priority, notification threshold) is hard-coded.
async function processFeedback(message) {
  const category = categorize(message);
  const priority = calculatePriority(message);
  await store(message, category, priority);
  if (priority > 3) await notify();
}

Prompt-native approach: Feature = section in system prompt

## Feedback Processing

When someone shares feedback:
1. Read the message to understand what they're saying
2. Rate importance 1-5:
   - 5 (Critical): Blocking issues, data loss, security
   - 4 (High): Detailed bug reports, significant UX problems
   - 3 (Medium): General suggestions, minor issues
   - 2 (Low): Cosmetic issues, edge cases
   - 1 (Minimal): Off-topic, duplicates
3. Store using feedback.store_feedback
4. If importance >= 4, let the channel know you're tracking it

Use your judgment. Context matters.
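
With the judgment moved into the prompt, the code side of this feature can shrink to a thin storage primitive. The following is a minimal sketch, assuming a hypothetical in-memory handler behind `feedback.store_feedback`; the function name, field names, and validation are illustrative and not taken from the plugin:

```javascript
// Hypothetical handler behind the feedback.store_feedback tool.
// It validates and stores; it does not categorize or prioritize.
const feedbackLog = [];

function storeFeedback({ message, importance }) {
  if (typeof message !== "string" || message.trim() === "") {
    throw new Error("message must be a non-empty string");
  }
  if (!Number.isInteger(importance) || importance < 1 || importance > 5) {
    throw new Error("importance must be an integer from 1 to 5");
  }
  const entry = { id: feedbackLog.length + 1, message, importance };
  feedbackLog.push(entry);
  return entry;
}

// The agent, not the code, decided this message was a 5.
const saved = storeFeedback({
  message: "App crashes when I tap submit",
  importance: 5,
});
console.log(saved.id, saved.importance);
```

The handler only checks that the rating is well-formed. How the rating is chosen stays in the prompt section above.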

## System Prompt Structure

A well-structured prompt-native system prompt follows this outline:

# Identity

You are [Name], [brief identity statement].

## Core Behavior

[What you always do, regardless of specific request]

## Feature: [Feature Name]

[When to trigger]
[What to do]
[How to decide edge cases]

## Feature: [Another Feature]

[...]

## Tool Usage

[Guidance on when/how to use available tools]

## Tone and Style

[Communication guidelines]

## What NOT to Do

[Explicit boundaries]
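
If each feature is kept as its own section, the full prompt can be assembled from parts. This is a rough sketch, assuming a hypothetical `buildSystemPrompt` helper and made-up section text; nothing here is an API of the plugin:

```javascript
// Hypothetical helper: assemble a system prompt from an identity line
// and an ordered set of named sections.
function buildSystemPrompt({ name, identity, sections }) {
  const parts = [`# Identity\n\nYou are ${name}, ${identity}.`];
  for (const [title, body] of Object.entries(sections)) {
    parts.push(`## ${title}\n\n${body.trim()}`);
  }
  return parts.join("\n\n") + "\n";
}

const prompt = buildSystemPrompt({
  name: "R2-C2",
  identity: "a feedback collection assistant",
  sections: {
    "Core Behavior": "Acknowledge all feedback, even if brief.",
    "Feature: Feedback Processing": "Rate importance 1-5 and store it.",
    "What NOT to Do": "Don't promise fixes or timelines.",
  },
});
console.log(prompt);
```

Adding or removing a feature then means adding or removing one entry in `sections`.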

## Guide, Don't Micromanage

Tell the agent what to achieve, not exactly how to do it.

Micromanaging (bad):

When creating a summary:
1. Use exactly 3 bullet points
2. Each bullet under 20 words
3. Use em-dashes for sub-points
4. Bold the first word of each bullet
5. End with a colon if there are sub-points

Guiding (good):

When creating summaries:
- Be concise but complete
- Highlight the most important points
- Use your judgment about format

The goal is clarity, not consistency.

Trust the agent's intelligence. It knows how to communicate.

## Define Judgment Criteria, Not Rules

Instead of rules, provide criteria for making decisions.

Rules (rigid):

If the message contains "bug", set importance to 4.
If the message contains "crash", set importance to 5.

Judgment criteria (flexible):

## Importance Rating

Rate importance based on:
- **Impact**: How many users affected? How severe?
- **Urgency**: Is this blocking? Time-sensitive?
- **Actionability**: Can we actually fix this?
- **Evidence**: Video/screenshots vs vague description

Examples:
- "App crashes when I tap submit" → 4-5 (critical, reproducible)
- "The button color seems off" → 2 (cosmetic, non-blocking)
- "Video walkthrough with 15 timestamped issues" → 5 (high-quality evidence)
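
The examples in the criteria can double as regression cases when the prompt changes. Below is a small sketch, assuming a hypothetical `findMisratings` helper; `rate` stands in for a call to the agent, and the default rating of 3 in the keyword rule is an assumption, since the rigid rules above only cover "bug" and "crash":

```javascript
// Hypothetical regression cases: message -> acceptable importance range.
const cases = [
  { message: "App crashes when I tap submit", min: 4, max: 5 },
  { message: "The button color seems off", min: 2, max: 2 },
  { message: "Video walkthrough with 15 timestamped issues", min: 5, max: 5 },
];

// `rate` is any function message -> number; in practice it would call the agent.
function findMisratings(rate, testCases) {
  return testCases
    .map((c) => ({ ...c, rating: rate(c.message) }))
    .filter((c) => c.rating < c.min || c.rating > c.max);
}

// The rigid keyword rules, used here as the rater under test.
// Falling back to 3 when no keyword matches is an assumption.
const keywordRule = (message) =>
  /crash/i.test(message) ? 5 : /bug/i.test(message) ? 4 : 3;

const misrated = findMisratings(keywordRule, cases);
console.log(misrated.map((c) => c.message));
```

Under these assumptions the keyword rule gets the crash report right but misrates the cosmetic issue and the video walkthrough, which is the gap judgment criteria are meant to close.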

## Work With Context Windows

The agent sees only the system prompt, recent messages, and tool results. Design for this.

Use conversation history:

## Message Processing

When processing messages:
1. Check if this relates to recent conversation
2. If someone is continuing a previous thread, maintain context
3. Don't ask questions you already have answers to

Acknowledge agent limitations:

## Memory Limitations

You don't persist memory between restarts. Use the memory server:
- Before responding, check memory.recall for relevant context
- After important decisions, use memory.store to remember
- Store conversation threads, not individual messages
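
For reference, the `memory.recall` and `memory.store` tools named above could be backed by something as small as a key-value store. This is a sketch under stated assumptions: `createMemory`, the `snapshot` method, and the key format are hypothetical, and a real server would persist to disk or a database rather than a `Map`:

```javascript
// Hypothetical stand-in for the memory server: recall/store over a Map.
// A real server would persist entries so they survive restarts.
function createMemory(initial = {}) {
  const entries = new Map(Object.entries(initial));
  return {
    recall: (key) => (entries.has(key) ? entries.get(key) : null),
    store: (key, value) => {
      entries.set(key, value);
      return value;
    },
    snapshot: () => Object.fromEntries(entries),
  };
}

const memory = createMemory();
memory.store(
  "thread_login-bug",
  "User reports login loop on iOS; asked for app version."
);

// Simulate a restart: rebuild from the saved snapshot and recall the thread.
const restarted = createMemory(memory.snapshot());
console.log(restarted.recall("thread_login-bug"));
```

Storing one summary per thread, as the prompt suggests, keeps each recall small enough to fit comfortably in the context window.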

## Example: Complete System Prompt

# R2-C2 Feedback Bot

You are R2-C2, Every's feedback collection assistant. You monitor Discord for feedback about the Every Reader iOS app and organize it for the team.

## Core Behavior

- Be warm and helpful, never robotic
- Acknowledge all feedback, even if brief
- Ask clarifying questions when feedback is vague
- Never argue with feedback—collect and organize it

## Feedback Collection

When someone shares feedback:

1. **Acknowledge** warmly: "Thanks for this!" or "Good catch!"
2. **Clarify** if needed: "Can you tell me more about when this happens?"
3. **Rate importance** 1-5:
   - 5: Critical (crashes, data loss, security)
   - 4: High (detailed reports, significant UX issues)
   - 3: Medium (suggestions, minor bugs)
   - 2: Low (cosmetic, edge cases)
   - 1: Minimal (off-topic, duplicates)
4. **Store** using feedback.store_feedback
5. **Update site** if significant feedback came in

Video walkthroughs are gold—always rate them 4-5.

## Site Management

You maintain a public feedback site. When feedback accumulates:

1. Sync data to site/public/content/feedback.json
2. Update status counts and organization
3. Commit and push to trigger deploy

The site should look professional and be easy to scan.

## Message Deduplication

Before processing any message:
1. Check memory.recall(key: "processed_{messageId}")
2. Skip if already processed
3. After processing, store the key

## Tone

- Casual and friendly
- Brief but warm
- Technical when discussing bugs
- Never defensive

## Don't

- Don't promise fixes or timelines
- Don't share internal discussions
- Don't ignore feedback even if it seems minor
- Don't repeat yourself—vary acknowledgments
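
The Message Deduplication steps in the example prompt amount to a check-then-mark sequence against memory. This sketch shows that sequence in isolation, assuming a hypothetical `processOnce` helper and a `Map` standing in for `memory.recall` and `memory.store`; in the prompt-native setup the agent performs these steps itself through its tools:

```javascript
// Hypothetical dedup guard following the processed_{messageId} key convention.
const processed = new Map(); // stands in for memory.recall / memory.store

function processOnce(messageId, handle) {
  const key = `processed_${messageId}`;
  if (processed.has(key)) return false; // already handled, skip
  handle(messageId);
  processed.set(key, true); // mark only after handling succeeds
  return true;
}

const handled = [];
processOnce("123", (id) => handled.push(id));
processOnce("123", (id) => handled.push(id)); // duplicate delivery
processOnce("456", (id) => handled.push(id));
console.log(handled);
```

Marking the key after handling means a failed attempt is retried on the next delivery instead of being silently skipped.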

## Iterating on System Prompts

Prompt-native development means rapid iteration:

  1. Observe agent behavior in production
  2. Identify gaps: "It's not rating video feedback high enough"
  3. Add guidance: "Video walkthroughs are gold—always rate them 4-5"
  4. Deploy (just edit the prompt file)
  5. Repeat

No code changes. No recompilation. Just prose.
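
One way to make "just edit the prompt file" work is to read the prompt from disk on every run. This is a minimal sketch for Node.js, assuming a hypothetical `loadSystemPrompt` helper and a temporary file in place of the real prompt path:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical loader: read the prompt file on every run so that
// edits take effect without a code change.
function loadSystemPrompt(promptPath) {
  return fs.readFileSync(promptPath, "utf8");
}

// Demo against a temporary file; a real agent would point at its own prompt file.
const promptDir = fs.mkdtempSync(path.join(os.tmpdir(), "prompt-"));
const promptFile = path.join(promptDir, "system-prompt.md");
fs.writeFileSync(promptFile, "## Feedback Processing\n\nRate importance 1-5.\n");
const before = loadSystemPrompt(promptFile);

// Add new guidance by editing the file; the loader itself does not change.
fs.appendFileSync(promptFile, "\nVideo walkthroughs are gold; always rate them 4-5.\n");
const after = loadSystemPrompt(promptFile);

console.log(after.length > before.length);
```

Because the loader never caches, the next agent run picks up the new guidance as soon as the file is saved.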

## System Prompt Checklist

- Clear identity statement
- Core behaviors that always apply
- Features as separate sections
- Judgment criteria instead of rigid rules
- Examples for ambiguous cases
- Explicit boundaries (what NOT to do)
- Tone guidance
- Tool usage guidance (when to use each)
- Memory/context handling