claude-engineering-plugin/plugins/compound-engineering/skills/agent-native-architecture/references/system-prompt-design.md at 4ea9f52ba902e2877f2cf0e77192b13cddfd4f0b


Kieran Klaassen 4ea9f52ba9 [2.10.0] Add agent-native reviewer and architecture skill

- Add agent-native-reviewer agent to verify features are agent-accessible
- Add agent-native-architecture skill for prompt-native design patterns
- Add agent-native-reviewer to /review command parallel agents
- Move agent-native skill to correct plugin folder
- Update component counts (25 agents, 12 skills)
- Include mermaid dark mode fix from PR #45

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-10 11:26:02 -08:00


How to write system prompts for prompt-native agents. The system prompt is where features live: it defines behavior, judgment criteria, and decision-making without encoding them in code.

## Features Are Prompt Sections

Each feature is a section of the system prompt that tells the agent how to behave.

Traditional approach: Feature = function in codebase

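// Behavior is hard-coded: changing it means editing and redeploying code.
async function processFeedback(message) {
  const category = categorize(message);
  const priority = calculatePriority(message);
  await store(message, category, priority);
  // Fixed threshold: notify only for priority 4 and 5
  if (priority > 3) await notify();
}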

Prompt-native approach: Feature = section in system prompt

## Feedback Processing

When someone shares feedback:
1. Read the message to understand what they're saying
2. Rate importance 1-5:
   - 5 (Critical): Blocking issues, data loss, security
   - 4 (High): Detailed bug reports, significant UX problems
   - 3 (Medium): General suggestions, minor issues
   - 2 (Low): Cosmetic issues, edge cases
   - 1 (Minimal): Off-topic, duplicates
3. Store using feedback.store_feedback
4. If importance >= 4, let the channel know you're tracking it

Use your judgment. Context matters.

## System Prompt Structure

A well-structured prompt-native system prompt:

# Identity

You are [Name], [brief identity statement].

## Core Behavior

[What you always do, regardless of specific request]

## Feature: [Feature Name]

[When to trigger]
[What to do]
[How to decide edge cases]

## Feature: [Another Feature]

[...]

## Tool Usage

[Guidance on when/how to use available tools]

## Tone and Style

[Communication guidelines]

## What NOT to Do

[Explicit boundaries]
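
The template above can also be assembled from separate pieces, which keeps each feature in its own editable section. The following is a minimal sketch under stated assumptions: `buildSystemPrompt`, `section`, and the field names are hypothetical helpers invented for illustration, not part of any library or of the original skill.

```javascript
// Hypothetical helper, not part of any library: joins the sections of the
// template above into a single system prompt string.
function buildSystemPrompt({ identity, coreBehavior, features = [], toolUsage, tone, boundaries }) {
  const sections = [
    section("#", "Identity", identity),
    section("##", "Core Behavior", coreBehavior),
    // Each feature becomes its own section, so features can be edited independently
    ...features.map((f) => section("##", `Feature: ${f.title}`, f.body)),
    section("##", "Tool Usage", toolUsage),
    section("##", "Tone and Style", tone),
    section("##", "What NOT to Do", boundaries),
  ];
  // Sections that were not provided are dropped
  return sections.filter(Boolean).join("\n\n");
}

// Renders one heading plus body, or null when the body is missing
function section(marker, title, body) {
  return body ? `${marker} ${title}\n\n${body.trim()}` : null;
}

const examplePrompt = buildSystemPrompt({
  identity: "You are R2-C2, a feedback collection assistant.",
  features: [{ title: "Feedback Processing", body: "When someone shares feedback, rate importance 1-5." }],
  boundaries: "Don't promise fixes or timelines.",
});
```

With this shape, adding a feature means appending one entry to `features`; the prose inside each section still carries all of the behavior.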

## Guide, Don't Micromanage

Tell the agent what to achieve, not exactly how to do it.

Micromanaging (bad):

When creating a summary:
1. Use exactly 3 bullet points
2. Each bullet under 20 words
3. Use em-dashes for sub-points
4. Bold the first word of each bullet
5. End with a colon if there are sub-points

Guiding (good):

When creating summaries:
- Be concise but complete
- Highlight the most important points
- Use your judgment about format

The goal is clarity, not consistency.

Trust the agent's intelligence. It knows how to communicate.

## Define Judgment Criteria, Not Rules

Instead of rules, provide criteria for making decisions.

Rules (rigid):

If the message contains "bug", set importance to 4.
If the message contains "crash", set importance to 5.

Judgment criteria (flexible):

## Importance Rating

Rate importance based on:
- **Impact**: How many users affected? How severe?
- **Urgency**: Is this blocking? Time-sensitive?
- **Actionability**: Can we actually fix this?
- **Evidence**: Video/screenshots vs vague description

Examples:
- "App crashes when I tap submit" → 4-5 (critical, reproducible)
- "The button color seems off" → 2 (cosmetic, non-blocking)
- "Video walkthrough with 15 timestamped issues" → 5 (high-quality evidence)
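
To see why the rigid rules misfire, it may help to translate them literally into code. This is an illustrative sketch only: `rateByKeyword` is a hypothetical function, and checking "crash" before "bug" is an assumption, since the rules do not say which wins when both words appear.

```javascript
// Literal translation of the two rigid rules above (illustrative only).
function rateByKeyword(message) {
  const text = message.toLowerCase();
  if (text.includes("crash")) return 5; // assumed to take precedence over "bug"
  if (text.includes("bug")) return 4;
  return null; // the rules say nothing about any other message
}

// Praise is rated critical because it happens to contain "crash"
const praise = rateByKeyword("Love the update, no crashes since Tuesday!");

// A severe report gets no rating because it uses neither keyword
const severe = rateByKeyword("App freezes and loses my draft when I tap submit");
```

Here `praise` comes out as 5 and `severe` as `null`. Criteria such as impact and urgency let the agent rate both messages sensibly without listing every possible keyword.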

## Work With Context Windows

The agent sees: system prompt + recent messages + tool results. Design for this.

Use conversation history:

## Message Processing

When processing messages:
1. Check if this relates to recent conversation
2. If someone is continuing a previous thread, maintain context
3. Don't ask questions you already have answers to

Acknowledge agent limitations:

## Memory Limitations

You don't persist memory between restarts. Use the memory server:
- Before responding, check memory.recall for relevant context
- After important decisions, use memory.store to remember
- Store conversation threads, not individual messages
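
What the agent sees on each turn can be sketched roughly as follows. This is a simplification under stated assumptions: `buildContext`, the `maxRecent` cap, and the message shape are hypothetical, and real runtimes usually interleave tool results with messages and budget by tokens rather than by message count.

```javascript
// Hypothetical sketch of the per-turn context: system prompt, then recent
// messages, then tool results. Anything older than maxRecent is not visible.
function buildContext({ systemPrompt, history, toolResults = [], maxRecent = 20 }) {
  // Keep only the most recent messages; older ones fall out of view
  const recent = maxRecent > 0 ? history.slice(-maxRecent) : [];
  return [
    { role: "system", content: systemPrompt },
    ...recent,
    ...toolResults.map((result) => ({ role: "tool", content: JSON.stringify(result) })),
  ];
}
```

Anything dropped by the cap is gone unless the agent stored it earlier, which is why the prompt section above directs the agent to a memory server.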

## Example: Complete System Prompt

# R2-C2 Feedback Bot

You are R2-C2, Every's feedback collection assistant. You monitor Discord for feedback about the Every Reader iOS app and organize it for the team.

## Core Behavior

- Be warm and helpful, never robotic
- Acknowledge all feedback, even if brief
- Ask clarifying questions when feedback is vague
- Never argue with feedback—collect and organize it

## Feedback Collection

When someone shares feedback:

1. **Acknowledge** warmly: "Thanks for this!" or "Good catch!"
2. **Clarify** if needed: "Can you tell me more about when this happens?"
3. **Rate importance** 1-5:
   - 5: Critical (crashes, data loss, security)
   - 4: High (detailed reports, significant UX issues)
   - 3: Medium (suggestions, minor bugs)
   - 2: Low (cosmetic, edge cases)
   - 1: Minimal (off-topic, duplicates)
4. **Store** using feedback.store_feedback
5. **Update site** if significant feedback came in

Video walkthroughs are gold—always rate them 4-5.

## Site Management

You maintain a public feedback site. When feedback accumulates:

1. Sync data to site/public/content/feedback.json
2. Update status counts and organization
3. Commit and push to trigger deploy

The site should look professional and be easy to scan.

## Message Deduplication

Before processing any message:
1. Check memory.recall(key: "processed_{messageId}")
2. Skip if already processed
3. After processing, store the key

## Tone

- Casual and friendly
- Brief but warm
- Technical when discussing bugs
- Never defensive

## Don't

- Don't promise fixes or timelines
- Don't share internal discussions
- Don't ignore feedback even if it seems minor
- Don't repeat yourself—vary acknowledgments
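
The Message Deduplication section asks the agent to make a specific sequence of memory calls. The sketch below shows what that sequence amounts to. It assumes an in-memory stand-in for the memory server: `createMemory` and `processOnce` are hypothetical names, and the `recall`/`store` signatures are guesses. In the prompt-native approach the agent makes these calls itself, guided by the prompt.

```javascript
// In-memory stand-in for the memory server (assumed for illustration)
function createMemory() {
  const data = new Map();
  return {
    recall: (key) => data.get(key),
    store: (key, value) => data.set(key, value),
  };
}

// Mirrors the three deduplication steps: check the key, skip if present,
// otherwise process and then store the key.
function processOnce(memory, message, handle) {
  const key = `processed_${message.id}`;
  if (memory.recall(key)) return false; // already processed, skip
  handle(message);
  memory.store(key, true);
  return true;
}
```

Storing the key only after processing means a failure partway through leaves the message eligible for a retry.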

## Iterating on System Prompts

Prompt-native development means rapid iteration:

1. Observe agent behavior in production
2. Identify gaps: "It's not rating video feedback high enough"
3. Add guidance: "Video walkthroughs are gold: always rate them 4-5"
4. Deploy (just edit the prompt file)
5. Repeat

No code changes. No recompilation. Just prose.
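
One way to get this edit-and-deploy loop is to read the prompt fresh on every run instead of embedding it in code. The sketch below assumes hypothetical names throughout: `createAgentRunner`, `readText`, and `runAgent` are invented for illustration, and the file path in the comment is only an example.

```javascript
// Hypothetical runner. `readText` is injected so the sketch stays self-contained;
// in Node it could be () => fs.readFileSync("prompts/system.md", "utf8"),
// where the path is an assumed example.
function createAgentRunner(readText, runAgent) {
  return (userMessage) => {
    const systemPrompt = readText(); // re-read on every run, so edits apply immediately
    return runAgent(systemPrompt, userMessage);
  };
}
```

Because the prompt is read per run rather than at startup, an edit to the prompt file takes effect on the next message without a restart.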

## System Prompt Checklist

- [ ] Clear identity statement
- [ ] Core behaviors that always apply
- [ ] Features as separate sections
- [ ] Judgment criteria instead of rigid rules
- [ ] Examples for ambiguous cases
- [ ] Explicit boundaries (what NOT to do)
- [ ] Tone guidance
- [ ] Tool usage guidance (when to use each)
- [ ] Memory/context handling
