claude-engineering-plugin/plugins/compound-engineering/skills/dspy-python/SKILL.md
John Lamb fedf2ff8e4 · rewrite ruby to python · 2026-01-26 14:39:43 -06:00

---
name: dspy-python
description: This skill should be used when working with DSPy, the Python framework for programming language models instead of prompting them. Use this when implementing LLM-powered features, creating DSPy signatures and modules, configuring language model providers (OpenAI, Anthropic, Gemini, Ollama), building agent systems with tools, optimizing prompts with teleprompters, integrating with FastAPI endpoints, or testing DSPy modules with pytest.
---

DSPy Expert (Python)

Overview

DSPy is a Python framework that enables developers to program language models, not prompt them. Instead of manually crafting prompts, define application requirements through composable, optimizable modules that can be tested, improved, and version-controlled like regular code.

This skill provides comprehensive guidance on:

  • Creating signatures for LLM operations
  • Building composable modules and workflows
  • Configuring multiple LLM providers
  • Implementing agents with tools (ReAct)
  • Testing with pytest
  • Optimizing with teleprompters (MIPROv2, BootstrapFewShot)
  • Integrating with FastAPI for production APIs
  • Production deployment patterns

Core Capabilities

1. Signatures

Create input/output specifications for LLM operations using inline or class-based signatures.

When to use: Defining any LLM task, from simple classification to complex analysis.

Quick reference:

import dspy

# Inline signature (simple tasks)
classify = dspy.Predict("email: str -> category: str, priority: str")

# Class-based signature (complex tasks with documentation)
class EmailClassification(dspy.Signature):
    """Classify customer support emails into categories."""

    email_subject: str = dspy.InputField(desc="Subject line of the email")
    email_body: str = dspy.InputField(desc="Full body content of the email")
    category: str = dspy.OutputField(desc="One of: Technical, Billing, General")
    priority: str = dspy.OutputField(desc="One of: Low, Medium, High")

Templates: See signature-template.py for comprehensive examples including:

  • Inline signatures for quick tasks
  • Class-based signatures with type hints
  • Signatures with Pydantic model outputs
  • Multi-field complex signatures

Best practices:

  • Always provide clear docstrings for class-based signatures
  • Use desc parameter for field documentation
  • Prefer specific descriptions over generic ones
  • Use Pydantic models for structured complex outputs

Full documentation: See core-concepts.md sections on Signatures and Type Safety.

2. Modules

Build reusable, composable modules that encapsulate LLM operations.

When to use: Implementing any LLM-powered feature, especially complex multi-step workflows.

Quick reference:

import dspy

class EmailProcessor(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier = dspy.ChainOfThought(EmailClassification)

    def forward(self, email_subject: str, email_body: str) -> dspy.Prediction:
        return self.classifier(
            email_subject=email_subject,
            email_body=email_body
        )

Templates: See module-template.py for comprehensive examples including:

  • Basic modules with single predictors
  • Multi-step pipelines that chain modules
  • Modules with conditional logic
  • Error handling and retry patterns
  • Async modules for FastAPI
  • Caching implementations

Module composition: Chain modules together to create complex workflows:

class Pipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.step1 = Classifier()
        self.step2 = Analyzer()
        self.step3 = Responder()

    def forward(self, input_text):
        result1 = self.step1(text=input_text)
        result2 = self.step2(classification=result1.category)
        return self.step3(analysis=result2.analysis)

Full documentation: See core-concepts.md sections on Modules and Module Composition.

3. Predictor Types

Choose the right predictor for your task:

Predict: Basic LLM inference

predictor = dspy.Predict(TaskSignature)
result = predictor(input="data")

ChainOfThought: Adds automatic step-by-step reasoning

predictor = dspy.ChainOfThought(TaskSignature)
result = predictor(input="data")
# result.reasoning contains the thought process

ReAct: Tool-using agents with iterative reasoning

predictor = dspy.ReAct(
    TaskSignature,
    tools=[search_tool, calculator_tool],
    max_iters=5
)

ProgramOfThought: Generates and executes Python code

predictor = dspy.ProgramOfThought(TaskSignature)
result = predictor(task="Calculate factorial of 10")

When to use each:

  • Predict: Simple tasks, classification, extraction
  • ChainOfThought: Complex reasoning, analysis, multi-step thinking
  • ReAct: Tasks requiring external tools (search, calculation, API calls)
  • ProgramOfThought: Tasks best solved with generated code

Full documentation: See core-concepts.md section on Predictors.

4. LLM Provider Configuration

DSPy supports OpenAI, Anthropic Claude, Google Gemini, Ollama, and many other providers via LiteLLM.

Quick configuration examples:

import dspy
import os

# OpenAI
lm = dspy.LM('openai/gpt-4o-mini', api_key=os.environ['OPENAI_API_KEY'])
dspy.configure(lm=lm)

# Anthropic Claude
lm = dspy.LM('anthropic/claude-3-5-sonnet-20241022', api_key=os.environ['ANTHROPIC_API_KEY'])
dspy.configure(lm=lm)

# Google Gemini
lm = dspy.LM('gemini/gemini-1.5-pro', api_key=os.environ['GOOGLE_API_KEY'])
dspy.configure(lm=lm)

# Local Ollama (free, private)
lm = dspy.LM('ollama_chat/llama3.1', api_base='http://localhost:11434')
dspy.configure(lm=lm)

Templates: See config-template.py for comprehensive examples including:

  • Environment-based configuration
  • Multi-model setups for different tasks
  • Async LM configuration
  • Retry logic and fallback strategies
  • Caching with dspy.cache
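
The fallback idea from the list above can be sketched provider-agnostically: wrap a primary and a backup callable (stand-ins for modules bound to two different configured LMs) so a provider failure falls through to the backup. All names here are hypothetical:

```python
# Wrap two callables so a failure of the first transparently retries the second.
def with_fallback(primary, backup):
    def call(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception:
            return backup(*args, **kwargs)
    return call

def flaky(prompt):
    # Simulates a provider error such as a rate limit
    raise RuntimeError("rate limited")

def stable(prompt):
    return f"answer: {prompt}"

robust = with_fallback(flaky, stable)
```

In practice the two callables would be the same module invoked under different LMs (e.g. via `dspy.context`); see config-template.py for the full pattern.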

Provider compatibility matrix:

Feature             OpenAI   Anthropic   Google   Ollama
Structured Output   Full     Full        Full     Partial
Vision (Images)     Full     Full        Full     Limited
Tool Calling        Full     Full        Full     Varies
Streaming           Full     Full        Full     Full

Cost optimization strategy:

  • Development: Ollama (free) or gpt-4o-mini (cheap)
  • Testing: gpt-4o-mini with temperature=0.0
  • Production simple tasks: gpt-4o-mini, claude-3-haiku, gemini-1.5-flash
  • Production complex tasks: gpt-4o, claude-3-5-sonnet, gemini-1.5-pro
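
The tiering above can be expressed as a simple lookup, so the model choice lives in one place instead of being scattered through the codebase. The model ids are copied from the list; the helper itself is a hypothetical sketch:

```python
# Map (environment, task complexity) to a model id; pass the result to dspy.LM.
MODEL_TIERS = {
    ("development", "simple"): "ollama_chat/llama3.1",
    ("development", "complex"): "openai/gpt-4o-mini",
    ("production", "simple"): "openai/gpt-4o-mini",
    ("production", "complex"): "openai/gpt-4o",
}

def pick_model(env: str, complexity: str) -> str:
    """Return the model id for an environment/complexity pair."""
    return MODEL_TIERS[(env, complexity)]
```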

Full documentation: See providers.md for all configuration options.

5. FastAPI Integration

Serve DSPy modules as production API endpoints.

Quick reference:

from fastapi import FastAPI
from pydantic import BaseModel
import dspy

app = FastAPI()

# Initialize DSPy
lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

# Load optimized module
classifier = EmailProcessor()

class EmailRequest(BaseModel):
    subject: str
    body: str

class EmailResponse(BaseModel):
    category: str
    priority: str

@app.post("/classify", response_model=EmailResponse)
async def classify_email(request: EmailRequest):
    result = classifier(
        email_subject=request.subject,
        email_body=request.body
    )
    return EmailResponse(
        category=result.category,
        priority=result.priority
    )

Production patterns:

  • Load optimized modules at startup
  • Use Pydantic models for request/response validation
  • Implement proper error handling
  • Add observability with OpenTelemetry
  • Use async where possible
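
The error-handling bullet can be sketched framework-agnostically: convert a module failure into a structured error payload instead of letting it surface as an unhandled 500. Here `classify_fn` stands in for a DSPy module call; all names are hypothetical:

```python
# Wrap a classifier call so provider/validation failures become structured errors.
def safe_classify(classify_fn, subject: str, body: str) -> dict:
    try:
        result = classify_fn(email_subject=subject, email_body=body)
        return {"ok": True, "category": result["category"], "priority": result["priority"]}
    except Exception as exc:
        return {"ok": False, "error": type(exc).__name__}

# Stub standing in for a working classifier:
ok = safe_classify(lambda **kw: {"category": "Technical", "priority": "High"}, "s", "b")
```

In the FastAPI endpoint above, the `ok=False` branch would map to an `HTTPException` with an appropriate status code.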

Full documentation: See fastapi-integration.md for complete patterns.

6. Testing DSPy Modules

Write standard pytest tests for LLM logic.

Quick reference:

import pytest
import dspy
import os

@pytest.fixture(scope="module")
def configure_dspy():
    lm = dspy.LM('openai/gpt-4o-mini', api_key=os.environ['OPENAI_API_KEY'])
    dspy.configure(lm=lm)

def test_email_classifier(configure_dspy):
    classifier = EmailProcessor()
    result = classifier(
        email_subject="Can't log in",
        email_body="Unable to access account"
    )

    assert result.category in ['Technical', 'Billing', 'General']
    assert result.priority in ['High', 'Medium', 'Low']

def test_technical_email_classification(configure_dspy):
    classifier = EmailProcessor()
    result = classifier(
        email_subject="Error 500 on checkout",
        email_body="Getting server error when trying to complete purchase"
    )

    assert result.category == 'Technical'

Testing patterns:

  • Use pytest fixtures for DSPy configuration
  • Test type correctness of outputs
  • Test edge cases (empty inputs, special characters, long texts)
  • Use VCR/responses for deterministic API testing
  • Integration test complete workflows
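
The VCR/stub bullet can be sketched without any recording library: replace the predictor with a deterministic stub so assertions run offline. `StubPrediction` mimics `dspy.Prediction`'s attribute access; all names here are hypothetical:

```python
# Minimal stand-in for dspy.Prediction: fields become attributes.
class StubPrediction:
    def __init__(self, **fields):
        self.__dict__.update(fields)

def make_stub_classifier(category, priority):
    """Return a callable that always yields the given classification."""
    def classify(**kwargs):
        return StubPrediction(category=category, priority=priority)
    return classify

# In a pytest test, inject the stub where the real EmailProcessor would go:
classifier = make_stub_classifier("Technical", "High")
result = classifier(email_subject="Error 500", email_body="server error")
```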

Full documentation: See optimization.md section on Testing.

7. Optimization with Teleprompters

Automatically improve prompts and modules using optimization techniques.

MIPROv2 optimization:

import dspy
from dspy.teleprompt import MIPROv2

# Define evaluation metric
def accuracy_metric(example, pred, trace=None):
    return example.category == pred.category

# Prepare training data
trainset = [
    dspy.Example(
        email_subject="Can't log in",
        email_body="Password reset not working",
        category="Technical"
    ).with_inputs("email_subject", "email_body"),
    # More examples...
]

# Run optimization
optimizer = MIPROv2(
    metric=accuracy_metric,
    num_candidates=10,
    init_temperature=0.7
)

optimized_module = optimizer.compile(
    EmailProcessor(),
    trainset=trainset,
    max_bootstrapped_demos=3,
    max_labeled_demos=5
)

# Save optimized module
optimized_module.save("optimized_classifier.json")

# Reload later into a fresh instance
restored = EmailProcessor()
restored.load("optimized_classifier.json")

BootstrapFewShot (simpler, faster):

from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot(
    metric=accuracy_metric,
    max_bootstrapped_demos=4
)

optimized = optimizer.compile(
    EmailProcessor(),
    trainset=trainset
)

Full documentation: See optimization.md section on Teleprompters.

8. Caching and Performance

Optimize performance with built-in caching.

Enable caching:

import dspy

# Caching is enabled by default on dspy.LM; it can be set explicitly per model
lm = dspy.LM('openai/gpt-4o-mini', cache=True)
dspy.configure(lm=lm)

# Opt a model out of caching when its calls must always reach the provider
lm_uncached = dspy.LM('openai/gpt-4o-mini', cache=False)

Cache control:

# Bypass the cache for a specific scope by switching to an uncached LM
with dspy.context(lm=dspy.LM('openai/gpt-4o-mini', cache=False)):
    result = module(input="data")
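
Conceptually, an LM cache keys each request on the model id, the messages, and the sampling parameters, so identical calls return the stored completion without hitting the provider. A self-contained sketch of that keying scheme (not DSPy's actual implementation):

```python
import hashlib
import json

# Derive a stable cache key from everything that affects the completion.
def cache_key(model: str, messages: list, **params) -> str:
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("openai/gpt-4o-mini", [{"role": "user", "content": "hi"}], temperature=0.0)
k2 = cache_key("openai/gpt-4o-mini", [{"role": "user", "content": "hi"}], temperature=0.0)
```

This is why setting `temperature=0.0` in tests pairs well with caching: repeated runs produce identical requests and therefore cache hits.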

Full documentation: See optimization.md section on Caching.

Quick Start Workflow

For New Projects

  1. Install DSPy:
pip install dspy-ai
  2. Configure LLM provider (see config-template.py):
import dspy
import os

lm = dspy.LM('openai/gpt-4o-mini', api_key=os.environ['OPENAI_API_KEY'])
dspy.configure(lm=lm)
  3. Create a signature (see signature-template.py):
class MySignature(dspy.Signature):
    """Clear description of task."""

    input_field: str = dspy.InputField(desc="Description")
    output_field: str = dspy.OutputField(desc="Description")
  4. Create a module (see module-template.py):
class MyModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predictor = dspy.Predict(MySignature)

    def forward(self, input_field: str):
        return self.predictor(input_field=input_field)
  5. Use the module:
module = MyModule()
result = module(input_field="test")
print(result.output_field)
  6. Add tests (see optimization.md):
def test_my_module():
    result = MyModule()(input_field="test")
    assert isinstance(result.output_field, str)

For FastAPI Applications

  1. Install dependencies:
pip install dspy-ai fastapi uvicorn pydantic
  2. Create app structure:
my_app/
├── app/
│   ├── __init__.py
│   ├── main.py           # FastAPI app
│   ├── dspy_modules/     # DSPy modules
│   │   ├── __init__.py
│   │   └── classifier.py
│   ├── models/           # Pydantic models
│   │   └── __init__.py
│   └── config.py         # DSPy configuration
├── tests/
│   └── test_classifier.py
└── requirements.txt
  3. Configure DSPy in config.py:
import dspy
import os

def configure_dspy():
    lm = dspy.LM(
        'openai/gpt-4o-mini',
        api_key=os.environ['OPENAI_API_KEY']
    )
    dspy.configure(lm=lm, cache=True)
  4. Create FastAPI app in main.py:
from fastapi import FastAPI
from contextlib import asynccontextmanager
from app.config import configure_dspy
from app.dspy_modules.classifier import EmailProcessor
from app.models import EmailRequest  # request model defined under app/models/

@asynccontextmanager
async def lifespan(app: FastAPI):
    configure_dspy()
    yield

app = FastAPI(lifespan=lifespan)
classifier = EmailProcessor()

@app.post("/classify")
async def classify(request: EmailRequest):
    result = classifier(
        email_subject=request.subject,
        email_body=request.body
    )
    return {"category": result.category, "priority": result.priority}

Common Patterns

Pattern: Multi-Step Analysis Pipeline

class AnalysisPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.extract = dspy.Predict(ExtractSignature)
        self.analyze = dspy.ChainOfThought(AnalyzeSignature)
        self.summarize = dspy.Predict(SummarizeSignature)

    def forward(self, text: str):
        extracted = self.extract(text=text)
        analyzed = self.analyze(data=extracted.data)
        return self.summarize(analysis=analyzed.result)

Pattern: Agent with Tools

import dspy

def search_web(query: str) -> str:
    """Search the web for information."""
    # Implementation here
    return f"Results for: {query}"

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # WARNING: eval() on untrusted input is unsafe; use ast.literal_eval or a
    # dedicated expression parser in production
    return str(eval(expression))

class ResearchAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct(
            ResearchSignature,
            tools=[search_web, calculate],
            max_iters=10
        )

    def forward(self, question: str):
        return self.agent(question=question)

Pattern: Conditional Routing

class SmartRouter(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier = dspy.Predict(ClassifyComplexity)
        self.simple_handler = SimpleModule()
        self.complex_handler = ComplexModule()

    def forward(self, input_text: str):
        classification = self.classifier(text=input_text)

        if classification.complexity == "Simple":
            return self.simple_handler(input=input_text)
        else:
            return self.complex_handler(input=input_text)

Pattern: Retry with Fallback

import dspy
from tenacity import retry, stop_after_attempt, wait_exponential

class RobustModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predictor = dspy.Predict(TaskSignature)

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def forward(self, input_text: str):
        result = self.predictor(input=input_text)
        self._validate(result)
        return result

    def _validate(self, result):
        if not result.output:
            raise ValueError("Empty output from LLM")

Pattern: Pydantic Output Models

from pydantic import BaseModel, Field
import dspy

class ClassificationResult(BaseModel):
    category: str = Field(description="Category: Technical, Billing, or General")
    priority: str = Field(description="Priority: Low, Medium, or High")
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence score")

class TypedClassifier(dspy.Signature):
    """Classify with structured output."""

    text: str = dspy.InputField()
    result: ClassificationResult = dspy.OutputField()

Resources

This skill includes comprehensive reference materials and templates:

References (load as needed for detailed information)

  • core-concepts.md: Complete guide to signatures, modules, predictors, and best practices
  • providers.md: All LLM provider configurations, compatibility matrix, and troubleshooting
  • optimization.md: Testing patterns, teleprompters, caching, and monitoring
  • fastapi-integration.md: Production patterns for serving DSPy with FastAPI

Assets (templates for quick starts)

  • signature-template.py: Inline, class-based, and Pydantic-typed signature examples
  • module-template.py: Module patterns including pipelines, error handling, async, and caching
  • config-template.py: Provider configuration, multi-model setups, and fallback strategies

When to Use This Skill

Trigger this skill when:

  • Implementing LLM-powered features in Python applications
  • Creating programmatic interfaces for AI operations
  • Building agent systems with tool usage
  • Setting up or troubleshooting LLM providers with DSPy
  • Optimizing prompts using teleprompters
  • Testing LLM functionality with pytest
  • Integrating DSPy with FastAPI
  • Converting from manual prompt engineering to programmatic approach
  • Debugging DSPy code or configuration issues