[2.9.0] Rename plugin to compound-engineering

BREAKING: Plugin renamed from compounding-engineering to compound-engineering.
Users will need to reinstall with the new name:

  claude /plugin install compound-engineering

Changes:
- Renamed plugin directory and all references
- Updated documentation counts (24 agents, 19 commands)
- Added julik-frontend-races-reviewer to docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Kieran Klaassen
2025-12-02 17:32:04 -08:00
parent 4b49e5344d
commit 6c5b3e40db
121 changed files with 136 additions and 117 deletions


@@ -0,0 +1,594 @@
---
name: dspy-ruby
description: This skill should be used when working with DSPy.rb, a Ruby framework for building type-safe, composable LLM applications. Use this when implementing predictable AI features, creating LLM signatures and modules, configuring language model providers (OpenAI, Anthropic, Gemini, Ollama), building agent systems with tools, optimizing prompts, or testing LLM-powered functionality in Ruby applications.
---
# DSPy.rb Expert
## Overview
DSPy.rb is a Ruby framework that enables developers to **program LLMs, not prompt them**. Instead of manually crafting prompts, define application requirements through type-safe, composable modules that can be tested, optimized, and version-controlled like regular code.
This skill provides comprehensive guidance on:
- Creating type-safe signatures for LLM operations
- Building composable modules and workflows
- Configuring multiple LLM providers
- Implementing agents with tools
- Testing and optimizing LLM applications
- Production deployment patterns
## Core Capabilities
### 1. Type-Safe Signatures
Create input/output contracts for LLM operations with runtime type checking.
**When to use**: Defining any LLM task, from simple classification to complex analysis.
**Quick reference**:
```ruby
class EmailClassificationSignature < DSPy::Signature
description "Classify customer support emails"
input do
const :email_subject, String
const :email_body, String
end
output do
const :category, T.enum(["Technical", "Billing", "General"])
const :priority, T.enum(["Low", "Medium", "High"])
end
end
```
**Templates**: See `assets/signature-template.rb` for comprehensive examples including:
- Basic signatures with multiple field types
- Vision signatures for multimodal tasks
- Sentiment analysis signatures
- Code generation signatures
**Best practices**:
- Always provide clear, specific descriptions
- Use enums for constrained outputs
- Include field descriptions with `desc:` parameter
- Prefer specific types over generic String when possible
**Full documentation**: See `references/core-concepts.md` sections on Signatures and Type Safety.
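To make "runtime type checking" concrete, the sketch below validates a raw response against an output contract in plain Ruby. It does not use DSPy.rb: `OUTPUT_CONTRACT` and `validate_output` are hypothetical names chosen for illustration, and the library's own validation may work differently.

```ruby
# Illustrative only: a hand-rolled output contract, not DSPy.rb's implementation.
# OUTPUT_CONTRACT and validate_output are hypothetical names.
OUTPUT_CONTRACT = {
  category: %w[Technical Billing General],
  priority: %w[Low Medium High]
}.freeze

# Returns the validated hash, or raises ArgumentError when a field is
# missing or holds a value outside its allowed set.
def validate_output(raw)
  OUTPUT_CONTRACT.each_with_object({}) do |(field, allowed), validated|
    value = raw[field]
    unless allowed.include?(value)
      raise ArgumentError, "#{field} must be one of #{allowed.join(', ')}, got #{value.inspect}"
    end
    validated[field] = value
  end
end

valid = validate_output(category: "Billing", priority: "High")
puts valid.inspect

begin
  validate_output(category: "Sales", priority: "High")
rescue ArgumentError => e
  puts "Rejected: #{e.message}"
end
```

Constraining outputs to a fixed set is what lets downstream code branch on `category` without defensive string matching.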
### 2. Composable Modules
Build reusable, chainable modules that encapsulate LLM operations.
**When to use**: Implementing any LLM-powered feature, especially complex multi-step workflows.
**Quick reference**:
```ruby
class EmailProcessor < DSPy::Module
def initialize
super
@classifier = DSPy::Predict.new(EmailClassificationSignature)
end
def forward(email_subject:, email_body:)
@classifier.forward(
email_subject: email_subject,
email_body: email_body
)
end
end
```
**Templates**: See `assets/module-template.rb` for comprehensive examples including:
- Basic modules with single predictors
- Multi-step pipelines that chain modules
- Modules with conditional logic
- Error handling and retry patterns
- Stateful modules with history
- Caching implementations
**Module composition**: Chain modules together to create complex workflows:
```ruby
class Pipeline < DSPy::Module
def initialize
super
@step1 = Classifier.new
@step2 = Analyzer.new
@step3 = Responder.new
end
def forward(input)
result1 = @step1.forward(input)
result2 = @step2.forward(result1)
@step3.forward(result2)
end
end
```
**Full documentation**: See `references/core-concepts.md` sections on Modules and Module Composition.
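Chaining only works when each step returns something the next step accepts. A minimal plain-Ruby sketch of that contract follows, with stub steps standing in for LLM-backed modules. `UpcaseStep`, `CountStep`, and `SketchPipeline` are hypothetical and do not use DSPy.rb.

```ruby
# Stub steps standing in for LLM-backed modules. Each exposes #forward and
# returns a hash whose keys match the next step's keyword arguments.
class UpcaseStep
  def forward(text:)
    { text: text.upcase }
  end
end

class CountStep
  def forward(text:)
    { text: text, word_count: text.split.size }
  end
end

class SketchPipeline
  def initialize(steps)
    @steps = steps
  end

  # Feed each step's output hash into the next step as keyword arguments.
  def forward(**input)
    @steps.reduce(input) { |data, step| step.forward(**data) }
  end
end

pipeline = SketchPipeline.new([UpcaseStep.new, CountStep.new])
result = pipeline.forward(text: "cannot log in")
puts result[:word_count]
```

Because the steps share only a hash-in, hash-out convention, any step can be replaced with a stub in tests.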
### 3. Multiple Predictor Types
Choose the right predictor for your task:
**Predict**: Basic LLM inference with type-safe inputs/outputs
```ruby
predictor = DSPy::Predict.new(TaskSignature)
result = predictor.forward(input: "data")
```
**ChainOfThought**: Adds automatic reasoning for improved accuracy
```ruby
predictor = DSPy::ChainOfThought.new(TaskSignature)
result = predictor.forward(input: "data")
# Returns: { reasoning: "...", output: "..." }
```
**ReAct**: Tool-using agents with iterative reasoning
```ruby
predictor = DSPy::ReAct.new(
TaskSignature,
tools: [SearchTool.new, CalculatorTool.new],
max_iterations: 5
)
```
**CodeAct**: Dynamic code generation (requires `dspy-code_act` gem)
```ruby
predictor = DSPy::CodeAct.new(TaskSignature)
result = predictor.forward(task: "Calculate factorial of 5")
```
**When to use each**:
- **Predict**: Simple tasks, classification, extraction
- **ChainOfThought**: Complex reasoning, analysis, multi-step thinking
- **ReAct**: Tasks requiring external tools (search, calculation, API calls)
- **CodeAct**: Tasks best solved with generated code
**Full documentation**: See `references/core-concepts.md` section on Predictors.
### 4. LLM Provider Configuration
DSPy.rb supports OpenAI, Anthropic Claude, Google Gemini, Ollama, and OpenRouter.
**Quick configuration examples**:
```ruby
# OpenAI
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
# Anthropic Claude
DSPy.configure do |c|
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
end
# Google Gemini
DSPy.configure do |c|
c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
api_key: ENV['GOOGLE_API_KEY'])
end
# Local Ollama (free, private)
DSPy.configure do |c|
c.lm = DSPy::LM.new('ollama/llama3.1')
end
```
**Templates**: See `assets/config-template.rb` for comprehensive examples including:
- Environment-based configuration
- Multi-model setups for different tasks
- Configuration with observability (OpenTelemetry, Langfuse)
- Retry logic and fallback strategies
- Budget tracking
- Rails initializer patterns
**Provider compatibility matrix**:
| Feature | OpenAI | Anthropic | Gemini | Ollama |
|---------|--------|-----------|--------|--------|
| Structured Output | ✅ | ✅ | ✅ | ✅ |
| Vision (Images) | ✅ | ✅ | ✅ | ⚠️ Limited |
| Image URLs | ✅ | ❌ | ❌ | ❌ |
| Tool Calling | ✅ | ✅ | ✅ | Varies |
**Cost optimization strategy**:
- Development: Ollama (free) or gpt-4o-mini (cheap)
- Testing: gpt-4o-mini with temperature=0.0
- Production simple tasks: gpt-4o-mini, claude-3-haiku, gemini-1.5-flash
- Production complex tasks: gpt-4o, claude-3-5-sonnet, gemini-1.5-pro
**Full documentation**: See `references/providers.md` for all configuration options, provider-specific features, and troubleshooting.
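The cost strategy above can be captured in a small lookup so the choice lives in one place. This is a sketch under assumptions: `MODEL_BY_TIER` and `model_for` are hypothetical names, and the model IDs are simply the ones used elsewhere in this document.

```ruby
# Hypothetical helper: maps environment and task complexity to a model ID.
MODEL_BY_TIER = {
  development: "ollama/llama3.1",
  test: "openai/gpt-4o-mini",
  production_simple: "openai/gpt-4o-mini",
  production_complex: "anthropic/claude-3-5-sonnet-20241022"
}.freeze

def model_for(env:, complex: false)
  tier =
    case env.to_sym
    when :development then :development
    when :test then :test
    else complex ? :production_complex : :production_simple
    end
  MODEL_BY_TIER.fetch(tier)
end

puts model_for(env: "development")
puts model_for(env: "production", complex: true)
```

The returned string could then be passed to `DSPy::LM.new` as in the configuration examples above.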
### 5. Multimodal & Vision Support
Process images alongside text using the unified `DSPy::Image` interface.
**Quick reference**:
```ruby
class VisionSignature < DSPy::Signature
description "Analyze image and answer questions"
input do
const :image, DSPy::Image
const :question, String
end
output do
const :answer, String
end
end
predictor = DSPy::Predict.new(VisionSignature)
result = predictor.forward(
image: DSPy::Image.from_file("path/to/image.jpg"),
question: "What objects are visible?"
)
```
**Image loading methods**:
```ruby
# From file
DSPy::Image.from_file("path/to/image.jpg")
# From URL (OpenAI only)
DSPy::Image.from_url("https://example.com/image.jpg")
# From base64
DSPy::Image.from_base64(base64_data, mime_type: "image/jpeg")
```
**Provider support**:
- OpenAI: Full support including URLs
- Anthropic, Gemini: Base64 or file loading only
- Ollama: Limited multimodal depending on model
**Full documentation**: See `references/core-concepts.md` section on Multimodal Support.
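For providers that accept only Base64 or file input, the payload and MIME type have to be prepared before calling `DSPy::Image.from_base64`. The helper below is a plain-Ruby sketch of that preparation step: `encode_image` and `MIME_TYPES` are hypothetical names, and the MIME type is guessed from the file extension only.

```ruby
require "tempfile"

# Hypothetical lookup from file extension to MIME type.
MIME_TYPES = {
  ".jpg" => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png" => "image/png",
  ".gif" => "image/gif",
  ".webp" => "image/webp"
}.freeze

# Reads a file and returns its Base64 payload plus the guessed MIME type.
def encode_image(path)
  mime_type = MIME_TYPES.fetch(File.extname(path).downcase) do
    raise ArgumentError, "unsupported image type: #{path}"
  end
  # pack("m0") produces strict Base64 with no line breaks.
  data = [File.binread(path)].pack("m0")
  { data: data, mime_type: mime_type }
end

# Demo with a throwaway file; the bytes are not a real image.
Tempfile.create(["sample", ".png"]) do |file|
  file.write("placeholder bytes")
  file.flush
  encoded = encode_image(file.path)
  puts encoded[:mime_type]
end
```

The resulting `data` and `mime_type` values line up with the `from_base64(base64_data, mime_type: ...)` call shown above.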
### 6. Testing LLM Applications
Write standard RSpec tests for LLM logic.
**Quick reference**:
```ruby
RSpec.describe EmailClassifier do
before do
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
end
it 'classifies technical emails correctly' do
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: "Can't log in",
email_body: "Unable to access account"
)
expect(result[:category]).to eq('Technical')
expect(['High', 'Medium', 'Low']).to include(result[:priority])
end
end
```
**Testing patterns**:
- Mock LLM responses for unit tests
- Use VCR for deterministic API testing
- Test type safety and validation
- Test edge cases (empty inputs, special characters, long texts)
- Integration test complete workflows
**Full documentation**: See `references/optimization.md` section on Testing.
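The first testing pattern, mocking LLM responses, can be sketched without any test framework or network access. `FakePredictor` and `TicketTriage` below are hypothetical plain-Ruby classes; the sketch assumes your module accepts its predictor as a constructor argument, which the examples in this document do not do by default.

```ruby
# Hypothetical stand-in for a predictor: returns a canned response and records
# the inputs it received, so no network call is made.
class FakePredictor
  attr_reader :calls

  def initialize(response)
    @response = response
    @calls = []
  end

  def forward(**inputs)
    @calls << inputs
    @response
  end
end

# A module written to receive its predictor as a dependency. Plain Ruby;
# it does not inherit from DSPy::Module.
class TicketTriage
  def initialize(predictor:)
    @predictor = predictor
  end

  def forward(email_subject:, email_body:)
    result = @predictor.forward(email_subject: email_subject, email_body: email_body)
    result.merge(escalate: result[:priority] == "High")
  end
end

fake = FakePredictor.new({ category: "Technical", priority: "High" })
triage = TicketTriage.new(predictor: fake)
result = triage.forward(email_subject: "Can't log in", email_body: "Unable to access account")
puts result[:escalate]
```

With the LLM call replaced by a canned response, the test exercises only the module's own logic and stays deterministic.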
### 7. Optimization & Improvement
Automatically improve prompts and modules using optimization techniques.
**MIPROv2 optimization**:
```ruby
require 'dspy/mipro'
# Define evaluation metric
def accuracy_metric(example, prediction)
example[:expected_output][:category] == prediction[:category] ? 1.0 : 0.0
end
# Prepare training data
training_examples = [
{
input: { email_subject: "...", email_body: "..." },
expected_output: { category: 'Technical' }
},
# More examples...
]
# Run optimization
optimizer = DSPy::MIPROv2.new(
metric: method(:accuracy_metric),
num_candidates: 10
)
optimized_module = optimizer.compile(
EmailClassifier.new,
trainset: training_examples
)
```
**A/B testing different approaches**:
```ruby
# Test ChainOfThought vs ReAct
approach_a_score = evaluate_approach(ChainOfThoughtModule, test_set)
approach_b_score = evaluate_approach(ReActModule, test_set)
```
**Full documentation**: See `references/optimization.md` section on Optimization.
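The A/B snippet above calls `evaluate_approach` without defining it. One plausible shape, assuming it returns the mean metric score over a test set, is sketched below. The helper is not part of DSPy.rb, and `KeywordClassifier` is a hypothetical stub so the example runs without an LLM.

```ruby
# Same metric shape as in the MIPROv2 example: 1.0 on a match, else 0.0.
def accuracy_metric(example, prediction)
  example[:expected_output][:category] == prediction[:category] ? 1.0 : 0.0
end

# Assumed helper: instantiates the module and averages the metric.
def evaluate_approach(module_class, test_set, metric: method(:accuracy_metric))
  return 0.0 if test_set.empty?

  predictor = module_class.new
  total = test_set.sum do |example|
    prediction = predictor.forward(**example[:input])
    metric.call(example, prediction)
  end
  total / test_set.size
end

# Stub module: anything mentioning "invoice" is Billing, the rest Technical.
class KeywordClassifier
  def forward(email_subject:, email_body:)
    text = "#{email_subject} #{email_body}".downcase
    { category: text.include?("invoice") ? "Billing" : "Technical" }
  end
end

test_set = [
  { input: { email_subject: "Invoice missing", email_body: "No invoice received" },
    expected_output: { category: "Billing" } },
  { input: { email_subject: "Can't log in", email_body: "Password rejected" },
    expected_output: { category: "Technical" } },
  { input: { email_subject: "Refund", email_body: "Charged twice" },
    expected_output: { category: "Billing" } }
]

puts evaluate_approach(KeywordClassifier, test_set).round(2)
```

Scoring both approaches against the same held-out set with the same metric is what makes the comparison meaningful.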
### 8. Observability & Monitoring
Track performance, token usage, and behavior in production.
**OpenTelemetry integration**:
```ruby
require 'opentelemetry/sdk'
OpenTelemetry::SDK.configure do |c|
c.service_name = 'my-dspy-app'
c.use_all
end
# DSPy automatically creates traces
```
**Langfuse tracing**:
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
c.langfuse = {
public_key: ENV['LANGFUSE_PUBLIC_KEY'],
secret_key: ENV['LANGFUSE_SECRET_KEY']
}
end
```
**Custom monitoring**:
- Token tracking
- Performance monitoring
- Error rate tracking
- Custom logging
**Full documentation**: See `references/optimization.md` section on Observability.
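The custom monitoring items above can be gathered by wrapping any object that responds to `forward`. The sketch below is plain Ruby and makes one loud assumption: token counts are read from a `:tokens` key on the result hash, which is not how DSPy.rb necessarily reports usage. `MonitoredPredictor` and `FlakyStub` are hypothetical names.

```ruby
# Hypothetical wrapper: records call count, error count, tokens, and latency.
class MonitoredPredictor
  attr_reader :stats

  def initialize(predictor)
    @predictor = predictor
    @stats = { calls: 0, errors: 0, tokens: 0, total_seconds: 0.0 }
  end

  def forward(**inputs)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @stats[:calls] += 1
    result = @predictor.forward(**inputs)
    # Assumption: the result is a hash that may carry a :tokens count.
    @stats[:tokens] += result.fetch(:tokens, 0)
    result
  rescue StandardError
    @stats[:errors] += 1
    raise
  ensure
    @stats[:total_seconds] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end

  def error_rate
    @stats[:calls].zero? ? 0.0 : @stats[:errors].to_f / @stats[:calls]
  end
end

# Stub predictor that fails on its second call.
class FlakyStub
  def initialize
    @count = 0
  end

  def forward(text:)
    @count += 1
    raise "simulated provider error" if @count == 2

    { answer: text.reverse, tokens: 12 }
  end
end

monitored = MonitoredPredictor.new(FlakyStub.new)
3.times do
  begin
    monitored.forward(text: "hello")
  rescue RuntimeError
    # Swallowed so the demo can continue; real code would handle or log it.
  end
end
puts monitored.stats[:calls]
puts monitored.error_rate
```

The wrapper re-raises errors after counting them, so existing retry or fallback logic keeps working unchanged.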
## Quick Start Workflow
### For New Projects
1. **Install DSPy.rb and provider gems**:
```bash
gem install dspy dspy-openai # or dspy-anthropic, dspy-gemini
```
2. **Configure LLM provider** (see `assets/config-template.rb`):
```ruby
require 'dspy'
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
```
3. **Create a signature** (see `assets/signature-template.rb`):
```ruby
class MySignature < DSPy::Signature
description "Clear description of task"
input do
const :input_field, String, desc: "Description"
end
output do
const :output_field, String, desc: "Description"
end
end
```
4. **Create a module** (see `assets/module-template.rb`):
```ruby
class MyModule < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(MySignature)
end
def forward(input_field:)
@predictor.forward(input_field: input_field)
end
end
```
5. **Use the module**:
```ruby
module_instance = MyModule.new
result = module_instance.forward(input_field: "test")
puts result[:output_field]
```
6. **Add tests** (see `references/optimization.md`):
```ruby
RSpec.describe MyModule do
it 'produces expected output' do
result = MyModule.new.forward(input_field: "test")
expect(result[:output_field]).to be_a(String)
end
end
```
### For Rails Applications
1. **Add to Gemfile**:
```ruby
gem 'dspy'
gem 'dspy-openai' # or other provider
```
2. **Create initializer** at `config/initializers/dspy.rb` (see `assets/config-template.rb` for full example):
```ruby
require 'dspy'
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
```
3. **Create modules in** `app/llm/` directory:
```ruby
# app/llm/email_classifier.rb
class EmailClassifier < DSPy::Module
# Implementation here
end
```
4. **Use in controllers/services**:
```ruby
class EmailsController < ApplicationController
def classify
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: params[:subject],
email_body: params[:body]
)
render json: result
end
end
```
## Common Patterns
### Pattern: Multi-Step Analysis Pipeline
```ruby
class AnalysisPipeline < DSPy::Module
def initialize
super
@extract = DSPy::Predict.new(ExtractSignature)
@analyze = DSPy::ChainOfThought.new(AnalyzeSignature)
@summarize = DSPy::Predict.new(SummarizeSignature)
end
def forward(text:)
extracted = @extract.forward(text: text)
analyzed = @analyze.forward(data: extracted[:data])
@summarize.forward(analysis: analyzed[:result])
end
end
```
### Pattern: Agent with Tools
```ruby
class ResearchAgent < DSPy::Module
def initialize
super
@agent = DSPy::ReAct.new(
ResearchSignature,
tools: [
WebSearchTool.new,
DatabaseQueryTool.new,
SummarizerTool.new
],
max_iterations: 10
)
end
def forward(question:)
@agent.forward(question: question)
end
end
class WebSearchTool < DSPy::Tool
def call(query:)
results = perform_search(query)
{ results: results }
end
end
```
### Pattern: Conditional Routing
```ruby
class SmartRouter < DSPy::Module
def initialize
super
@classifier = DSPy::Predict.new(ClassifySignature)
@simple_handler = SimpleModule.new
@complex_handler = ComplexModule.new
end
def forward(input:)
classification = @classifier.forward(text: input)
if classification[:complexity] == 'Simple'
@simple_handler.forward(input: input)
else
@complex_handler.forward(input: input)
end
end
end
```
### Pattern: Retry with Fallback
```ruby
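class RobustModule < DSPy::Module
  MAX_RETRIES = 3

  def initialize
    super
    # RobustSignature is a placeholder for the signature being wrapped
    @predictor = DSPy::Predict.new(RobustSignature)
  end

  def forward(input, retry_count: 0)
    @predictor.forward(input)
  rescue DSPy::ValidationError
    # Return a default here instead of re-raising if a fallback is acceptable
    raise if retry_count >= MAX_RETRIES

    sleep(2**retry_count) # Exponential backoff: 1s, 2s, 4s
    forward(input, retry_count: retry_count + 1)
  end
end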
```
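The backoff logic in this pattern can also be pulled out into a reusable helper so it is testable without waiting. The following is a plain-Ruby sketch: `with_retries` and its parameters are hypothetical names, and the injectable `sleeper` exists only so tests can record delays instead of sleeping.

```ruby
# Hypothetical helper: retries a block with exponential backoff.
# Re-raises the last error once max_retries is exhausted.
def with_retries(max_retries: 3, base_delay: 1, on: [StandardError], sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    yield attempt
  rescue *on
    raise if attempt >= max_retries

    sleeper.call(base_delay * (2**attempt))
    attempt += 1
    retry
  end
end

delays = []
attempts = 0
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do
  attempts += 1
  raise "transient failure" if attempts < 3

  "ok after #{attempts} attempts"
end

puts result
puts delays.inspect
```

Passing a narrower list to `on:` keeps the helper from retrying errors that will never succeed, such as invalid credentials.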
## Resources
This skill includes comprehensive reference materials and templates:
### References (load as needed for detailed information)
- **`references/core-concepts.md`**: Complete guide to signatures, modules, predictors, multimodal support, and best practices
- **`references/providers.md`**: All LLM provider configurations, compatibility matrix, cost optimization, and troubleshooting
- **`references/optimization.md`**: Testing patterns, optimization techniques, observability setup, and monitoring
### Assets (templates for quick starts)
- **`assets/signature-template.rb`**: Examples of signatures including basic, vision, sentiment analysis, and code generation
- **`assets/module-template.rb`**: Module patterns including pipelines, agents, error handling, caching, and state management
- **`assets/config-template.rb`**: Configuration examples for all providers, environments, observability, and production patterns
## When to Use This Skill
Trigger this skill when:
- Implementing LLM-powered features in Ruby applications
- Creating type-safe interfaces for AI operations
- Building agent systems with tool usage
- Setting up or troubleshooting LLM providers
- Optimizing prompts and improving accuracy
- Testing LLM functionality
- Adding observability to AI applications
- Converting from manual prompt engineering to programmatic approach
- Debugging DSPy.rb code or configuration issues


@@ -0,0 +1,359 @@
# frozen_string_literal: true
# DSPy.rb Configuration Examples
# This file demonstrates various configuration patterns for different use cases
require 'dspy'
# ============================================================================
# Basic Configuration
# ============================================================================
# Simple OpenAI configuration
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
# ============================================================================
# Multi-Provider Configuration
# ============================================================================
# Anthropic Claude
DSPy.configure do |c|
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
end
# Google Gemini
DSPy.configure do |c|
c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
api_key: ENV['GOOGLE_API_KEY'])
end
# Local Ollama
DSPy.configure do |c|
c.lm = DSPy::LM.new('ollama/llama3.1',
base_url: 'http://localhost:11434')
end
# OpenRouter (access to 200+ models)
DSPy.configure do |c|
c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
api_key: ENV['OPENROUTER_API_KEY'],
base_url: 'https://openrouter.ai/api/v1')
end
# ============================================================================
# Environment-Based Configuration
# ============================================================================
# Different models for different environments
# (assumes a Rails app; Rails.env is undefined outside Rails)
if Rails.env.development?
# Use local Ollama for development (free, private)
DSPy.configure do |c|
c.lm = DSPy::LM.new('ollama/llama3.1')
end
elsif Rails.env.test?
# Use cheap model for testing
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
else
# Use powerful model for production
DSPy.configure do |c|
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
end
end
# ============================================================================
# Configuration with Custom Parameters
# ============================================================================
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'],
temperature: 0.7, # Creativity (0.0-2.0, default: 1.0)
max_tokens: 2000, # Maximum response length
top_p: 0.9, # Nucleus sampling
frequency_penalty: 0.0, # Reduce repetition (-2.0 to 2.0)
presence_penalty: 0.0 # Encourage new topics (-2.0 to 2.0)
)
end
# ============================================================================
# Multiple Model Configuration (Task-Specific)
# ============================================================================
# Create different language models for different tasks
module MyApp
# Fast model for simple tasks
FAST_LM = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
temperature: 0.3 # More deterministic
)
# Powerful model for complex tasks
POWERFUL_LM = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'],
temperature: 0.7
)
# Creative model for content generation
CREATIVE_LM = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'],
temperature: 1.2, # More creative
top_p: 0.95
)
# Vision-capable model
VISION_LM = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'])
end
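# Use in modules
# NOTE: DSPy.configure sets the global LM, so the module constructed last
# decides the model for every predictor that relies on the global setting.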
class SimpleClassifier < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = MyApp::FAST_LM }
@predictor = DSPy::Predict.new(SimpleSignature)
end
end
class ComplexAnalyzer < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = MyApp::POWERFUL_LM }
@predictor = DSPy::ChainOfThought.new(ComplexSignature)
end
end
# ============================================================================
# Configuration with Observability (OpenTelemetry)
# ============================================================================
require 'opentelemetry/sdk'
# Configure OpenTelemetry
OpenTelemetry::SDK.configure do |c|
c.service_name = 'my-dspy-app'
c.use_all
end
# Configure DSPy (automatically integrates with OpenTelemetry)
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
end
# ============================================================================
# Configuration with Langfuse Tracing
# ============================================================================
require 'dspy/langfuse'
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
# Enable Langfuse tracing
c.langfuse = {
public_key: ENV['LANGFUSE_PUBLIC_KEY'],
secret_key: ENV['LANGFUSE_SECRET_KEY'],
host: ENV['LANGFUSE_HOST'] || 'https://cloud.langfuse.com'
}
end
# ============================================================================
# Configuration with Retry Logic
# ============================================================================
class RetryableConfig
MAX_RETRIES = 3
def self.configure
DSPy.configure do |c|
c.lm = create_lm_with_retry
end
end
def self.create_lm_with_retry
lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'])
# Wrap with retry logic
lm.extend(RetryBehavior)
lm
end
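  module RetryBehavior
    # RateLimitError is a placeholder: substitute the rate-limit error class
    # raised by your provider client. Timeout::Error is from Ruby's stdlib
    # (require 'timeout'); the bare TimeoutError constant is not defined in
    # current Ruby versions.
    def forward(input, retry_count: 0)
      super(input)
    rescue RateLimitError, Timeout::Error
      raise if retry_count >= MAX_RETRIES

      sleep(2**retry_count) # Exponential backoff
      forward(input, retry_count: retry_count + 1)
    end
  end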
end
RetryableConfig.configure
# ============================================================================
# Configuration with Fallback Models
# ============================================================================
class FallbackConfig
def self.configure
DSPy.configure do |c|
c.lm = create_lm_with_fallback
end
end
def self.create_lm_with_fallback
primary = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
fallback = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'])
FallbackLM.new(primary, fallback)
end
class FallbackLM
def initialize(primary, fallback)
@primary = primary
@fallback = fallback
end
def forward(input)
@primary.forward(input)
rescue => e
puts "Primary model failed: #{e.message}. Falling back..."
@fallback.forward(input)
end
end
end
FallbackConfig.configure
# ============================================================================
# Configuration with Budget Tracking
# ============================================================================
class BudgetTrackedConfig
def self.configure(monthly_budget_usd:)
DSPy.configure do |c|
c.lm = BudgetTracker.new(
DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY']),
monthly_budget_usd: monthly_budget_usd
)
end
end
class BudgetTracker
def initialize(lm, monthly_budget_usd:)
@lm = lm
@monthly_budget_usd = monthly_budget_usd
@monthly_cost = 0.0
end
def forward(input)
result = @lm.forward(input)
# Track cost (simplified - actual costs vary by model)
tokens = result.metadata[:usage][:total_tokens]
cost = estimate_cost(tokens)
@monthly_cost += cost
if @monthly_cost > @monthly_budget_usd
raise "Monthly budget of $#{@monthly_budget_usd} exceeded!"
end
result
end
private
def estimate_cost(tokens)
# Simplified cost estimation (check provider pricing)
(tokens / 1_000_000.0) * 5.0 # $5 per 1M tokens
end
end
end
BudgetTrackedConfig.configure(monthly_budget_usd: 100)
# ============================================================================
# Configuration Initializer for Rails
# ============================================================================
# Save this as config/initializers/dspy.rb
#
# require 'dspy'
#
# DSPy.configure do |c|
# # Environment-specific configuration
# model_config = case Rails.env.to_sym
# when :development
# { provider: 'ollama', model: 'llama3.1' }
# when :test
# { provider: 'openai', model: 'gpt-4o-mini', temperature: 0.0 }
# when :production
# { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022' }
# end
#
# # Configure language model
# c.lm = DSPy::LM.new(
# "#{model_config[:provider]}/#{model_config[:model]}",
# api_key: ENV["#{model_config[:provider].upcase}_API_KEY"],
# **model_config.except(:provider, :model)
# )
#
# # Optional: Add observability
# if Rails.env.production?
# c.langfuse = {
# public_key: ENV['LANGFUSE_PUBLIC_KEY'],
# secret_key: ENV['LANGFUSE_SECRET_KEY']
# }
# end
# end
# ============================================================================
# Testing Configuration
# ============================================================================
# In spec/spec_helper.rb or test/test_helper.rb
#
# RSpec.configure do |config|
# config.before(:suite) do
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
# api_key: ENV['OPENAI_API_KEY'],
# temperature: 0.0 # Deterministic for testing
# )
# end
# end
# end
# ============================================================================
# Configuration Best Practices
# ============================================================================
# 1. Use environment variables for API keys (never hardcode)
# 2. Use different models for different environments
# 3. Use cheaper/faster models for development and testing
# 4. Configure temperature based on use case:
# - 0.0-0.3: Deterministic, factual tasks
# - 0.7-1.0: Balanced creativity
# - 1.0-2.0: High creativity, content generation
# 5. Add observability in production (OpenTelemetry, Langfuse)
# 6. Implement retry logic and fallbacks for reliability
# 7. Track costs and set budgets for production
# 8. Use max_tokens to control response length and costs


@@ -0,0 +1,326 @@
# frozen_string_literal: true
# Example DSPy Module Template
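# This template demonstrates best practices for creating composable modules
require 'digest' # Digest::MD5, used by CachedModule
require 'logger' # Logger, used by RobustModule
require 'dspy'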
# Basic module with single predictor
class BasicModule < DSPy::Module
def initialize
super
# Initialize predictor with signature
@predictor = DSPy::Predict.new(ExampleSignature)
end
def forward(input_hash)
# Forward pass through the predictor
@predictor.forward(input_hash)
end
end
# Module with Chain of Thought reasoning
class ChainOfThoughtModule < DSPy::Module
def initialize
super
# ChainOfThought automatically adds reasoning to output
@predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
end
def forward(email_subject:, email_body:)
result = @predictor.forward(
email_subject: email_subject,
email_body: email_body
)
# Result includes :reasoning field automatically
{
category: result[:category],
priority: result[:priority],
reasoning: result[:reasoning],
confidence: calculate_confidence(result)
}
end
private
def calculate_confidence(result)
# Add custom logic to calculate confidence
# For example, based on reasoning length or specificity
result[:confidence] || 0.8
end
end
# Composable module that chains multiple steps
class MultiStepPipeline < DSPy::Module
def initialize
super
# Initialize multiple predictors for different steps
@step1 = DSPy::Predict.new(Step1Signature)
@step2 = DSPy::ChainOfThought.new(Step2Signature)
@step3 = DSPy::Predict.new(Step3Signature)
end
def forward(input)
# Chain predictors together
result1 = @step1.forward(input)
result2 = @step2.forward(result1)
result3 = @step3.forward(result2)
# Combine results as needed
{
step1_output: result1,
step2_output: result2,
final_result: result3
}
end
end
# Module with conditional logic
class ConditionalModule < DSPy::Module
def initialize
super
@simple_classifier = DSPy::Predict.new(SimpleClassificationSignature)
@complex_analyzer = DSPy::ChainOfThought.new(ComplexAnalysisSignature)
end
def forward(text:, complexity_threshold: 100)
# Use different predictors based on input characteristics
if text.length < complexity_threshold
@simple_classifier.forward(text: text)
else
@complex_analyzer.forward(text: text)
end
end
end
# Module with error handling and retry logic
class RobustModule < DSPy::Module
MAX_RETRIES = 3
def initialize
super
@predictor = DSPy::Predict.new(RobustSignature)
@logger = Logger.new(STDOUT)
end
def forward(input, retry_count: 0)
@logger.info "Processing input: #{input.inspect}"
begin
result = @predictor.forward(input)
validate_result!(result)
result
rescue DSPy::ValidationError => e
@logger.error "Validation error: #{e.message}"
if retry_count < MAX_RETRIES
@logger.info "Retrying (#{retry_count + 1}/#{MAX_RETRIES})..."
sleep(2 ** retry_count) # Exponential backoff
forward(input, retry_count: retry_count + 1)
else
@logger.error "Max retries exceeded"
raise
end
end
end
private
def validate_result!(result)
# Add custom validation logic
raise DSPy::ValidationError, "Invalid result" unless result[:category]
raise DSPy::ValidationError, "Low confidence" if result[:confidence] && result[:confidence] < 0.5
end
end
# Module with ReAct agent and tools
class AgentModule < DSPy::Module
def initialize
super
# Define tools for the agent
tools = [
SearchTool.new,
CalculatorTool.new,
DatabaseQueryTool.new
]
# ReAct provides iterative reasoning and tool usage
@agent = DSPy::ReAct.new(
AgentSignature,
tools: tools,
max_iterations: 5
)
end
def forward(task:)
# Agent will autonomously use tools to complete the task
@agent.forward(task: task)
end
end
# Tool definition example
class SearchTool < DSPy::Tool
def call(query:)
# Implement search functionality
results = perform_search(query)
{ results: results }
end
private
def perform_search(query)
# Actual search implementation
# Could call external API, database, etc.
["result1", "result2", "result3"]
end
end
# Module with state management
class StatefulModule < DSPy::Module
attr_reader :history
def initialize
super
@predictor = DSPy::ChainOfThought.new(StatefulSignature)
@history = []
end
def forward(input)
# Process with context from history
context = build_context_from_history
result = @predictor.forward(
input: input,
context: context
)
# Store in history
@history << {
input: input,
result: result,
timestamp: Time.now
}
result
end
def reset!
@history.clear
end
private
def build_context_from_history
@history.last(5).map { |h| h[:result][:summary] }.join("\n")
end
end
# Module that uses different LLMs for different tasks
class MultiModelModule < DSPy::Module
def initialize
super
# Fast, cheap model for simple classification
@fast_predictor = create_predictor(
'openai/gpt-4o-mini',
SimpleClassificationSignature
)
# Powerful model for complex analysis
@powerful_predictor = create_predictor(
'anthropic/claude-3-5-sonnet-20241022',
ComplexAnalysisSignature
)
end
def forward(input, use_complex: false)
if use_complex
@powerful_predictor.forward(input)
else
@fast_predictor.forward(input)
end
end
private
def create_predictor(model, signature)
lm = DSPy::LM.new(model, api_key: ENV["#{model.split('/').first.upcase}_API_KEY"])
DSPy::Predict.new(signature, lm: lm)
end
end
# Module with caching
class CachedModule < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(CachedSignature)
@cache = {}
end
def forward(input)
# Create cache key from input
cache_key = create_cache_key(input)
# Return cached result if available
if @cache.key?(cache_key)
puts "Cache hit for #{cache_key}"
return @cache[cache_key]
end
# Compute and cache result
result = @predictor.forward(input)
@cache[cache_key] = result
result
end
def clear_cache!
@cache.clear
end
private
def create_cache_key(input)
# Create deterministic hash from input
Digest::MD5.hexdigest(input.to_s)
end
end
# Usage Examples:
#
# Basic usage:
# basic = BasicModule.new
# result = basic.forward(field_name: "value")
#
# Chain of Thought:
# cot = ChainOfThoughtModule.new
# result = cot.forward(
# email_subject: "Can't log in",
# email_body: "I'm unable to access my account"
# )
# puts result[:reasoning]
#
# Multi-step pipeline:
# pipeline = MultiStepPipeline.new
# result = pipeline.forward(input_data)
#
# With error handling:
# robust = RobustModule.new
# begin
# result = robust.forward(input_data)
# rescue DSPy::ValidationError => e
# puts "Failed after retries: #{e.message}"
# end
#
# Agent with tools:
# agent = AgentModule.new
# result = agent.forward(task: "Find the population of Tokyo")
#
# Stateful processing:
# stateful = StatefulModule.new
# result1 = stateful.forward("First input")
# result2 = stateful.forward("Second input") # Has context from first
# stateful.reset! # Clear history
#
# With caching:
# cached = CachedModule.new
# result1 = cached.forward(input) # Computes result
# result2 = cached.forward(input) # Returns cached result


@@ -0,0 +1,143 @@
# frozen_string_literal: true
# Example DSPy Signature Template
# This template demonstrates best practices for creating type-safe signatures
class ExampleSignature < DSPy::Signature
# Clear, specific description of what this signature does
# Good: "Classify customer support emails into Technical, Billing, or General categories"
# Avoid: "Classify emails"
description "Describe what this signature accomplishes and what output it produces"
# Input fields: Define what data the LLM receives
input do
# Basic field with description
const :field_name, String, desc: "Clear description of this input field"
# Numeric fields
const :count, Integer, desc: "Number of items to process"
const :score, Float, desc: "Confidence score between 0.0 and 1.0"
# Boolean fields
const :is_active, T::Boolean, desc: "Whether the item is currently active"
# Array fields
const :tags, T::Array[String], desc: "List of tags associated with the item"
# Optional: Enum for constrained values
const :priority, T.enum(["Low", "Medium", "High"]), desc: "Priority level"
end
# Output fields: Define what data the LLM produces
output do
# Primary output
const :result, String, desc: "The main result of the operation"
# Classification result with enum
const :category, T.enum(["Technical", "Billing", "General"]),
desc: "Category classification - must be one of: Technical, Billing, General"
# Confidence/metadata
const :confidence, Float, desc: "Confidence score (0.0-1.0) for this classification"
# Optional reasoning (automatically added by ChainOfThought)
# const :reasoning, String, desc: "Step-by-step reasoning for the classification"
end
end
# Example with multimodal input (vision)
class VisionExampleSignature < DSPy::Signature
description "Analyze an image and answer questions about its content"
input do
const :image, DSPy::Image, desc: "The image to analyze"
const :question, String, desc: "Question about the image content"
end
output do
const :answer, String, desc: "Detailed answer to the question about the image"
const :confidence, Float, desc: "Confidence in the answer (0.0-1.0)"
end
end
# Example for complex analysis task
class SentimentAnalysisSignature < DSPy::Signature
description "Analyze the sentiment of text with nuanced emotion detection"
input do
const :text, String, desc: "The text to analyze for sentiment"
const :context, String, desc: "Additional context about the text source or situation"
end
output do
const :sentiment, T.enum(["Positive", "Negative", "Neutral", "Mixed"]),
desc: "Overall sentiment - must be Positive, Negative, Neutral, or Mixed"
const :emotions, T::Array[String],
desc: "List of specific emotions detected (e.g., joy, anger, sadness, fear)"
const :intensity, T.enum(["Low", "Medium", "High"]),
desc: "Intensity of the detected sentiment"
const :confidence, Float,
desc: "Confidence in the sentiment classification (0.0-1.0)"
end
end
# Example for code generation task
class CodeGenerationSignature < DSPy::Signature
description "Generate Ruby code based on natural language requirements"
input do
const :requirements, String,
desc: "Natural language description of what the code should do"
const :constraints, String,
desc: "Any specific requirements or constraints (e.g., libraries to use, style preferences)"
end
output do
const :code, String,
desc: "Complete, working Ruby code that fulfills the requirements"
const :explanation, String,
desc: "Brief explanation of how the code works and any important design decisions"
const :dependencies, T::Array[String],
desc: "List of required gems or dependencies"
end
end
# Usage Examples:
#
# Basic usage with Predict:
# predictor = DSPy::Predict.new(ExampleSignature)
# result = predictor.forward(
# field_name: "example value",
# count: 5,
# score: 0.85,
# is_active: true,
# tags: ["tag1", "tag2"],
# priority: "High"
# )
# puts result[:result]
# puts result[:category]
# puts result[:confidence]
#
# With Chain of Thought reasoning:
# predictor = DSPy::ChainOfThought.new(SentimentAnalysisSignature)
# result = predictor.forward(
# text: "I absolutely love this product! It exceeded all my expectations.",
# context: "Product review on e-commerce site"
# )
# puts result[:reasoning] # See the LLM's step-by-step thinking
# puts result[:sentiment]
# puts result[:emotions]
#
# With Vision:
# predictor = DSPy::Predict.new(VisionExampleSignature)
# result = predictor.forward(
# image: DSPy::Image.from_file("path/to/image.jpg"),
# question: "What objects are visible in this image?"
# )
# puts result[:answer]


@@ -0,0 +1,265 @@
# DSPy.rb Core Concepts
## Philosophy
DSPy.rb enables developers to **program LLMs, not prompt them**. Instead of manually crafting prompts, define application requirements through code using type-safe, composable modules.
## Signatures
Signatures define type-safe input/output contracts for LLM operations. They specify what data goes in and what data comes out, with runtime type checking.
### Basic Signature Structure
```ruby
class TaskSignature < DSPy::Signature
description "Brief description of what this signature does"
input do
const :field_name, String, desc: "Description of this input field"
const :another_field, Integer, desc: "Another input field"
end
output do
const :result_field, String, desc: "Description of the output"
const :confidence, Float, desc: "Confidence score (0.0-1.0)"
end
end
```
### Type Safety
Signatures support Sorbet types including:
- `String` - Text data
- `Integer`, `Float` - Numeric data
- `T::Boolean` - Boolean values
- `T::Array[Type]` - Arrays of specific types
- Custom enums and classes
### Field Descriptions
Always provide clear field descriptions using the `desc:` parameter. These descriptions:
- Guide the LLM on expected input/output format
- Serve as documentation for developers
- Improve prediction accuracy
## Modules
Modules are composable building blocks that use signatures to perform LLM operations. They can be chained together to create complex workflows.
### Basic Module Structure
```ruby
class MyModule < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(MySignature)
end
def forward(input_hash)
@predictor.forward(input_hash)
end
end
```
### Module Composition
Modules can call other modules to create pipelines:
```ruby
class ComplexWorkflow < DSPy::Module
def initialize
super
@step1 = FirstModule.new
@step2 = SecondModule.new
end
def forward(input)
result1 = @step1.forward(input)
result2 = @step2.forward(result1)
result2
end
end
```
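When modules are chained like this, the hash returned by one step has to carry the keys the next step reads. The following is a minimal sketch of that contract using plain-Ruby stand-ins: `StubClassifier`, `StubRouter`, and `SketchWorkflow` are hypothetical names, they do not inherit from `DSPy::Module`, and no LLM is called.

```ruby
# Hypothetical stand-ins for DSPy modules: each exposes #forward and returns
# a Hash, so the composition logic can be exercised without an LLM.
class StubClassifier
  def forward(input)
    billing = input[:email_subject].to_s.match?(/invoice|billing/i)
    input.merge(category: billing ? "Billing" : "General")
  end
end

class StubRouter
  TEAMS = { "Billing" => "billing_support", "General" => "front_desk" }.freeze

  def forward(input)
    # fetch raises KeyError when the previous step did not supply :category
    category = input.fetch(:category)
    { category: category, team: TEAMS.fetch(category) }
  end
end

class SketchWorkflow
  def initialize(steps)
    @steps = steps
  end

  def forward(input)
    # Feed each step's output into the next step
    @steps.reduce(input) { |data, step| step.forward(data) }
  end
end

workflow = SketchWorkflow.new([StubClassifier.new, StubRouter.new])
routed = workflow.forward(email_subject: "Billing question", email_body: "...")
```

Using `fetch` instead of `[]` makes a missing key fail at the step boundary rather than surfacing later as a `nil`.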
## Predictors
Predictors are the core execution engines that take signatures and perform LLM inference. DSPy.rb provides several predictor types.
### Predict
Basic LLM inference with type-safe inputs and outputs.
```ruby
predictor = DSPy::Predict.new(TaskSignature)
result = predictor.forward(field_name: "value", another_field: 42)
# Returns: { result_field: "...", confidence: 0.85 }
```
### ChainOfThought
Automatically adds a reasoning field to the output, improving accuracy for complex tasks.
```ruby
class EmailClassificationSignature < DSPy::Signature
description "Classify customer support emails"
input do
const :email_subject, String
const :email_body, String
end
output do
const :category, String # "Technical", "Billing", or "General"
const :priority, String # "High", "Medium", or "Low"
end
end
predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
result = predictor.forward(
email_subject: "Can't log in to my account",
email_body: "I've been trying to access my account for hours..."
)
# Returns: {
# reasoning: "This appears to be a technical issue...",
# category: "Technical",
# priority: "High"
# }
```
### ReAct
Tool-using agents with iterative reasoning. Enables autonomous problem-solving by allowing the LLM to use external tools.
```ruby
class SearchTool < DSPy::Tool
def call(query:)
# Perform search and return results
{ results: search_database(query) }
end
end
predictor = DSPy::ReAct.new(
TaskSignature,
tools: [SearchTool.new],
max_iterations: 5
)
```
### CodeAct
Dynamic code generation for solving problems programmatically. Requires the optional `dspy-code_act` gem.
```ruby
predictor = DSPy::CodeAct.new(TaskSignature)
result = predictor.forward(task: "Calculate the factorial of 5")
# The LLM generates and executes Ruby code to solve the task
```
## Multimodal Support
DSPy.rb supports vision capabilities across compatible models using the unified `DSPy::Image` interface.
```ruby
class VisionSignature < DSPy::Signature
description "Describe what's in an image"
input do
const :image, DSPy::Image
const :question, String
end
output do
const :description, String
end
end
predictor = DSPy::Predict.new(VisionSignature)
result = predictor.forward(
image: DSPy::Image.from_file("path/to/image.jpg"),
question: "What objects are visible in this image?"
)
```
### Image Input Methods
```ruby
# From file path
DSPy::Image.from_file("path/to/image.jpg")
# From URL (OpenAI only)
DSPy::Image.from_url("https://example.com/image.jpg")
# From base64-encoded data
DSPy::Image.from_base64(base64_string, mime_type: "image/jpeg")
```
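For the base64 path, the string has to be produced from the raw image bytes first. A small sketch using only core Ruby follows; `encode_image_bytes`, `mime_type_for`, and the `MIME_TYPES` table are hypothetical helpers introduced here, and the `DSPy::Image.from_base64` call appears only in a comment, exactly as written above.

```ruby
# Encode raw bytes as strict (single-line) Base64 using Array#pack from core Ruby.
def encode_image_bytes(bytes)
  [bytes].pack("m0")
end

# Small, illustrative extension-to-MIME lookup; extend it as needed.
MIME_TYPES = {
  ".jpg" => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png" => "image/png"
}.freeze

def mime_type_for(path)
  MIME_TYPES.fetch(File.extname(path).downcase) do
    raise ArgumentError, "unsupported image type: #{path}"
  end
end

# With a real file on disk:
#   path = "path/to/image.jpg"
#   base64_string = encode_image_bytes(File.binread(path))
#   DSPy::Image.from_base64(base64_string, mime_type: mime_type_for(path))
```

`File.binread` is used in the comment so the bytes are read without any encoding conversion.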
## Best Practices
### 1. Clear Signature Descriptions
Always provide clear, specific descriptions for signatures and fields:
```ruby
# Good
description "Classify customer support emails into Technical, Billing, or General categories"
# Avoid
description "Classify emails"
```
### 2. Type Safety
Use specific types rather than generic String when possible:
```ruby
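# Good - Use an enum class for constrained outputs
# (current sorbet-runtime models enums as T::Enum subclasses)
class Category < T::Enum
enums do
Technical = new("Technical")
Billing = new("Billing")
General = new("General")
end
end
output do
const :category, Category
end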
# Less ideal - Generic string
output do
const :category, String, desc: "Must be Technical, Billing, or General"
end
```
### 3. Composable Architecture
Build complex workflows from simple, reusable modules:
```ruby
class EmailPipeline < DSPy::Module
def initialize
super
@classifier = EmailClassifier.new
@prioritizer = EmailPrioritizer.new
@responder = EmailResponder.new
end
def forward(email)
classification = @classifier.forward(email)
priority = @prioritizer.forward(classification)
@responder.forward(classification.merge(priority))
end
end
```
### 4. Error Handling
Always handle potential type validation errors:
```ruby
begin
result = predictor.forward(input_data)
rescue DSPy::ValidationError => e
# Handle validation error
logger.error "Invalid output from LLM: #{e.message}"
end
```
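Logging the error is often not enough: the caller still needs something to work with. One option is to return a conservative default that is flagged for review. The sketch below shows that pattern in plain Ruby; `SketchValidationError`, `AlwaysInvalidPredictor`, and `forward_with_fallback` are hypothetical names, and in real code the `error_class:` argument would be whichever validation error the library raises.

```ruby
# Stand-in error class so the sketch runs without the dspy gem.
class SketchValidationError < StandardError; end

# Calls predictor.forward and returns `fallback` if the given error is raised.
# `on_error` is an optional callable that receives the exception (for logging).
def forward_with_fallback(predictor, input, fallback:, error_class: SketchValidationError, on_error: nil)
  predictor.forward(input)
rescue error_class => e
  on_error&.call(e)
  fallback
end

# Hypothetical predictor that always fails validation.
class AlwaysInvalidPredictor
  def forward(_input)
    raise SketchValidationError, "category was not a String"
  end
end

seen_errors = []
safe_result = forward_with_fallback(
  AlwaysInvalidPredictor.new,
  { email_subject: "Hi" },
  fallback: { category: "General", needs_review: true },
  on_error: ->(error) { seen_errors << error.message }
)
```

Only the named error class is rescued, so unrelated failures such as network errors still propagate.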
## Limitations
Current constraints to be aware of:
- No streaming support (single-request processing only)
- Limited multimodal support through Ollama for local deployments
- Vision capabilities vary by provider (see providers.md for compatibility matrix)


@@ -0,0 +1,623 @@
# DSPy.rb Testing, Optimization & Observability
## Testing
DSPy.rb enables standard RSpec testing patterns for LLM logic, making your AI applications testable and maintainable.
### Basic Testing Setup
```ruby
require 'rspec'
require 'dspy'
RSpec.describe EmailClassifier do
before do
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
end
end
describe '#forward' do
it 'classifies technical support emails correctly' do
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: "Can't log in",
email_body: "I'm unable to access my account"
)
expect(result[:category]).to eq('Technical')
# `be_in` relies on ActiveSupport's Object#in?; `include` works with plain RSpec
expect(['High', 'Medium', 'Low']).to include(result[:priority])
end
end
end
```
### Mocking LLM Responses
Test your modules without making actual API calls:
```ruby
RSpec.describe MyModule do
it 'handles mock responses correctly' do
# Create a mock predictor that returns predetermined results
mock_predictor = instance_double(DSPy::Predict)
allow(mock_predictor).to receive(:forward).and_return({
category: 'Technical',
priority: 'High',
confidence: 0.95
})
# Inject mock into your module
module_instance = MyModule.new
module_instance.instance_variable_set(:@predictor, mock_predictor)
result = module_instance.forward(input: 'test data')
expect(result[:category]).to eq('Technical')
end
end
```
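An alternative to `instance_variable_set` is to pass the predictor in through the constructor, which keeps the test from reaching into private state. The sketch below assumes you control the module's constructor; `InjectableClassifier` and `StubPredictor` are hypothetical plain-Ruby classes (a real module would inherit from `DSPy::Module` and call `super`).

```ruby
# Hypothetical module that receives its predictor instead of building it.
class InjectableClassifier
  def initialize(predictor:)
    @predictor = predictor
  end

  def forward(input)
    @predictor.forward(input)
  end
end

# Hand-rolled test double: returns a canned response and records its inputs.
class StubPredictor
  attr_reader :calls

  def initialize(response)
    @response = response
    @calls = []
  end

  def forward(input)
    @calls << input
    @response
  end
end

stub = StubPredictor.new({ category: "Technical", priority: "High" })
classifier = InjectableClassifier.new(predictor: stub)
stubbed_result = classifier.forward(email_subject: "Can't log in")
```

Recording the calls also lets a test assert on what the module actually sent to the predictor.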
### Testing Type Safety
Verify that signatures enforce type constraints:
```ruby
RSpec.describe EmailClassificationSignature do
it 'validates output types' do
predictor = DSPy::Predict.new(EmailClassificationSignature)
# This should work
result = predictor.forward(
email_subject: 'Test',
email_body: 'Test body'
)
expect(result[:category]).to be_a(String)
# Test that invalid types are caught
expect {
# Simulate LLM returning invalid type
predictor.send(:validate_output, { category: 123 })
}.to raise_error(DSPy::ValidationError)
end
end
```
### Testing Edge Cases
Always test boundary conditions and error scenarios:
```ruby
RSpec.describe EmailClassifier do
it 'handles empty emails' do
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: '',
email_body: ''
)
# Define expected behavior for edge case
expect(result[:category]).to eq('General')
end
it 'handles very long emails' do
long_body = 'word ' * 10000
classifier = EmailClassifier.new
expect {
classifier.forward(
email_subject: 'Test',
email_body: long_body
)
}.not_to raise_error
end
it 'handles special characters' do
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: 'Test <script>alert("xss")</script>',
email_body: 'Body with émojis 🎉 and spëcial çharacters'
)
expect(['Technical', 'Billing', 'General']).to include(result[:category])
end
end
```
### Integration Testing
Test complete workflows end-to-end:
```ruby
RSpec.describe EmailProcessingPipeline do
it 'processes email through complete pipeline' do
pipeline = EmailProcessingPipeline.new
result = pipeline.forward(
email_subject: 'Billing question',
email_body: 'How do I update my payment method?'
)
# Verify the complete pipeline output
expect(result[:classification]).to eq('Billing')
expect(result[:priority]).to eq('Medium')
expect(result[:suggested_response]).to include('payment')
expect(result[:assigned_team]).to eq('billing_support')
end
end
```
### VCR for Deterministic Tests
Use VCR to record and replay API responses:
```ruby
require 'vcr'
VCR.configure do |config|
config.cassette_library_dir = 'spec/vcr_cassettes'
config.hook_into :webmock
config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
end
RSpec.describe EmailClassifier do
it 'classifies emails consistently' do
VCR.use_cassette('email_classification') do
classifier = EmailClassifier.new
result = classifier.forward(
email_subject: 'Test subject',
email_body: 'Test body'
)
expect(result[:category]).to eq('Technical')
end
end
end
```
## Optimization
DSPy.rb provides powerful optimization capabilities to automatically improve your prompts and modules.
### MIPROv2 Optimization
MIPROv2 is an advanced multi-prompt optimization technique that uses bootstrap sampling, instruction generation, and Bayesian optimization.
```ruby
require 'dspy/mipro'
# Define your module to optimize
class EmailClassifier < DSPy::Module
def initialize
super
@predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
end
def forward(input)
@predictor.forward(input)
end
end
# Prepare training data
training_examples = [
{
input: { email_subject: "Can't log in", email_body: "Password reset not working" },
expected_output: { category: 'Technical', priority: 'High' }
},
{
input: { email_subject: "Billing question", email_body: "How much does premium cost?" },
expected_output: { category: 'Billing', priority: 'Medium' }
},
# Add more examples...
]
# Define evaluation metric
def accuracy_metric(example, prediction)
(example[:expected_output][:category] == prediction[:category]) ? 1.0 : 0.0
end
# Run optimization
optimizer = DSPy::MIPROv2.new(
metric: method(:accuracy_metric),
num_candidates: 10,
num_threads: 4
)
optimized_module = optimizer.compile(
EmailClassifier.new,
trainset: training_examples
)
# Use optimized module
result = optimized_module.forward(
email_subject: "New email",
email_body: "New email content"
)
```
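Scoring an optimized module on the same examples it was tuned on tends to overstate accuracy, so it can help to hold some examples back. The following sketch splits a list of examples reproducibly; `split_examples` is a hypothetical helper, and whether the optimizer itself accepts a separate validation set is not assumed here (the held-out part is meant for your own evaluation with the metric).

```ruby
# Splits examples into :train and :validation parts.
# A fixed seed makes the shuffle, and therefore the split, reproducible.
def split_examples(examples, validation_ratio: 0.2, seed: 42)
  unless (0.0..1.0).cover?(validation_ratio)
    raise ArgumentError, "validation_ratio must be between 0 and 1"
  end

  shuffled = examples.shuffle(random: Random.new(seed))
  validation_size = (examples.size * validation_ratio).round
  {
    validation: shuffled.first(validation_size),
    train: shuffled.drop(validation_size)
  }
end

# Placeholder examples; real ones would follow the input/expected_output shape above.
sample_examples = (1..10).map { |i| { input: { email_subject: "Email #{i}" } } }
parts = split_examples(sample_examples)
```

The `:train` part would then be passed as `trainset:`, while the `:validation` part is scored separately.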
### Bootstrap Few-Shot Learning
Automatically generate few-shot examples from your training data:
```ruby
require 'dspy/teleprompt'
# Create a teleprompter for few-shot optimization
teleprompter = DSPy::BootstrapFewShot.new(
metric: method(:accuracy_metric),
max_bootstrapped_demos: 5,
max_labeled_demos: 3
)
# Compile the optimized module
optimized = teleprompter.compile(
MyModule.new,
trainset: training_examples
)
```
### Custom Optimization Metrics
Define custom metrics for your specific use case:
```ruby
def custom_metric(example, prediction)
score = 0.0
# Category accuracy (60% weight)
score += 0.6 if example[:expected_output][:category] == prediction[:category]
# Priority accuracy (40% weight)
score += 0.4 if example[:expected_output][:priority] == prediction[:priority]
score
end
# Use in optimization
optimizer = DSPy::MIPROv2.new(
metric: method(:custom_metric),
num_candidates: 10
)
```
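If several weighted metrics are needed, the weights can be pulled out into data. The sketch below builds a callable from a weights hash and normalizes by the total weight; `build_weighted_metric` is a hypothetical helper, and it assumes the optimizer accepts any object that responds to `call` (the examples above pass a `Method` object, which does).

```ruby
# Returns a lambda that scores a prediction between 0.0 and 1.0.
# Each field contributes its weight when the prediction matches the expected value.
def build_weighted_metric(weights)
  total = weights.values.sum.to_f
  raise ArgumentError, "weights must sum to a positive number" unless total.positive?

  lambda do |example, prediction|
    matched = weights.sum do |field, weight|
      example[:expected_output][field] == prediction[field] ? weight : 0
    end
    matched / total
  end
end

# Same 60/40 split as custom_metric above, expressed as integer weights.
CATEGORY_PRIORITY_METRIC = build_weighted_metric(category: 6, priority: 4)

example = { expected_output: { category: "Billing", priority: "Medium" } }
partial_score = CATEGORY_PRIORITY_METRIC.call(example, { category: "Billing", priority: "Low" })
```

Integer weights avoid accumulating floating-point error before the final division.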
### A/B Testing Different Approaches
Compare different module implementations:
```ruby
# Approach A: ChainOfThought
class ApproachA < DSPy::Module
def initialize
super
@predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
end
def forward(input)
@predictor.forward(input)
end
end
# Approach B: ReAct with tools
class ApproachB < DSPy::Module
def initialize
super
@predictor = DSPy::ReAct.new(
EmailClassificationSignature,
tools: [KnowledgeBaseTool.new]
)
end
def forward(input)
@predictor.forward(input)
end
end
# Evaluate both approaches
def evaluate_approach(approach_class, test_set)
approach = approach_class.new
scores = test_set.map do |example|
prediction = approach.forward(example[:input])
accuracy_metric(example, prediction)
end
scores.sum / scores.size
end
approach_a_score = evaluate_approach(ApproachA, test_examples)
approach_b_score = evaluate_approach(ApproachB, test_examples)
puts "Approach A accuracy: #{approach_a_score}"
puts "Approach B accuracy: #{approach_b_score}"
```
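The comparison logic itself can be checked without any API calls by running it against stand-in approaches. The sketch below does that and also guards against an empty test set, which would otherwise raise a `ZeroDivisionError`; `FixedCategoryApproach`, `EXACT_CATEGORY`, `mean_score`, and `SAMPLE_TEST_SET` are hypothetical names used only for illustration.

```ruby
# Stand-in approach that always predicts the same category.
class FixedCategoryApproach
  def initialize(category)
    @category = category
  end

  def forward(_input)
    { category: @category }
  end
end

# 1.0 when the predicted category matches the expected one, otherwise 0.0.
EXACT_CATEGORY = lambda do |example, prediction|
  example[:expected_output][:category] == prediction[:category] ? 1.0 : 0.0
end

# Average metric value over the test set; 0.0 for an empty set.
def mean_score(approach, test_set, metric)
  return 0.0 if test_set.empty?

  total = test_set.sum { |example| metric.call(example, approach.forward(example[:input])) }
  total / test_set.size
end

SAMPLE_TEST_SET = [
  { input: { email_subject: "Can't log in" }, expected_output: { category: "Technical" } },
  { input: { email_subject: "Invoice" }, expected_output: { category: "Billing" } },
  { input: { email_subject: "App crashes" }, expected_output: { category: "Technical" } },
  { input: { email_subject: "Reset link" }, expected_output: { category: "Technical" } }
].freeze

technical_score = mean_score(FixedCategoryApproach.new("Technical"), SAMPLE_TEST_SET, EXACT_CATEGORY)
billing_score = mean_score(FixedCategoryApproach.new("Billing"), SAMPLE_TEST_SET, EXACT_CATEGORY)
```

A trivial baseline like "always predict the majority class" is also a useful reference point when judging whether either real approach is worth its cost.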
## Observability
Track your LLM application's performance, token usage, and behavior in production.
### OpenTelemetry Integration
DSPy.rb automatically integrates with OpenTelemetry when configured:
```ruby
require 'opentelemetry/sdk'
require 'dspy'
# Configure OpenTelemetry
OpenTelemetry::SDK.configure do |c|
c.service_name = 'my-dspy-app'
c.use_all # Use all available instrumentation
end
# DSPy automatically creates traces for predictions
predictor = DSPy::Predict.new(MySignature)
result = predictor.forward(input: 'data')
# Traces are automatically sent to your OpenTelemetry collector
```
### Langfuse Integration
Track detailed LLM execution traces with Langfuse:
```ruby
require 'dspy/langfuse'
# Configure Langfuse
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
c.langfuse = {
public_key: ENV['LANGFUSE_PUBLIC_KEY'],
secret_key: ENV['LANGFUSE_SECRET_KEY'],
host: ENV['LANGFUSE_HOST'] || 'https://cloud.langfuse.com'
}
end
# All predictions are automatically traced
predictor = DSPy::Predict.new(MySignature)
result = predictor.forward(input: 'data')
# View detailed traces in Langfuse dashboard
```
### Manual Token Tracking
Track token usage without external services:
```ruby
class TokenTracker
def initialize
@total_tokens = 0
@request_count = 0
end
def track_prediction(predictor, input)
start_time = Time.now
result = predictor.forward(input)
duration = Time.now - start_time
# Get token usage from response metadata
tokens = result.metadata[:usage][:total_tokens] rescue 0
@total_tokens += tokens
@request_count += 1
puts "Request ##{@request_count}: #{tokens} tokens in #{duration}s"
puts "Total tokens used: #{@total_tokens}"
result
end
end
# Usage
tracker = TokenTracker.new
predictor = DSPy::Predict.new(MySignature)
result = tracker.track_prediction(predictor, { input: 'data' })
```
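Two details in the tracker above are worth tightening: `Time.now` can jump when the system clock is adjusted, and the inline `rescue 0` hides every error, not just missing metadata. The sketch below uses a monotonic clock and an explicit lookup instead. `measure_elapsed`, `total_tokens_from`, and `SketchResult` are hypothetical names, and the `metadata[:usage][:total_tokens]` shape is taken from the example above rather than confirmed against the library.

```ruby
# Runs the block and returns [block_value, elapsed_seconds] using a monotonic clock.
def measure_elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  value = yield
  [value, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Reads the token count if present; returns 0 when metadata is absent.
# Assumes :usage, when present, is itself a Hash.
def total_tokens_from(result)
  return 0 unless result.respond_to?(:metadata)

  metadata = result.metadata
  return 0 unless metadata.is_a?(Hash)

  metadata.dig(:usage, :total_tokens).to_i
end

# Stand-in for a prediction result that carries metadata.
SketchResult = Struct.new(:metadata)

result, seconds = measure_elapsed { SketchResult.new({ usage: { total_tokens: 42 } }) }
tokens = total_tokens_from(result)
```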
### Custom Logging
Add detailed logging to your modules:
```ruby
require 'logger'
class EmailClassifier < DSPy::Module
def initialize
super
@predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
@logger = Logger.new(STDOUT)
end
def forward(input)
@logger.info "Classifying email: #{input[:email_subject]}"
start_time = Time.now
result = @predictor.forward(input)
duration = Time.now - start_time
@logger.info "Classification: #{result[:category]} (#{duration}s)"
if result[:reasoning]
@logger.debug "Reasoning: #{result[:reasoning]}"
end
result
rescue => e
@logger.error "Classification failed: #{e.message}"
raise
end
end
```
### Performance Monitoring
Monitor latency and performance metrics:
```ruby
class PerformanceMonitor
def initialize
@metrics = {
total_requests: 0,
total_duration: 0.0,
errors: 0,
success_count: 0
}
end
def monitor_request
start_time = Time.now
@metrics[:total_requests] += 1
begin
result = yield
@metrics[:success_count] += 1
result
rescue => e
@metrics[:errors] += 1
raise
ensure
duration = Time.now - start_time
@metrics[:total_duration] += duration
if @metrics[:total_requests] % 10 == 0
print_stats
end
end
end
def print_stats
avg_duration = @metrics[:total_duration] / @metrics[:total_requests]
success_rate = @metrics[:success_count].to_f / @metrics[:total_requests]
puts "\n=== Performance Stats ==="
puts "Total requests: #{@metrics[:total_requests]}"
puts "Average duration: #{avg_duration.round(3)}s"
puts "Success rate: #{(success_rate * 100).round(2)}%"
puts "Errors: #{@metrics[:errors]}"
puts "========================\n"
end
end
# Usage
monitor = PerformanceMonitor.new
predictor = DSPy::Predict.new(MySignature)
result = monitor.monitor_request do
predictor.forward(input: 'data')
end
```
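An average can hide slow outliers, which matter for LLM calls where a few requests may take many times longer than the rest. If individual durations are kept, a percentile can be reported alongside the mean. The sketch below uses the nearest-rank method; `percentile` is a hypothetical helper and is not part of `PerformanceMonitor` above.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?
  raise ArgumentError, "pct must be between 0 and 100" unless (0..100).cover?(pct)

  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Durations in seconds, e.g. collected inside monitor_request.
durations = [0.8, 1.1, 0.9, 4.2, 1.0, 0.7, 1.3, 0.9, 1.2, 1.0]
median_latency = percentile(durations, 50)
p95_latency = percentile(durations, 95)
```

Here a single 4.2 s request dominates the 95th percentile while barely moving the median.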
### Error Rate Tracking
Monitor and alert on error rates:
```ruby
class ErrorRateMonitor
def initialize(alert_threshold: 0.1)
@alert_threshold = alert_threshold
@recent_results = []
@window_size = 100
end
def track_result(success:)
@recent_results << success
@recent_results.shift if @recent_results.size > @window_size
error_rate = calculate_error_rate
alert_if_needed(error_rate)
error_rate
end
private
def calculate_error_rate
failures = @recent_results.count(false)
failures.to_f / @recent_results.size
end
def alert_if_needed(error_rate)
if error_rate > @alert_threshold
puts "⚠️ ALERT: Error rate #{(error_rate * 100).round(2)}% exceeds threshold!"
# Send notification, page oncall, etc.
end
end
end
```
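With the monitor above, a failure on the very first request yields a 100% error rate and triggers an alert immediately. One way to avoid that is to require a minimum number of samples before a rate is reported. The sketch below is a variant under that assumption; `GuardedErrorRate` and its `min_samples:` parameter are hypothetical and not part of the original class.

```ruby
# Sliding-window error rate that stays silent until enough samples exist.
class GuardedErrorRate
  def initialize(window_size: 100, min_samples: 20, alert_threshold: 0.1)
    @window_size = window_size
    @min_samples = min_samples
    @alert_threshold = alert_threshold
    @results = []
  end

  # success must be true or false
  def track(success:)
    @results << success
    @results.shift if @results.size > @window_size
    self
  end

  # nil until at least min_samples results have been recorded
  def error_rate
    return nil if @results.size < @min_samples

    @results.count(false).to_f / @results.size
  end

  def alert?
    rate = error_rate
    !rate.nil? && rate > @alert_threshold
  end
end

monitor = GuardedErrorRate.new(min_samples: 5)
monitor.track(success: false)
early_alert = monitor.alert? # still false: only one sample so far
```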
## Best Practices
### 1. Start with Tests
Write tests before optimizing:
```ruby
# Define test cases first
test_cases = [
{ input: {...}, expected_output: {...} },
# More test cases...
]
# Ensure baseline functionality
# (`module` is a reserved word in Ruby, so the instance needs another name)
classifier = EmailClassifier.new
test_cases.each do |tc|
result = classifier.forward(tc[:input])
assert result[:category] == tc[:expected_output][:category]
end
# Then optimize
optimized = optimizer.compile(classifier, trainset: test_cases)
```
### 2. Use Meaningful Metrics
Define metrics that align with business goals:
```ruby
def business_aligned_metric(example, prediction)
# High-priority errors are more costly
if example[:expected_output][:priority] == 'High'
return prediction[:priority] == 'High' ? 1.0 : 0.0
else
return prediction[:category] == example[:expected_output][:category] ? 0.8 : 0.0
end
end
```
### 3. Monitor in Production
Always track production performance:
```ruby
class ProductionModule < DSPy::Module
def initialize
super
@predictor = DSPy::ChainOfThought.new(MySignature)
@monitor = PerformanceMonitor.new
@error_tracker = ErrorRateMonitor.new
end
def forward(input)
@monitor.monitor_request do
result = @predictor.forward(input)
@error_tracker.track_result(success: true)
result
rescue => e
@error_tracker.track_result(success: false)
raise
end
end
end
```
### 4. Version Your Modules
Track which version of your module is deployed:
```ruby
class EmailClassifierV2 < DSPy::Module
VERSION = '2.1.0'
def initialize
super
@predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
end
def forward(input)
result = @predictor.forward(input)
result.merge(model_version: VERSION)
end
end
```


@@ -0,0 +1,338 @@
# DSPy.rb LLM Providers
## Supported Providers
DSPy.rb provides unified support across multiple LLM providers through adapter gems that automatically load when installed.
### Provider Overview
- **OpenAI**: GPT-4, GPT-4o, GPT-4o-mini, GPT-3.5-turbo
- **Anthropic**: Claude 3 family (Sonnet, Opus, Haiku), Claude 3.5 Sonnet
- **Google Gemini**: Gemini 1.5 Pro, Gemini 1.5 Flash, other versions
- **Ollama**: Local model support via OpenAI compatibility layer
- **OpenRouter**: Unified multi-provider API for 200+ models
## Configuration
### Basic Setup
```ruby
require 'dspy'
DSPy.configure do |c|
c.lm = DSPy::LM.new('provider/model-name', api_key: ENV['API_KEY'])
end
```
### OpenAI Configuration
**Required gem**: `dspy-openai`
```ruby
DSPy.configure do |c|
# Choose one model; each assignment below replaces the previous one
# GPT-4o Mini (recommended for development)
c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# GPT-4o (more capable)
c.lm = DSPy::LM.new('openai/gpt-4o', api_key: ENV['OPENAI_API_KEY'])
# GPT-4 Turbo
c.lm = DSPy::LM.new('openai/gpt-4-turbo', api_key: ENV['OPENAI_API_KEY'])
end
```
**Environment variable**: `OPENAI_API_KEY`
### Anthropic Configuration
**Required gem**: `dspy-anthropic`
```ruby
DSPy.configure do |c|
# Choose one model; each assignment below replaces the previous one
# Claude 3.5 Sonnet (most capable of the models listed here)
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Opus (most capable in Claude 3 family)
c.lm = DSPy::LM.new('anthropic/claude-3-opus-20240229',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Sonnet (balanced)
c.lm = DSPy::LM.new('anthropic/claude-3-sonnet-20240229',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Haiku (fast, cost-effective)
c.lm = DSPy::LM.new('anthropic/claude-3-haiku-20240307',
api_key: ENV['ANTHROPIC_API_KEY'])
end
```
**Environment variable**: `ANTHROPIC_API_KEY`
### Google Gemini Configuration
**Required gem**: `dspy-gemini`
```ruby
DSPy.configure do |c|
# Choose one model; each assignment below replaces the previous one
# Gemini 1.5 Pro (most capable)
c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
api_key: ENV['GOOGLE_API_KEY'])
# Gemini 1.5 Flash (faster, cost-effective)
c.lm = DSPy::LM.new('gemini/gemini-1.5-flash',
api_key: ENV['GOOGLE_API_KEY'])
end
```
**Environment variable**: `GOOGLE_API_KEY` or `GEMINI_API_KEY`
### Ollama Configuration
**Required gem**: None (uses OpenAI compatibility layer)
```ruby
DSPy.configure do |c|
# Choose one model; each assignment below replaces the previous one
# Local Ollama instance
c.lm = DSPy::LM.new('ollama/llama3.1',
base_url: 'http://localhost:11434')
# Other Ollama models
c.lm = DSPy::LM.new('ollama/mistral')
c.lm = DSPy::LM.new('ollama/codellama')
end
```
**Note**: Ensure Ollama is running locally: `ollama serve`
### OpenRouter Configuration
**Required gem**: `dspy-openai` (uses OpenAI adapter)
```ruby
DSPy.configure do |c|
# Choose one model; each assignment below replaces the previous one
# Access 200+ models through OpenRouter
c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
api_key: ENV['OPENROUTER_API_KEY'],
base_url: 'https://openrouter.ai/api/v1')
# Other examples
c.lm = DSPy::LM.new('openrouter/google/gemini-pro')
c.lm = DSPy::LM.new('openrouter/meta-llama/llama-3.1-70b-instruct')
end
```
**Environment variable**: `OPENROUTER_API_KEY`
## Provider Compatibility Matrix
### Feature Support
| Feature | OpenAI | Anthropic | Gemini | Ollama |
|---------|--------|-----------|--------|--------|
| Structured Output | ✅ | ✅ | ✅ | ✅ |
| Vision (Images) | ✅ | ✅ | ✅ | ⚠️ Limited |
| Image URLs | ✅ | ❌ | ❌ | ❌ |
| Tool Calling | ✅ | ✅ | ✅ | Varies |
| Streaming | ❌ | ❌ | ❌ | ❌ |
| Function Calling | ✅ | ✅ | ✅ | Varies |
**Legend**: ✅ Full support | ⚠️ Partial support | ❌ Not supported
### Vision Capabilities
**Image URLs**: Only OpenAI supports direct URL references. For other providers, load images as base64 or from files.
```ruby
# OpenAI - supports URLs
DSPy::Image.from_url("https://example.com/image.jpg")
# Anthropic, Gemini - use file or base64
DSPy::Image.from_file("path/to/image.jpg")
DSPy::Image.from_base64(base64_data, mime_type: "image/jpeg")
```
**Ollama**: Limited multimodal functionality. Check specific model capabilities.
## Advanced Configuration
### Custom Parameters
Pass provider-specific parameters during configuration:
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'],
temperature: 0.7,
max_tokens: 2000,
top_p: 0.9
)
end
```
### Multiple Providers
Use different models for different tasks:
```ruby
# Fast model for simple tasks
FAST_LM = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# Powerful model for complex tasks
POWERFUL_LM = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
# Constants are used because top-level local variables are not visible inside
# a class or method body. DSPy.configure changes the global setting, so the
# most recent call wins for any code that reads the global LM afterwards.
class SimpleClassifier < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = FAST_LM }
@predictor = DSPy::Predict.new(SimpleSignature)
end
end
class ComplexAnalyzer < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = POWERFUL_LM }
@predictor = DSPy::ChainOfThought.new(ComplexSignature)
end
end
```
### Per-Request Configuration
Override configuration for specific predictions:
```ruby
predictor = DSPy::Predict.new(MySignature)
# Use default configuration
result1 = predictor.forward(input: "data")
# Override temperature for this request
result2 = predictor.forward(
input: "data",
config: { temperature: 0.2 } # More deterministic
)
```
## Cost Optimization
### Model Selection Strategy
1. **Development**: Use cheaper, faster models (gpt-4o-mini, claude-3-haiku, gemini-1.5-flash)
2. **Production Simple Tasks**: Continue with cheaper models if quality is sufficient
3. **Production Complex Tasks**: Upgrade to more capable models (gpt-4o, claude-3.5-sonnet, gemini-1.5-pro)
4. **Local Development**: Use Ollama for privacy and zero API costs
### Example Cost-Conscious Setup
```ruby
# Development environment
if Rails.env.development?
DSPy.configure do |c|
c.lm = DSPy::LM.new('ollama/llama3.1') # Free, local
end
elsif Rails.env.test?
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini', # Cheap for testing
api_key: ENV['OPENAI_API_KEY'])
end
else # production
DSPy.configure do |c|
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
end
end
```
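The same choice can be expressed as a lookup table, which keeps the model names in one place and makes an unexpected environment (for example `staging`) fail loudly instead of silently using the production model. This is a sketch under that design assumption; `MODEL_BY_ENVIRONMENT`, `model_id_for`, and `provider_of` are hypothetical names, and the model IDs are the ones used above.

```ruby
MODEL_BY_ENVIRONMENT = {
  "development" => "ollama/llama3.1",
  "test" => "openai/gpt-4o-mini",
  "production" => "anthropic/claude-3-5-sonnet-20241022"
}.freeze

# Returns the model ID for an environment name, raising for unknown names.
def model_id_for(environment)
  MODEL_BY_ENVIRONMENT.fetch(environment.to_s) do
    raise ArgumentError, "no model configured for environment #{environment.inspect}"
  end
end

# The provider is the part before the first slash of a "provider/model" ID.
def provider_of(model_id)
  model_id.split("/", 2).first
end

# In an app this might be: model_id_for(Rails.env)
selected = model_id_for(:test)
selected_provider = provider_of(selected)
```

The resulting ID would then be passed to `DSPy::LM.new` together with the API key for that provider.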
## Provider-Specific Best Practices
### OpenAI
- Use `gpt-4o-mini` for development and simple tasks
- Use `gpt-4o` for production complex tasks
- Only provider in the compatibility matrix that accepts image URLs directly
- Excellent function calling capabilities
### Anthropic
- Claude 3.5 Sonnet is the most capable of the Anthropic models listed above
- Excellent for complex reasoning and analysis
- Strong safety features and helpful outputs
- Requires base64 for images (no URL support)
### Google Gemini
- Gemini 1.5 Pro for complex tasks, Flash for speed
- Strong multimodal capabilities
- Good balance of cost and performance
- Requires base64 for images
### Ollama
- Best for privacy-sensitive applications
- Zero API costs
- Requires local hardware resources
- Limited multimodal support depending on model
- Good for development and testing
## Troubleshooting
### API Key Issues
```ruby
# Verify API key is set
if ENV['OPENAI_API_KEY'].nil?
raise "OPENAI_API_KEY environment variable not set"
end
# Test connection
begin
DSPy.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY']) }
predictor = DSPy::Predict.new(TestSignature)
predictor.forward(test: "data")
puts "✅ Connection successful"
rescue => e
puts "❌ Connection failed: #{e.message}"
end
```
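A `nil?` check misses keys that are set but empty, which is common when a variable is declared in a `.env` file without a value. The sketch below checks several variables at once and reports all missing ones together; `fetch_required_env` is a hypothetical helper, and the `env:` parameter exists only so the check can be exercised with a plain Hash.

```ruby
# Returns a Hash of the requested variables, or raises KeyError listing
# every variable that is unset or blank.
def fetch_required_env(*names, env: ENV)
  missing = names.select { |name| env[name].to_s.strip.empty? }
  unless missing.empty?
    raise KeyError, "missing environment variables: #{missing.join(', ')}"
  end

  names.to_h { |name| [name, env[name]] }
end

# Example with an explicit Hash standing in for ENV:
settings = fetch_required_env("OPENAI_API_KEY", env: { "OPENAI_API_KEY" => "sk-test" })
```

Running a check like this at boot surfaces configuration problems before the first prediction is attempted.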
### Rate Limiting
Handle rate limits gracefully:
```ruby
def call_with_retry(predictor, input, max_retries: 3)
retries = 0
begin
predictor.forward(input)
rescue RateLimitError # placeholder: use the rate-limit error class your provider adapter raises
retries += 1
if retries < max_retries
sleep(2 ** retries) # Exponential backoff
retry
else
raise
end
end
end
```
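The retry logic is easier to test when the error classes and the sleep function are passed in rather than hard-coded. The sketch below is a generic version under that assumption; `with_backoff` and `SketchRateLimitError` are hypothetical names, and the real error class depends on the provider adapter in use.

```ruby
# Stand-in error so the sketch runs on its own.
class SketchRateLimitError < StandardError; end

# Runs the block, retrying on the listed errors with exponential backoff.
# Delays are base_delay, 2 * base_delay, 4 * base_delay, ...
# The block receives the current attempt number (starting at 1).
def with_backoff(max_attempts: 3, base_delay: 1.0, retry_on: [SketchRateLimitError],
                 sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *retry_on
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Example: fails twice, then succeeds; delays are recorded instead of slept.
recorded_delays = []
outcome = with_backoff(sleeper: ->(seconds) { recorded_delays << seconds }) do |attempt|
  raise SketchRateLimitError, "slow down" if attempt < 3

  :ok
end
```

In production the default `sleeper` actually sleeps; adding random jitter to each delay is a common refinement when many workers retry at once.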
### Model Not Found
Ensure the correct gem is installed:
```bash
# For OpenAI
gem install dspy-openai
# For Anthropic
gem install dspy-anthropic
# For Gemini
gem install dspy-gemini
```