# DSPy.rb Core Concepts

## Philosophy

DSPy.rb enables developers to **program LLMs, not prompt them**. Instead of manually crafting prompts, define application requirements through code using type-safe, composable modules.

## Signatures

Signatures define type-safe input/output contracts for LLM operations. They specify what data goes in and what data comes out, with runtime type checking.

### Basic Signature Structure

```ruby
class TaskSignature < DSPy::Signature
  description "Brief description of what this signature does"

  input do
    const :field_name, String, desc: "Description of this input field"
    const :another_field, Integer, desc: "Another input field"
  end

  output do
    const :result_field, String, desc: "Description of the output"
    const :confidence, Float, desc: "Confidence score (0.0-1.0)"
  end
end
```

### Type Safety

Signatures support Sorbet types including:
- `String` - Text data
- `Integer`, `Float` - Numeric data
- `T::Boolean` - Boolean values
- `T::Array[Type]` - Arrays of specific types
- Custom enums and classes
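
Several of these types can appear together in one signature. A hypothetical sketch (the `Sentiment` enum and all field names are invented for illustration; `T::Enum` subclasses come from `sorbet-runtime`):

```ruby
# Hypothetical signature combining several Sorbet types.
class Sentiment < T::Enum
  enums do
    Positive = new
    Negative = new
    Neutral = new
  end
end

class ReviewAnalysis < DSPy::Signature
  description "Analyze a product review"

  input do
    const :review_text, String, desc: "Raw review text"
  end

  output do
    const :sentiment, Sentiment, desc: "Overall sentiment"
    const :keywords, T::Array[String], desc: "Key phrases mentioned"
    const :actionable, T::Boolean, desc: "Whether the review requests a change"
  end
end
```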

### Field Descriptions

Always provide clear field descriptions using the `desc:` parameter. These descriptions:
- Guide the LLM on expected input/output format
- Serve as documentation for developers
- Improve prediction accuracy

## Modules

Modules are composable building blocks that use signatures to perform LLM operations. They can be chained together to create complex workflows.

### Basic Module Structure

```ruby
class MyModule < DSPy::Module
  def initialize
    super
    @predictor = DSPy::Predict.new(MySignature)
  end

  def forward(input_hash)
    @predictor.forward(input_hash)
  end
end
```

### Module Composition

Modules can call other modules to create pipelines:

```ruby
class ComplexWorkflow < DSPy::Module
  def initialize
    super
    @step1 = FirstModule.new
    @step2 = SecondModule.new
  end

  def forward(input)
    result1 = @step1.forward(input)
    @step2.forward(result1)
  end
end
```

## Predictors

Predictors are the core execution engines that take signatures and perform LLM inference. DSPy.rb provides several predictor types.

### Predict

Basic LLM inference with type-safe inputs and outputs.

```ruby
predictor = DSPy::Predict.new(TaskSignature)
result = predictor.forward(field_name: "value", another_field: 42)
# Returns: { result_field: "...", confidence: 0.85 }
```

### ChainOfThought

Automatically adds a reasoning field to the output, improving accuracy for complex tasks.

```ruby
class EmailClassificationSignature < DSPy::Signature
  description "Classify customer support emails"

  input do
    const :email_subject, String
    const :email_body, String
  end

  output do
    const :category, String # "Technical", "Billing", or "General"
    const :priority, String # "High", "Medium", or "Low"
  end
end

predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
result = predictor.forward(
  email_subject: "Can't log in to my account",
  email_body: "I've been trying to access my account for hours..."
)
# Returns: {
#   reasoning: "This appears to be a technical issue...",
#   category: "Technical",
#   priority: "High"
# }
```

### ReAct

Tool-using agents with iterative reasoning. Enables autonomous problem-solving by allowing the LLM to use external tools.

```ruby
class SearchTool < DSPy::Tool
  def call(query:)
    # Perform search and return results
    { results: search_database(query) }
  end
end

predictor = DSPy::ReAct.new(
  TaskSignature,
  tools: [SearchTool.new],
  max_iterations: 5
)
```

### CodeAct

Dynamic code generation for solving problems programmatically. Requires the optional `dspy-code_act` gem.

```ruby
predictor = DSPy::CodeAct.new(TaskSignature)
result = predictor.forward(task: "Calculate the factorial of 5")
# The LLM generates and executes Ruby code to solve the task
```

## Multimodal Support

DSPy.rb supports vision capabilities across compatible models using the unified `DSPy::Image` interface.

```ruby
class VisionSignature < DSPy::Signature
  description "Describe what's in an image"

  input do
    const :image, DSPy::Image
    const :question, String
  end

  output do
    const :description, String
  end
end

predictor = DSPy::Predict.new(VisionSignature)
result = predictor.forward(
  image: DSPy::Image.from_file("path/to/image.jpg"),
  question: "What objects are visible in this image?"
)
```

### Image Input Methods

```ruby
# From file path
DSPy::Image.from_file("path/to/image.jpg")

# From URL (OpenAI only)
DSPy::Image.from_url("https://example.com/image.jpg")

# From base64-encoded data
DSPy::Image.from_base64(base64_string, mime_type: "image/jpeg")
```

## Best Practices

### 1. Clear Signature Descriptions

Always provide clear, specific descriptions for signatures and fields:

```ruby
# Good
description "Classify customer support emails into Technical, Billing, or General categories"

# Avoid
description "Classify emails"
```

### 2. Type Safety

Use specific types rather than generic String when possible:

```ruby
# Good - Use a T::Enum for constrained outputs
class Category < T::Enum
  enums do
    Technical = new
    Billing = new
    General = new
  end
end

output do
  const :category, Category
end

# Less ideal - Generic string
output do
  const :category, String, desc: "Must be Technical, Billing, or General"
end
```

### 3. Composable Architecture

Build complex workflows from simple, reusable modules:

```ruby
class EmailPipeline < DSPy::Module
  def initialize
    super
    @classifier = EmailClassifier.new
    @prioritizer = EmailPrioritizer.new
    @responder = EmailResponder.new
  end

  def forward(email)
    classification = @classifier.forward(email)
    priority = @prioritizer.forward(classification)
    @responder.forward(classification.merge(priority))
  end
end
```

### 4. Error Handling

Always handle potential type validation errors:

```ruby
begin
  result = predictor.forward(input_data)
rescue DSPy::ValidationError => e
  # Handle validation error
  logger.error "Invalid output from LLM: #{e.message}"
end
```

## Limitations

Current constraints to be aware of:
- No streaming support (single-request processing only)
- Limited multimodal support through Ollama for local deployments
- Vision capabilities vary by provider (see providers.md for compatibility matrix)

# DSPy.rb Testing, Optimization & Observability

## Testing

DSPy.rb enables standard RSpec testing patterns for LLM logic, making your AI applications testable and maintainable.

### Basic Testing Setup

```ruby
require 'rspec'
require 'dspy'

RSpec.describe EmailClassifier do
  before do
    DSPy.configure do |c|
      c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
    end
  end

  describe '#classify' do
    it 'classifies technical support emails correctly' do
      classifier = EmailClassifier.new
      result = classifier.forward(
        email_subject: "Can't log in",
        email_body: "I'm unable to access my account"
      )

      expect(result[:category]).to eq('Technical')
      expect(%w[High Medium Low]).to include(result[:priority])
    end
  end
end
```

### Mocking LLM Responses

Test your modules without making actual API calls:

```ruby
RSpec.describe MyModule do
  it 'handles mock responses correctly' do
    # Create a mock predictor that returns predetermined results
    mock_predictor = instance_double(DSPy::Predict)
    allow(mock_predictor).to receive(:forward).and_return({
      category: 'Technical',
      priority: 'High',
      confidence: 0.95
    })

    # Inject mock into your module
    module_instance = MyModule.new
    module_instance.instance_variable_set(:@predictor, mock_predictor)

    result = module_instance.forward(input: 'test data')
    expect(result[:category]).to eq('Technical')
  end
end
```

### Testing Type Safety

Verify that signatures enforce type constraints:

```ruby
RSpec.describe EmailClassificationSignature do
  it 'validates output types' do
    predictor = DSPy::Predict.new(EmailClassificationSignature)

    # This should work
    result = predictor.forward(
      email_subject: 'Test',
      email_body: 'Test body'
    )
    expect(result[:category]).to be_a(String)

    # Test that invalid types are caught
    expect {
      # Simulate LLM returning invalid type
      predictor.send(:validate_output, { category: 123 })
    }.to raise_error(DSPy::ValidationError)
  end
end
```

### Testing Edge Cases

Always test boundary conditions and error scenarios:

```ruby
RSpec.describe EmailClassifier do
  it 'handles empty emails' do
    classifier = EmailClassifier.new
    result = classifier.forward(
      email_subject: '',
      email_body: ''
    )
    # Define expected behavior for edge case
    expect(result[:category]).to eq('General')
  end

  it 'handles very long emails' do
    long_body = 'word ' * 10000
    classifier = EmailClassifier.new

    expect {
      classifier.forward(
        email_subject: 'Test',
        email_body: long_body
      )
    }.not_to raise_error
  end

  it 'handles special characters' do
    classifier = EmailClassifier.new
    result = classifier.forward(
      email_subject: 'Test <script>alert("xss")</script>',
      email_body: 'Body with émojis 🎉 and spëcial çharacters'
    )

    expect(%w[Technical Billing General]).to include(result[:category])
  end
end
```

### Integration Testing

Test complete workflows end-to-end:

```ruby
RSpec.describe EmailProcessingPipeline do
  it 'processes email through complete pipeline' do
    pipeline = EmailProcessingPipeline.new

    result = pipeline.forward(
      email_subject: 'Billing question',
      email_body: 'How do I update my payment method?'
    )

    # Verify the complete pipeline output
    expect(result[:classification]).to eq('Billing')
    expect(result[:priority]).to eq('Medium')
    expect(result[:suggested_response]).to include('payment')
    expect(result[:assigned_team]).to eq('billing_support')
  end
end
```

### VCR for Deterministic Tests

Use VCR to record and replay API responses:

```ruby
require 'vcr'

VCR.configure do |config|
  config.cassette_library_dir = 'spec/vcr_cassettes'
  config.hook_into :webmock
  config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
end

RSpec.describe EmailClassifier do
  it 'classifies emails consistently', :vcr do
    VCR.use_cassette('email_classification') do
      classifier = EmailClassifier.new
      result = classifier.forward(
        email_subject: 'Test subject',
        email_body: 'Test body'
      )

      expect(result[:category]).to eq('Technical')
    end
  end
end
```

## Optimization

DSPy.rb provides powerful optimization capabilities to automatically improve your prompts and modules.

### MIPROv2 Optimization

MIPROv2 is an advanced multi-prompt optimization technique that uses bootstrap sampling, instruction generation, and Bayesian optimization.

```ruby
require 'dspy/mipro'

# Define your module to optimize
class EmailClassifier < DSPy::Module
  def initialize
    super
    @predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
  end

  def forward(input)
    @predictor.forward(input)
  end
end

# Prepare training data
training_examples = [
  {
    input: { email_subject: "Can't log in", email_body: "Password reset not working" },
    expected_output: { category: 'Technical', priority: 'High' }
  },
  {
    input: { email_subject: "Billing question", email_body: "How much does premium cost?" },
    expected_output: { category: 'Billing', priority: 'Medium' }
  },
  # Add more examples...
]

# Define evaluation metric
def accuracy_metric(example, prediction)
  (example[:expected_output][:category] == prediction[:category]) ? 1.0 : 0.0
end

# Run optimization
optimizer = DSPy::MIPROv2.new(
  metric: method(:accuracy_metric),
  num_candidates: 10,
  num_threads: 4
)

optimized_module = optimizer.compile(
  EmailClassifier.new,
  trainset: training_examples
)

# Use optimized module
result = optimized_module.forward(
  email_subject: "New email",
  email_body: "New email content"
)
```

### Bootstrap Few-Shot Learning

Automatically generate few-shot examples from your training data:

```ruby
require 'dspy/teleprompt'

# Create a teleprompter for few-shot optimization
teleprompter = DSPy::BootstrapFewShot.new(
  metric: method(:accuracy_metric),
  max_bootstrapped_demos: 5,
  max_labeled_demos: 3
)

# Compile the optimized module
optimized = teleprompter.compile(
  MyModule.new,
  trainset: training_examples
)
```

### Custom Optimization Metrics

Define custom metrics for your specific use case:

```ruby
def custom_metric(example, prediction)
  score = 0.0

  # Category accuracy (60% weight)
  score += 0.6 if example[:expected_output][:category] == prediction[:category]

  # Priority accuracy (40% weight)
  score += 0.4 if example[:expected_output][:priority] == prediction[:priority]

  score
end

# Use in optimization
optimizer = DSPy::MIPROv2.new(
  metric: method(:custom_metric),
  num_candidates: 10
)
```
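
Because the metric is plain Ruby, it can be sanity-checked directly with hash literals before any optimization run (the example data below is made up):

```ruby
# Weighted metric from above: 0.6 for category match, 0.4 for priority match.
def custom_metric(example, prediction)
  score = 0.0
  score += 0.6 if example[:expected_output][:category] == prediction[:category]
  score += 0.4 if example[:expected_output][:priority] == prediction[:priority]
  score
end

example = { expected_output: { category: 'Billing', priority: 'High' } }

puts custom_metric(example, { category: 'Billing', priority: 'High' }) # 1.0
puts custom_metric(example, { category: 'Billing', priority: 'Low' })  # 0.6
puts custom_metric(example, { category: 'General', priority: 'High' }) # 0.4
```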

### A/B Testing Different Approaches

Compare different module implementations:

```ruby
# Approach A: ChainOfThought
class ApproachA < DSPy::Module
  def initialize
    super
    @predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
  end

  def forward(input)
    @predictor.forward(input)
  end
end

# Approach B: ReAct with tools
class ApproachB < DSPy::Module
  def initialize
    super
    @predictor = DSPy::ReAct.new(
      EmailClassificationSignature,
      tools: [KnowledgeBaseTool.new]
    )
  end

  def forward(input)
    @predictor.forward(input)
  end
end

# Evaluate both approaches
def evaluate_approach(approach_class, test_set)
  approach = approach_class.new
  scores = test_set.map do |example|
    prediction = approach.forward(example[:input])
    accuracy_metric(example, prediction)
  end
  scores.sum / scores.size
end

approach_a_score = evaluate_approach(ApproachA, test_examples)
approach_b_score = evaluate_approach(ApproachB, test_examples)

puts "Approach A accuracy: #{approach_a_score}"
puts "Approach B accuracy: #{approach_b_score}"
```

## Observability

Track your LLM application's performance, token usage, and behavior in production.

### OpenTelemetry Integration

DSPy.rb automatically integrates with OpenTelemetry when configured:

```ruby
require 'opentelemetry/sdk'
require 'dspy'

# Configure OpenTelemetry
OpenTelemetry::SDK.configure do |c|
  c.service_name = 'my-dspy-app'
  c.use_all # Use all available instrumentation
end

# DSPy automatically creates traces for predictions
predictor = DSPy::Predict.new(MySignature)
result = predictor.forward(input: 'data')
# Traces are automatically sent to your OpenTelemetry collector
```

### Langfuse Integration

Track detailed LLM execution traces with Langfuse:

```ruby
require 'dspy/langfuse'

# Configure Langfuse
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
  c.langfuse = {
    public_key: ENV['LANGFUSE_PUBLIC_KEY'],
    secret_key: ENV['LANGFUSE_SECRET_KEY'],
    host: ENV['LANGFUSE_HOST'] || 'https://cloud.langfuse.com'
  }
end

# All predictions are automatically traced
predictor = DSPy::Predict.new(MySignature)
result = predictor.forward(input: 'data')
# View detailed traces in Langfuse dashboard
```

### Manual Token Tracking

Track token usage without external services:

```ruby
class TokenTracker
  def initialize
    @total_tokens = 0
    @request_count = 0
  end

  def track_prediction(predictor, input)
    start_time = Time.now
    result = predictor.forward(input)
    duration = Time.now - start_time

    # Get token usage from response metadata (0 if the provider omits it)
    tokens = result.metadata&.dig(:usage, :total_tokens) || 0
    @total_tokens += tokens
    @request_count += 1

    puts "Request ##{@request_count}: #{tokens} tokens in #{duration}s"
    puts "Total tokens used: #{@total_tokens}"

    result
  end
end

# Usage
tracker = TokenTracker.new
predictor = DSPy::Predict.new(MySignature)

result = tracker.track_prediction(predictor, { input: 'data' })
```

### Custom Logging

Add detailed logging to your modules:

```ruby
require 'logger'

class EmailClassifier < DSPy::Module
  def initialize
    super
    @predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
    @logger = Logger.new(STDOUT)
  end

  def forward(input)
    @logger.info "Classifying email: #{input[:email_subject]}"

    start_time = Time.now
    result = @predictor.forward(input)
    duration = Time.now - start_time

    @logger.info "Classification: #{result[:category]} (#{duration}s)"

    if result[:reasoning]
      @logger.debug "Reasoning: #{result[:reasoning]}"
    end

    result
  rescue => e
    @logger.error "Classification failed: #{e.message}"
    raise
  end
end
```

### Performance Monitoring

Monitor latency and performance metrics:

```ruby
class PerformanceMonitor
  def initialize
    @metrics = {
      total_requests: 0,
      total_duration: 0.0,
      errors: 0,
      success_count: 0
    }
  end

  def monitor_request
    start_time = Time.now
    @metrics[:total_requests] += 1

    begin
      result = yield
      @metrics[:success_count] += 1
      result
    rescue
      @metrics[:errors] += 1
      raise
    ensure
      duration = Time.now - start_time
      @metrics[:total_duration] += duration

      if @metrics[:total_requests] % 10 == 0
        print_stats
      end
    end
  end

  def print_stats
    avg_duration = @metrics[:total_duration] / @metrics[:total_requests]
    success_rate = @metrics[:success_count].to_f / @metrics[:total_requests]

    puts "\n=== Performance Stats ==="
    puts "Total requests: #{@metrics[:total_requests]}"
    puts "Average duration: #{avg_duration.round(3)}s"
    puts "Success rate: #{(success_rate * 100).round(2)}%"
    puts "Errors: #{@metrics[:errors]}"
    puts "========================\n"
  end
end

# Usage
monitor = PerformanceMonitor.new
predictor = DSPy::Predict.new(MySignature)

result = monitor.monitor_request do
  predictor.forward(input: 'data')
end
```

### Error Rate Tracking

Monitor and alert on error rates:

```ruby
class ErrorRateMonitor
  def initialize(alert_threshold: 0.1)
    @alert_threshold = alert_threshold
    @recent_results = []
    @window_size = 100
  end

  def track_result(success:)
    @recent_results << success
    @recent_results.shift if @recent_results.size > @window_size

    error_rate = calculate_error_rate
    alert_if_needed(error_rate)

    error_rate
  end

  private

  def calculate_error_rate
    failures = @recent_results.count(false)
    failures.to_f / @recent_results.size
  end

  def alert_if_needed(error_rate)
    if error_rate > @alert_threshold
      puts "⚠️ ALERT: Error rate #{(error_rate * 100).round(2)}% exceeds threshold!"
      # Send notification, page oncall, etc.
    end
  end
end
```

## Best Practices

### 1. Start with Tests

Write tests before optimizing:

```ruby
# Define test cases first
test_cases = [
  { input: {...}, expected: {...} },
  # More test cases...
]

# Ensure baseline functionality (`classifier` is the module under test;
# note that `module` is a reserved keyword and cannot be a variable name)
test_cases.each do |tc|
  result = classifier.forward(tc[:input])
  assert result[:category] == tc[:expected][:category]
end

# Then optimize
optimized = optimizer.compile(classifier, trainset: test_cases)
```

### 2. Use Meaningful Metrics

Define metrics that align with business goals:

```ruby
def business_aligned_metric(example, prediction)
  # High-priority errors are more costly
  if example[:expected_output][:priority] == 'High'
    prediction[:priority] == 'High' ? 1.0 : 0.0
  else
    prediction[:category] == example[:expected_output][:category] ? 0.8 : 0.0
  end
end
```

### 3. Monitor in Production

Always track production performance:

```ruby
class ProductionModule < DSPy::Module
  def initialize
    super
    @predictor = DSPy::ChainOfThought.new(MySignature)
    @monitor = PerformanceMonitor.new
    @error_tracker = ErrorRateMonitor.new
  end

  def forward(input)
    @monitor.monitor_request do
      result = @predictor.forward(input)
      @error_tracker.track_result(success: true)
      result
    rescue
      @error_tracker.track_result(success: false)
      raise
    end
  end
end
```

### 4. Version Your Modules

Track which version of your module is deployed:

```ruby
class EmailClassifierV2 < DSPy::Module
  VERSION = '2.1.0'

  def initialize
    super
    @predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
  end

  def forward(input)
    result = @predictor.forward(input)
    result.merge(model_version: VERSION)
  end
end
```

# DSPy.rb LLM Providers

## Supported Providers

DSPy.rb provides unified support across multiple LLM providers through adapter gems that automatically load when installed.

### Provider Overview

- **OpenAI**: GPT-4, GPT-4o, GPT-4o-mini, GPT-3.5-turbo
- **Anthropic**: Claude 3 family (Sonnet, Opus, Haiku), Claude 3.5 Sonnet
- **Google Gemini**: Gemini 1.5 Pro, Gemini 1.5 Flash, other versions
- **Ollama**: Local model support via OpenAI compatibility layer
- **OpenRouter**: Unified multi-provider API for 200+ models
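
Since the adapters load automatically once their gems are installed, a Gemfile only needs the providers you actually call. A sketch (adapter gem names are taken from the per-provider sections below; the `dspy` core gem name is an assumption):

```ruby
# Gemfile - install only the adapters for the providers you use.
source 'https://rubygems.org'

gem 'dspy'            # core library (name assumed)
gem 'dspy-openai'     # OpenAI (also used for OpenRouter)
gem 'dspy-anthropic'  # Anthropic Claude
gem 'dspy-gemini'     # Google Gemini
# No extra gem needed for Ollama (OpenAI compatibility layer)
```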

## Configuration

### Basic Setup

```ruby
require 'dspy'

DSPy.configure do |c|
  c.lm = DSPy::LM.new('provider/model-name', api_key: ENV['API_KEY'])
end
```

### OpenAI Configuration

**Required gem**: `dspy-openai`

```ruby
DSPy.configure do |c|
  # GPT-4o Mini (recommended for development)
  c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

  # GPT-4o (more capable)
  c.lm = DSPy::LM.new('openai/gpt-4o', api_key: ENV['OPENAI_API_KEY'])

  # GPT-4 Turbo
  c.lm = DSPy::LM.new('openai/gpt-4-turbo', api_key: ENV['OPENAI_API_KEY'])
end
```

**Environment variable**: `OPENAI_API_KEY`

### Anthropic Configuration

**Required gem**: `dspy-anthropic`

```ruby
DSPy.configure do |c|
  # Claude 3.5 Sonnet (latest, most capable)
  c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
                      api_key: ENV['ANTHROPIC_API_KEY'])

  # Claude 3 Opus (most capable in Claude 3 family)
  c.lm = DSPy::LM.new('anthropic/claude-3-opus-20240229',
                      api_key: ENV['ANTHROPIC_API_KEY'])

  # Claude 3 Sonnet (balanced)
  c.lm = DSPy::LM.new('anthropic/claude-3-sonnet-20240229',
                      api_key: ENV['ANTHROPIC_API_KEY'])

  # Claude 3 Haiku (fast, cost-effective)
  c.lm = DSPy::LM.new('anthropic/claude-3-haiku-20240307',
                      api_key: ENV['ANTHROPIC_API_KEY'])
end
```

**Environment variable**: `ANTHROPIC_API_KEY`

### Google Gemini Configuration

**Required gem**: `dspy-gemini`

```ruby
DSPy.configure do |c|
  # Gemini 1.5 Pro (most capable)
  c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
                      api_key: ENV['GOOGLE_API_KEY'])

  # Gemini 1.5 Flash (faster, cost-effective)
  c.lm = DSPy::LM.new('gemini/gemini-1.5-flash',
                      api_key: ENV['GOOGLE_API_KEY'])
end
```

**Environment variable**: `GOOGLE_API_KEY` or `GEMINI_API_KEY`

### Ollama Configuration

**Required gem**: None (uses OpenAI compatibility layer)

```ruby
DSPy.configure do |c|
  # Local Ollama instance
  c.lm = DSPy::LM.new('ollama/llama3.1',
                      base_url: 'http://localhost:11434')

  # Other Ollama models
  c.lm = DSPy::LM.new('ollama/mistral')
  c.lm = DSPy::LM.new('ollama/codellama')
end
```

**Note**: Ensure Ollama is running locally: `ollama serve`
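
A typical local setup session might look like this (the model name matches the configuration example above; substitute your own):

```shell
# Pull the models you plan to use, then start the local server
ollama pull llama3.1
ollama serve   # listens on http://localhost:11434 by default
```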

### OpenRouter Configuration

**Required gem**: `dspy-openai` (uses OpenAI adapter)

```ruby
DSPy.configure do |c|
  # Access 200+ models through OpenRouter
  c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
                      api_key: ENV['OPENROUTER_API_KEY'],
                      base_url: 'https://openrouter.ai/api/v1')

  # Other examples
  c.lm = DSPy::LM.new('openrouter/google/gemini-pro')
  c.lm = DSPy::LM.new('openrouter/meta-llama/llama-3.1-70b-instruct')
end
```

**Environment variable**: `OPENROUTER_API_KEY`

## Provider Compatibility Matrix

### Feature Support

| Feature | OpenAI | Anthropic | Gemini | Ollama |
|---------|--------|-----------|--------|--------|
| Structured Output | ✅ | ✅ | ✅ | ✅ |
| Vision (Images) | ✅ | ✅ | ✅ | ⚠️ Limited |
| Image URLs | ✅ | ❌ | ❌ | ❌ |
| Tool Calling | ✅ | ✅ | ✅ | Varies |
| Streaming | ❌ | ❌ | ❌ | ❌ |
| Function Calling | ✅ | ✅ | ✅ | Varies |

**Legend**: ✅ Full support | ⚠️ Partial support | ❌ Not supported

### Vision Capabilities

**Image URLs**: Only OpenAI supports direct URL references. For other providers, load images as base64 or from files.

```ruby
# OpenAI - supports URLs
DSPy::Image.from_url("https://example.com/image.jpg")

# Anthropic, Gemini - use file or base64
DSPy::Image.from_file("path/to/image.jpg")
DSPy::Image.from_base64(base64_data, mime_type: "image/jpeg")
```

**Ollama**: Limited multimodal functionality. Check specific model capabilities.

## Advanced Configuration

### Custom Parameters

Pass provider-specific parameters during configuration:

```ruby
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o',
                      api_key: ENV['OPENAI_API_KEY'],
                      temperature: 0.7,
                      max_tokens: 2000,
                      top_p: 0.9)
end
```
### Multiple Providers
|
||||
|
||||
Use different models for different tasks:
|
||||
|
||||
```ruby
# Fast model for simple tasks
FAST_LM = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

# Powerful model for complex tasks
POWERFUL_LM = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
                           api_key: ENV['ANTHROPIC_API_KEY'])

# Use different models in different modules. Constants are used
# because local variables would not be visible inside the methods.
# Note: DSPy.configure swaps the *global* LM, so the most recently
# initialized module wins for any predictor using the default.
class SimpleClassifier < DSPy::Module
  def initialize
    super
    DSPy.configure { |c| c.lm = FAST_LM }
    @predictor = DSPy::Predict.new(SimpleSignature)
  end
end

class ComplexAnalyzer < DSPy::Module
  def initialize
    super
    DSPy.configure { |c| c.lm = POWERFUL_LM }
    @predictor = DSPy::ChainOfThought.new(ComplexSignature)
  end
end
```
### Per-Request Configuration

Override configuration for specific predictions:
```ruby
predictor = DSPy::Predict.new(MySignature)

# Use default configuration
result1 = predictor.forward(input: "data")

# Override temperature for this request
result2 = predictor.forward(
  input: "data",
  config: { temperature: 0.2 } # More deterministic
)
```
## Cost Optimization

### Model Selection Strategy

1. **Development**: Use cheaper, faster models (gpt-4o-mini, claude-3-haiku, gemini-1.5-flash)
2. **Production Simple Tasks**: Continue with cheaper models if quality is sufficient
3. **Production Complex Tasks**: Upgrade to more capable models (gpt-4o, claude-3.5-sonnet, gemini-1.5-pro)
4. **Local Development**: Use Ollama for privacy and zero API costs
### Example Cost-Conscious Setup

```ruby
# Development environment
if Rails.env.development?
  DSPy.configure do |c|
    c.lm = DSPy::LM.new('ollama/llama3.1') # Free, local
  end
elsif Rails.env.test?
  DSPy.configure do |c|
    c.lm = DSPy::LM.new('openai/gpt-4o-mini', # Cheap for testing
                        api_key: ENV['OPENAI_API_KEY'])
  end
else # production
  DSPy.configure do |c|
    c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
                        api_key: ENV['ANTHROPIC_API_KEY'])
  end
end
```
## Provider-Specific Best Practices

### OpenAI

- Use `gpt-4o-mini` for development and simple tasks
- Use `gpt-4o` for production complex tasks
- Best vision support, including URL loading
- Excellent function calling capabilities

### Anthropic

- Claude 3.5 Sonnet is currently the most capable model
- Excellent for complex reasoning and analysis
- Strong safety features and helpful outputs
- Requires base64 for images (no URL support)

### Google Gemini

- Gemini 1.5 Pro for complex tasks, Flash for speed
- Strong multimodal capabilities
- Good balance of cost and performance
- Requires base64 for images

### Ollama

- Best for privacy-sensitive applications
- Zero API costs
- Requires local hardware resources
- Limited multimodal support depending on model
- Good for development and testing
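The recommendations above can be condensed into a plain lookup. This helper is illustrative, not part of DSPy.rb; it uses only model identifier strings that appear elsewhere in this guide, and omits Gemini because its identifier format is not shown here.

```ruby
# Suggested model identifiers per provider, from the notes above
# (strings taken verbatim from examples in this guide)
RECOMMENDED_MODELS = {
  openai:    { simple: 'openai/gpt-4o-mini', complex: 'openai/gpt-4o' },
  anthropic: { complex: 'anthropic/claude-3-5-sonnet-20241022' },
  ollama:    { simple: 'ollama/llama3.1' }
}.freeze

# Returns the identifier for a provider/tier, or nil if none is listed
def recommended_model(provider, tier)
  RECOMMENDED_MODELS.dig(provider, tier)
end
```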
## Troubleshooting

### API Key Issues
```ruby
# Verify API key is set
if ENV['OPENAI_API_KEY'].nil?
  raise "OPENAI_API_KEY environment variable not set"
end

# Test connection
begin
  DSPy.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini',
                                           api_key: ENV['OPENAI_API_KEY']) }
  predictor = DSPy::Predict.new(TestSignature)
  predictor.forward(test: "data")
  puts "✅ Connection successful"
rescue => e
  puts "❌ Connection failed: #{e.message}"
end
```
### Rate Limiting

Handle rate limits gracefully:
```ruby
def call_with_retry(predictor, input, max_retries: 3)
  retries = 0
  begin
    predictor.forward(**input) # input is a hash of keyword arguments
  rescue RateLimitError
    retries += 1
    raise if retries >= max_retries

    sleep(2 ** retries) # Exponential backoff: 2s, 4s, ...
    retry
  end
end
```
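With the default `max_retries: 3`, the `2 ** retries` backoff sleeps only after the first and second failures; the third failure re-raises. A quick self-contained check of the resulting delay schedule:

```ruby
# Delays (in seconds) before retry attempts 2 and 3;
# the third failure re-raises instead of sleeping
delays = (1..2).map { |retries| 2 ** retries }
puts delays.inspect # => [2, 4]
```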
### Model Not Found

Ensure the correct gem is installed:

```bash
# For OpenAI
gem install dspy-openai

# For Anthropic
gem install dspy-anthropic

# For Gemini
gem install dspy-gemini
```