refactor(skills): update dspy-ruby skill to DSPy.rb v0.34.3 API (#162)

Rewrite all reference files, asset templates, and SKILL.md to use
current API patterns (.call(), result.field, T::Enum classes,
Tools::Base). Add two new reference files (toolsets, observability)
covering tools DSL, event system, and Langfuse integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Vicente Reig Rincón de Arellano
2026-02-09 19:01:43 +01:00
committed by GitHub
parent f3b7d111f1
commit e8f3bbcb35
12 changed files with 3716 additions and 2246 deletions


@@ -0,0 +1,104 @@
---
title: "refactor: Update dspy-ruby skill to DSPy.rb v0.34.3 API"
type: refactor
date: 2026-02-09
---
# Update dspy-ruby Skill to DSPy.rb v0.34.3 API
## Problem
The `dspy-ruby` skill uses outdated API patterns (`.forward()`, `result[:field]`, inline `T.enum([...])`, `DSPy::Tool`) and is missing 10+ features (events, lifecycle callbacks, GEPA, evaluation framework, BAML/TOON, storage, etc.).
## Solution
Use the engineering skill as the base (it already has the correct API), enhance it with official docs content, and rewrite all reference files and templates.
### Source Priority (when conflicts arise)
1. **Official docs** (`../dspy.rb/docs/src/`) — source of truth for API correctness
2. **Engineering skill** (`../engineering/.../dspy-rb/SKILL.md`) — source of truth for structure/style
3. **NavigationContext brainstorm** — for Typed Context pattern only
## Files to Update
### Core (SKILL.md)
1. **`skills/dspy-ruby/SKILL.md`** — Copy from engineering base, then:
- Fix frontmatter: `name: dspy-rb` → `name: dspy-ruby`, keep long description format
- Add sections before "Guidelines for Claude": Events System, Lifecycle Callbacks, Fiber-Local LM Context, Evaluation Framework, GEPA Optimization, Typed Context Pattern, Schema Formats (BAML/TOON)
- Update Resources section with 5 references + 3 assets using markdown links
- Fix any backtick references to markdown link format
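The backtick-to-link conversion in the last bullet can be sketched in plain Ruby. This is only an illustration under assumptions, not the script used for this commit: the helper name `linkify_references` and the regular expression are hypothetical, while the target format `[file.md](./references/file.md)` comes from the Success Criteria in this plan.

```ruby
# Hypothetical helper (not part of the skill or of DSPy.rb): rewrite
# backtick-quoted paths such as `references/toolsets.md` into markdown links.
REFERENCE_PATTERN = %r{`((?:references|assets|scripts)/[^`\s]+)`}

def linkify_references(markdown)
  markdown.gsub(REFERENCE_PATTERN) do
    path = Regexp.last_match(1)   # e.g. "references/toolsets.md"
    label = File.basename(path)   # e.g. "toolsets.md"
    "[#{label}](./#{path})"
  end
end

sample = "See `references/toolsets.md` and `assets/module-template.rb`."
puts linkify_references(sample)
```

Backtick spans that do not start with `references/`, `assets/`, or `scripts/` are left untouched, which matches the scope of the second verification command in this plan.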
### References (rewrite from themed doc batches)
2. **`references/core-concepts.md`** — Rewrite
- Source: `core-concepts/signatures.md`, `modules.md`, `predictors.md`, `advanced/complex-types.md`
- Cover: signatures (Date/Time types, T::Enum, defaults, field descriptions, BAML/TOON, recursive types), modules (.call() API, lifecycle callbacks, instruction update contract), predictors (all 4 types, concurrent predictions), type system (discriminators, union types)
3. **`references/toolsets.md`** — NEW
- Source: `core-concepts/toolsets.md`, `toolsets-guide.md`
- Cover: Tools::Base, Tools::Toolset DSL, type safety with Sorbet sigs, schema generation, built-in toolsets, testing
4. **`references/providers.md`** — Rewrite
- Source: `llms.txt.erb`, engineering SKILL.md, `core-concepts/module-runtime-context.md`
- Cover: per-provider adapters, RubyLLM unified adapter, Rails initializer, fiber-local LM context (`DSPy.with_lm`), feature-flagged model selection, compatibility matrix
5. **`references/optimization.md`** — Rewrite
- Source: `optimization/miprov2.md`, `gepa.md`, `evaluation.md`, `production/storage.md`
- Cover: MIPROv2 (dspy-miprov2 gem, AutoMode presets), GEPA (dspy-gepa gem, feedback maps), Evaluation (DSPy::Evals, built-in metrics, DSPy::Example), Storage (ProgramStorage)
6. **`references/observability.md`** — NEW
- Source: `production/observability.md`, `core-concepts/events.md`, `advanced/observability-interception.md`
- Cover: event system (module-scoped + global), dspy-o11y gems, Langfuse (env vars), score reporting (DSPy.score()), observation types, DSPy::Context.with_span
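The `.call()` API and lifecycle callbacks listed for `core-concepts.md` above can be pictured with a plain-Ruby stand-in. The sketch below is an illustration only: `MiniModule` and `Traced` are hypothetical classes, not DSPy.rb code, and the ordering it reproduces (before, around up to `yield`, `forward`, around after `yield`, after) is the one stated in this commit's module template. It does not model callback inheritance.

```ruby
# Hypothetical stand-in for a module base class; NOT DSPy::Module.
class MiniModule
  # Per-class registry of callback method names.
  def self.callbacks
    @callbacks ||= { before: [], around: [], after: [] }
  end

  def self.before(name)
    callbacks[:before] << name
  end

  def self.around(name)
    callbacks[:around] << name
  end

  def self.after(name)
    callbacks[:after] << name
  end

  # Callers use #call; subclasses implement #forward.
  def call(**kwargs)
    hooks = self.class.callbacks
    hooks[:before].each { |name| send(name) }

    # Nest the around hooks so the first registered one is outermost.
    innermost = -> { forward(**kwargs) }
    chain = hooks[:around].reverse.inject(innermost) do |inner, name|
      -> { send(name, &inner) }
    end
    result = chain.call

    hooks[:after].each { |name| send(name) }
    result
  end
end

class Traced < MiniModule
  before :start
  around :wrap
  after :finish

  attr_reader :log

  def initialize
    @log = []
  end

  def forward(text:)
    @log << "forward"
    text.upcase
  end

  private

  def start
    @log << "before"
  end

  def wrap
    @log << "around (before yield)"
    result = yield
    @log << "around (after yield)"
    result
  end

  def finish
    @log << "after"
  end
end

mod = Traced.new
puts mod.call(text: "hi")
puts mod.log.join(" -> ")
```

The design point the sketch is meant to show is the split between the public entry point (`call`) and the overridable hook (`forward`), which is what lets callbacks wrap the prediction without the subclass invoking them itself.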
### Assets (rewrite to current API)
7. **`assets/signature-template.rb`** — T::Enum classes, `description:` kwarg, Date/Time types, defaults, union types, `.call()` / `result.field` usage examples
8. **`assets/module-template.rb`** — `.call()` API, `result.field`, Tools::Base, lifecycle callbacks, `DSPy.with_lm`, `configure_predictor`
9. **`assets/config-template.rb`** — RubyLLM adapter, `structured_outputs: true`, `after_initialize` Rails pattern, dspy-o11y env vars, feature-flagged model selection
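The "feature-flagged model selection" named for the config template boils down to an environment lookup with a default per role. A minimal sketch, assuming a hypothetical `ModelFlags` helper: the environment variable names and default model ids are copied from the `FeatureFlags` module in this commit's config template, while the `model_for` method and the injectable `env:` argument are illustrative additions.

```ruby
# Hypothetical helper; env var names and defaults mirror the FeatureFlags
# module in the config template, the lookup method itself is illustrative.
module ModelFlags
  DEFAULTS = {
    selector: "ruby_llm/gemini-2.5-flash-lite",
    synthesizer: "ruby_llm/gemini-2.5-flash",
    reasoning: "ruby_llm/claude-sonnet-4-20250514"
  }.freeze

  # `env` is injectable so the lookup can be exercised without touching ENV.
  def self.model_for(role, env: ENV)
    env.fetch("DSPY_#{role.to_s.upcase}_MODEL") { DEFAULTS.fetch(role) }
  end
end

puts ModelFlags.model_for(:selector, env: {})
puts ModelFlags.model_for(:reasoning, env: { "DSPY_REASONING_MODEL" => "ruby_llm/gemini-2.5-flash" })
```

The returned id is the string that would be handed to `DSPy::LM.new` in the template; an unknown role raises `KeyError` instead of silently falling back.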
### Metadata
10. **`.claude-plugin/plugin.json`** — Version `2.31.0` → `2.31.1`
11. **`CHANGELOG.md`** — Add `[2.31.1] - 2026-02-09` entry under `### Changed`
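The version bump is a patch-level increment. As a rough sketch of that step, assuming a hypothetical `bump_patch` helper and an abbreviated stand-in for the manifest (the real `plugin.json` has more fields):

```ruby
require "json"

# Hypothetical helper: increment the patch component of a MAJOR.MINOR.PATCH string.
def bump_patch(version)
  major, minor, patch = version.split(".").map { |part| Integer(part) }
  [major, minor, patch + 1].join(".")
end

# Abbreviated stand-in for .claude-plugin/plugin.json.
manifest = JSON.parse('{"name": "compound-engineering", "version": "2.31.0"}')
manifest["version"] = bump_patch(manifest["version"])
puts JSON.pretty_generate(manifest)
```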
## Verification
```bash
# No old API patterns
grep -nE '\.forward\(|result\[:|T\.enum\(\[|DSPy::Tool([^s]|$)' plugins/compound-engineering/skills/dspy-ruby/SKILL.md
# No backtick references
grep -E '`(references|assets|scripts)/' plugins/compound-engineering/skills/dspy-ruby/SKILL.md
# Frontmatter correct
head -4 plugins/compound-engineering/skills/dspy-ruby/SKILL.md
# JSON valid
jq . plugins/compound-engineering/.claude-plugin/plugin.json
# All files exist
ls plugins/compound-engineering/skills/dspy-ruby/{references,assets}/
```
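The first check above can also be expressed in Ruby, which is handy inside a spec or a Rake task. This is a sketch under assumptions: `OLD_API_PATTERNS` and `find_old_api` are hypothetical names, and the patterns are a hand translation of the grep expression rather than anything shipped with the plugin.

```ruby
# Patterns mirror the grep check; labels are for reporting only.
OLD_API_PATTERNS = {
  ".forward("      => /\.forward\(/,
  "result[:field]" => /result\[:/,
  "T.enum([...])"  => /T\.enum\(\[/,
  "DSPy::Tool"     => /DSPy::Tool(?!s)/   # allows DSPy::Tools::Base
}.freeze

# Returns [line_number, label] pairs for every outdated pattern found.
def find_old_api(text)
  hits = []
  text.each_line.with_index(1) do |line, number|
    OLD_API_PATTERNS.each do |label, pattern|
      hits << [number, label] if line.match?(pattern)
    end
  end
  hits
end

sample = <<~SNIPPET
  result = predictor.forward(text: "hi")
  puts result[:category]
  class SearchTool < DSPy::Tools::Base; end
  result = predictor.call(text: "hi")
SNIPPET

find_old_api(sample).each { |number, label| puts "line #{number}: #{label}" }
```

An empty result corresponds to the grep command printing nothing.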
## Success Criteria
- [x] All API patterns updated (`.call()`, `result.field`, `T::Enum`, `Tools::Base`)
- [x] New features covered: events, callbacks, fiber-local LM, GEPA, evals, BAML/TOON, storage, score API, RubyLLM, typed context
- [x] 5 reference files present (core-concepts, toolsets, providers, optimization, observability)
- [x] 3 asset templates updated to current API
- [x] YAML frontmatter: `name: dspy-ruby`, description has "what" and "when"
- [x] All reference links use `[file.md](./references/file.md)` format
- [x] Writing style: imperative form, no "you should"
- [x] Version bumped to `2.31.1`, CHANGELOG updated
- [x] Verification commands all pass
## Source Materials
- Engineering skill: `/Users/vicente/Workspaces/vicente.services/engineering/plugins/engineering-skills/skills/dspy-rb/SKILL.md`
- Official docs: `/Users/vicente/Workspaces/vicente.services/dspy.rb/docs/src/`
- NavigationContext brainstorm: `/Users/vicente/Workspaces/vicente.services/observo/observo-server/docs/brainstorms/2026-02-09-typed-navigation-context-brainstorm.md`


@@ -1,6 +1,6 @@
 {
   "name": "compound-engineering",
-  "version": "2.31.0",
+  "version": "2.31.1",
   "description": "AI-powered development tools. 29 agents, 24 commands, 18 skills, 1 MCP server for code review, research, design, and workflow automation.",
   "author": {
     "name": "Kieran Klaassen",


@@ -5,6 +5,12 @@ All notable changes to the compound-engineering plugin will be documented in thi
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.31.1] - 2026-02-09
+### Changed
+- **`dspy-ruby` skill** — Complete rewrite to DSPy.rb v0.34.3 API: `.call()` / `result.field` patterns, `T::Enum` classes, `DSPy::Tools::Base` / `Toolset`. Added events system, lifecycle callbacks, fiber-local LM context, GEPA optimization, evaluation framework, typed context pattern, BAML/TOON schema formats, storage system, score reporting, RubyLLM adapter. 5 reference files (2 new: toolsets, observability), 3 asset templates rewritten.
 ## [2.31.0] - 2026-02-08
 ### Added

File diff suppressed because it is too large.


@@ -1,359 +1,187 @@
 # frozen_string_literal: true
-# DSPy.rb Configuration Examples
-# This file demonstrates various configuration patterns for different use cases
+# =============================================================================
+# DSPy.rb Configuration Template — v0.34.3 API
-require 'dspy'
-# ============================================================================
-# Basic Configuration
-# ============================================================================
-# Simple OpenAI configuration
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-    api_key: ENV['OPENAI_API_KEY'])
-end
-# ============================================================================
-# Multi-Provider Configuration
-# ============================================================================
-# Anthropic Claude
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
-    api_key: ENV['ANTHROPIC_API_KEY'])
-end
-# Google Gemini
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
-    api_key: ENV['GOOGLE_API_KEY'])
-end
-# Local Ollama
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('ollama/llama3.1',
-    base_url: 'http://localhost:11434')
-end
-# OpenRouter (access to 200+ models)
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
-    api_key: ENV['OPENROUTER_API_KEY'],
-    base_url: 'https://openrouter.ai/api/v1')
-end
-# ============================================================================
-# Environment-Based Configuration
-# ============================================================================
-# Different models for different environments
-if Rails.env.development?
-  # Use local Ollama for development (free, private)
-  DSPy.configure do |c|
-    c.lm = DSPy::LM.new('ollama/llama3.1')
-  end
-elsif Rails.env.test?
-  # Use cheap model for testing
-  DSPy.configure do |c|
-    c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-      api_key: ENV['OPENAI_API_KEY'])
-  end
-else
-  # Use powerful model for production
-  DSPy.configure do |c|
-    c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
-      api_key: ENV['ANTHROPIC_API_KEY'])
-  end
-end
-# ============================================================================
-# Configuration with Custom Parameters
-# ============================================================================
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openai/gpt-4o',
-    api_key: ENV['OPENAI_API_KEY'],
-    temperature: 0.7, # Creativity (0.0-2.0, default: 1.0)
-    max_tokens: 2000, # Maximum response length
-    top_p: 0.9, # Nucleus sampling
-    frequency_penalty: 0.0, # Reduce repetition (-2.0 to 2.0)
-    presence_penalty: 0.0 # Encourage new topics (-2.0 to 2.0)
-  )
-end
-# ============================================================================
-# Multiple Model Configuration (Task-Specific)
-# ============================================================================
-# Create different language models for different tasks
-module MyApp
-  # Fast model for simple tasks
-  FAST_LM = DSPy::LM.new('openai/gpt-4o-mini',
-    api_key: ENV['OPENAI_API_KEY'],
-    temperature: 0.3 # More deterministic
-  )
-  # Powerful model for complex tasks
-  POWERFUL_LM = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
-    api_key: ENV['ANTHROPIC_API_KEY'],
-    temperature: 0.7
-  )
-  # Creative model for content generation
-  CREATIVE_LM = DSPy::LM.new('openai/gpt-4o',
-    api_key: ENV['OPENAI_API_KEY'],
-    temperature: 1.2, # More creative
-    top_p: 0.95
-  )
-  # Vision-capable model
-  VISION_LM = DSPy::LM.new('openai/gpt-4o',
-    api_key: ENV['OPENAI_API_KEY'])
-end
-# Use in modules
-class SimpleClassifier < DSPy::Module
-  def initialize
-    super
-    DSPy.configure { |c| c.lm = MyApp::FAST_LM }
-    @predictor = DSPy::Predict.new(SimpleSignature)
-  end
-end
-class ComplexAnalyzer < DSPy::Module
-  def initialize
-    super
-    DSPy.configure { |c| c.lm = MyApp::POWERFUL_LM }
-    @predictor = DSPy::ChainOfThought.new(ComplexSignature)
-  end
-end
-# ============================================================================
-# Configuration with Observability (OpenTelemetry)
-# ============================================================================
-require 'opentelemetry/sdk'
-# Configure OpenTelemetry
-OpenTelemetry::SDK.configure do |c|
-  c.service_name = 'my-dspy-app'
-  c.use_all
-end
-# Configure DSPy (automatically integrates with OpenTelemetry)
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-    api_key: ENV['OPENAI_API_KEY'])
-end
-# ============================================================================
-# Configuration with Langfuse Tracing
-# ============================================================================
-require 'dspy/langfuse'
-DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-    api_key: ENV['OPENAI_API_KEY'])
-  # Enable Langfuse tracing
-  c.langfuse = {
-    public_key: ENV['LANGFUSE_PUBLIC_KEY'],
-    secret_key: ENV['LANGFUSE_SECRET_KEY'],
-    host: ENV['LANGFUSE_HOST'] || 'https://cloud.langfuse.com'
-  }
-end
-# ============================================================================
-# Configuration with Retry Logic
-# ============================================================================
-class RetryableConfig
-  MAX_RETRIES = 3
-  def self.configure
-    DSPy.configure do |c|
-      c.lm = create_lm_with_retry
-    end
-  end
-  def self.create_lm_with_retry
-    lm = DSPy::LM.new('openai/gpt-4o-mini',
-      api_key: ENV['OPENAI_API_KEY'])
-    # Wrap with retry logic
-    lm.extend(RetryBehavior)
-    lm
-  end
-  module RetryBehavior
-    def forward(input, retry_count: 0)
-      super(input)
-    rescue RateLimitError, TimeoutError => e
-      if retry_count < MAX_RETRIES
-        sleep(2 ** retry_count) # Exponential backoff
-        forward(input, retry_count: retry_count + 1)
-      else
-        raise
-      end
-    end
-  end
-end
-RetryableConfig.configure
-# ============================================================================
-# Configuration with Fallback Models
-# ============================================================================
-class FallbackConfig
-  def self.configure
-    DSPy.configure do |c|
-      c.lm = create_lm_with_fallback
-    end
-  end
-  def self.create_lm_with_fallback
-    primary = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
-      api_key: ENV['ANTHROPIC_API_KEY'])
-    fallback = DSPy::LM.new('openai/gpt-4o',
-      api_key: ENV['OPENAI_API_KEY'])
-    FallbackLM.new(primary, fallback)
-  end
-  class FallbackLM
-    def initialize(primary, fallback)
-      @primary = primary
-      @fallback = fallback
-    end
-    def forward(input)
-      @primary.forward(input)
-    rescue => e
-      puts "Primary model failed: #{e.message}. Falling back..."
-      @fallback.forward(input)
-    end
-  end
-end
-FallbackConfig.configure
-# ============================================================================
-# Configuration with Budget Tracking
-# ============================================================================
-class BudgetTrackedConfig
-  def self.configure(monthly_budget_usd:)
-    DSPy.configure do |c|
-      c.lm = BudgetTracker.new(
-        DSPy::LM.new('openai/gpt-4o',
-          api_key: ENV['OPENAI_API_KEY']),
-        monthly_budget_usd: monthly_budget_usd
-      )
-    end
-  end
-  class BudgetTracker
-    def initialize(lm, monthly_budget_usd:)
-      @lm = lm
-      @monthly_budget_usd = monthly_budget_usd
-      @monthly_cost = 0.0
-    end
-    def forward(input)
-      result = @lm.forward(input)
-      # Track cost (simplified - actual costs vary by model)
-      tokens = result.metadata[:usage][:total_tokens]
-      cost = estimate_cost(tokens)
-      @monthly_cost += cost
-      if @monthly_cost > @monthly_budget_usd
-        raise "Monthly budget of $#{@monthly_budget_usd} exceeded!"
-      end
-      result
-    end
-    private
-    def estimate_cost(tokens)
-      # Simplified cost estimation (check provider pricing)
-      (tokens / 1_000_000.0) * 5.0 # $5 per 1M tokens
-    end
-  end
-end
-BudgetTrackedConfig.configure(monthly_budget_usd: 100)
-# ============================================================================
-# Configuration Initializer for Rails
-# ============================================================================
-# Save this as config/initializers/dspy.rb
 #
-# require 'dspy'
+# Rails initializer patterns for DSPy.rb with RubyLLM, observability,
+# and feature-flagged model selection.
 #
-# DSPy.configure do |c|
-# # Environment-specific configuration
-# model_config = case Rails.env.to_sym
-# when :development
-# { provider: 'ollama', model: 'llama3.1' }
-# when :test
-# { provider: 'openai', model: 'gpt-4o-mini', temperature: 0.0 }
-# when :production
-# { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022' }
-# end
-#
-# # Configure language model
-# c.lm = DSPy::LM.new(
-# "#{model_config[:provider]}/#{model_config[:model]}",
-# api_key: ENV["#{model_config[:provider].upcase}_API_KEY"],
-# **model_config.except(:provider, :model)
-# )
-#
-# # Optional: Add observability
-# if Rails.env.production?
-# c.langfuse = {
-# public_key: ENV['LANGFUSE_PUBLIC_KEY'],
-# secret_key: ENV['LANGFUSE_SECRET_KEY']
-# }
-# end
-# end
-# ============================================================================
-# Testing Configuration
-# ============================================================================
-# In spec/spec_helper.rb or test/test_helper.rb
+# Key patterns:
+# - Use after_initialize for Rails setup
+# - Use dspy-ruby_llm for multi-provider routing
+# - Use structured_outputs: true for reliable parsing
+# - Use dspy-o11y + dspy-o11y-langfuse for observability
+# - Use ENV-based feature flags for model selection
+# =============================================================================
+# =============================================================================
+# Gemfile Dependencies
+# =============================================================================
 #
-# RSpec.configure do |config|
-# config.before(:suite) do
-# DSPy.configure do |c|
-# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-# api_key: ENV['OPENAI_API_KEY'],
-# temperature: 0.0 # Deterministic for testing
-# )
+# # Core
+# gem 'dspy'
+#
+# # Provider adapter (choose one strategy):
+#
+# # Strategy A: Unified adapter via RubyLLM (recommended)
+# gem 'dspy-ruby_llm'
+# gem 'ruby_llm'
+#
+# # Strategy B: Per-provider adapters (direct SDK access)
+# gem 'dspy-openai' # OpenAI, OpenRouter, Ollama
+# gem 'dspy-anthropic' # Claude
+# gem 'dspy-gemini' # Gemini
+#
+# # Observability (optional)
+# gem 'dspy-o11y'
+# gem 'dspy-o11y-langfuse'
+#
+# # Optimization (optional)
+# gem 'dspy-miprov2' # MIPROv2 optimizer
+# gem 'dspy-gepa' # GEPA optimizer
+#
+# # Schema formats (optional)
+# gem 'sorbet-baml' # BAML schema format (84% token reduction)
+# =============================================================================
+# Rails Initializer — config/initializers/dspy.rb
+# =============================================================================
+Rails.application.config.after_initialize do
+  # Skip in test unless explicitly enabled
+  next if Rails.env.test? && ENV["DSPY_ENABLE_IN_TEST"].blank?
+  # Configure RubyLLM provider credentials
+  RubyLLM.configure do |config|
+    config.gemini_api_key = ENV["GEMINI_API_KEY"] if ENV["GEMINI_API_KEY"].present?
+    config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"] if ENV["ANTHROPIC_API_KEY"].present?
+    config.openai_api_key = ENV["OPENAI_API_KEY"] if ENV["OPENAI_API_KEY"].present?
+  end
+  # Configure DSPy with unified RubyLLM adapter
+  model = ENV.fetch("DSPY_MODEL", "ruby_llm/gemini-2.5-flash")
+  DSPy.configure do |config|
+    config.lm = DSPy::LM.new(model, structured_outputs: true)
+    config.logger = Rails.logger
+  end
+  # Enable Langfuse observability (optional)
+  if ENV["LANGFUSE_PUBLIC_KEY"].present? && ENV["LANGFUSE_SECRET_KEY"].present?
+    DSPy::Observability.configure!
+  end
+end
+# =============================================================================
+# Feature Flags — config/initializers/feature_flags.rb
+# =============================================================================
+# Use different models for different roles:
+# - Fast/cheap for classification, routing, simple tasks
+# - Powerful for synthesis, reasoning, complex analysis
+module FeatureFlags
+  SELECTOR_MODEL = ENV.fetch("DSPY_SELECTOR_MODEL", "ruby_llm/gemini-2.5-flash-lite")
+  SYNTHESIZER_MODEL = ENV.fetch("DSPY_SYNTHESIZER_MODEL", "ruby_llm/gemini-2.5-flash")
+  REASONING_MODEL = ENV.fetch("DSPY_REASONING_MODEL", "ruby_llm/claude-sonnet-4-20250514")
+end
+# Usage in tools/modules:
+#
+# class ClassifyTool < DSPy::Tools::Base
+# def call(query:)
+# predictor = DSPy::Predict.new(ClassifySignature)
+# predictor.configure { |c| c.lm = DSPy::LM.new(FeatureFlags::SELECTOR_MODEL, structured_outputs: true) }
+# predictor.call(query: query)
 # end
 # end
+# =============================================================================
+# Environment Variables — .env
+# =============================================================================
+#
+# # Provider API keys (set the ones you need)
+# GEMINI_API_KEY=...
+# ANTHROPIC_API_KEY=...
+# OPENAI_API_KEY=...
+#
+# # DSPy model configuration
+# DSPY_MODEL=ruby_llm/gemini-2.5-flash
+# DSPY_SELECTOR_MODEL=ruby_llm/gemini-2.5-flash-lite
+# DSPY_SYNTHESIZER_MODEL=ruby_llm/gemini-2.5-flash
+# DSPY_REASONING_MODEL=ruby_llm/claude-sonnet-4-20250514
+#
+# # Langfuse observability (optional)
+# LANGFUSE_PUBLIC_KEY=pk-...
+# LANGFUSE_SECRET_KEY=sk-...
+# DSPY_TELEMETRY_BATCH_SIZE=5
+#
+# # Test environment
+# DSPY_ENABLE_IN_TEST=1 # Set to enable DSPy in test env
+# =============================================================================
+# Per-Provider Configuration (without RubyLLM)
+# =============================================================================
+# OpenAI (dspy-openai gem)
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
 # end
-# ============================================================================
-# Configuration Best Practices
-# ============================================================================
-# 1. Use environment variables for API keys (never hardcode)
-# 2. Use different models for different environments
-# 3. Use cheaper/faster models for development and testing
-# 4. Configure temperature based on use case:
-# - 0.0-0.3: Deterministic, factual tasks
-# - 0.7-1.0: Balanced creativity
-# - 1.0-2.0: High creativity, content generation
-# 5. Add observability in production (OpenTelemetry, Langfuse)
-# 6. Implement retry logic and fallbacks for reliability
-# 7. Track costs and set budgets for production
-# 8. Use max_tokens to control response length and costs
+# Anthropic (dspy-anthropic gem)
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
+# end
+# Gemini (dspy-gemini gem)
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('gemini/gemini-2.5-flash', api_key: ENV['GEMINI_API_KEY'])
+# end
+# Ollama (dspy-openai gem, local models)
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('ollama/llama3.2', base_url: 'http://localhost:11434')
+# end
+# OpenRouter (dspy-openai gem, 200+ models)
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
+# api_key: ENV['OPENROUTER_API_KEY'],
+# base_url: 'https://openrouter.ai/api/v1')
+# end
+# =============================================================================
+# VCR Test Configuration — spec/support/dspy.rb
+# =============================================================================
+# VCR.configure do |config|
+# config.cassette_library_dir = "spec/vcr_cassettes"
+# config.hook_into :webmock
+# config.configure_rspec_metadata!
+# config.filter_sensitive_data('<GEMINI_API_KEY>') { ENV['GEMINI_API_KEY'] }
+# config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
+# config.filter_sensitive_data('<ANTHROPIC_API_KEY>') { ENV['ANTHROPIC_API_KEY'] }
+# end
+# =============================================================================
+# Schema Format Configuration (optional)
+# =============================================================================
+# BAML schema format — 84% token reduction for Enhanced Prompting mode
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
+# api_key: ENV['OPENAI_API_KEY'],
+# schema_format: :baml # Requires sorbet-baml gem
+# )
+# end
+# TOON schema + data format — table-oriented format
+# DSPy.configure do |c|
+# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
+# api_key: ENV['OPENAI_API_KEY'],
+# schema_format: :toon, # How DSPy describes the signature
+# data_format: :toon # How inputs/outputs are rendered in prompts
+# )
+# end
+#
+# Note: BAML and TOON apply only when structured_outputs: false.
+# With structured_outputs: true, the provider receives JSON Schema directly.

@@ -1,326 +1,300 @@
 # frozen_string_literal: true
-# Example DSPy Module Template
-# This template demonstrates best practices for creating composable modules
+# =============================================================================
+# DSPy.rb Module Template — v0.34.3 API
+#
+# Modules orchestrate predictors, tools, and business logic.
+#
+# Key patterns:
+# - Use .call() to invoke (not .forward())
+# - Access results with result.field (not result[:field])
+# - Use DSPy::Tools::Base for tools (not DSPy::Tool)
+# - Use lifecycle callbacks (before/around/after) for cross-cutting concerns
+# - Use DSPy.with_lm for temporary model overrides
+# - Use configure_predictor for fine-grained agent control
+# =============================================================================
-# Basic module with single predictor
-class BasicModule < DSPy::Module
+# --- Basic Module ---
+class BasicClassifier < DSPy::Module
   def initialize
     super
-    # Initialize predictor with signature
-    @predictor = DSPy::Predict.new(ExampleSignature)
+    @predictor = DSPy::Predict.new(ClassificationSignature)
   end
-  def forward(input_hash)
-    # Forward pass through the predictor
-    @predictor.forward(input_hash)
+  def forward(text:)
+    @predictor.call(text: text)
   end
 end
-# Module with Chain of Thought reasoning
-class ChainOfThoughtModule < DSPy::Module
+# Usage:
+# classifier = BasicClassifier.new
+# result = classifier.call(text: "This is a test")
+# result.category # => "technical"
+# result.confidence # => 0.95
+# --- Module with Chain of Thought ---
+class ReasoningClassifier < DSPy::Module
   def initialize
     super
-    # ChainOfThought automatically adds reasoning to output
-    @predictor = DSPy::ChainOfThought.new(EmailClassificationSignature)
+    @predictor = DSPy::ChainOfThought.new(ClassificationSignature)
   end
-  def forward(email_subject:, email_body:)
-    result = @predictor.forward(
-      email_subject: email_subject,
-      email_body: email_body
-    )
-    # Result includes :reasoning field automatically
-    {
-      category: result[:category],
-      priority: result[:priority],
-      reasoning: result[:reasoning],
-      confidence: calculate_confidence(result)
-    }
+  def forward(text:)
+    result = @predictor.call(text: text)
+    # ChainOfThought adds result.reasoning automatically
+    result
+  end
+end
+# --- Module with Lifecycle Callbacks ---
+class InstrumentedModule < DSPy::Module
+  before :setup_metrics
+  around :manage_context
+  after :log_completion
+  def initialize
+    super
+    @predictor = DSPy::Predict.new(AnalysisSignature)
+    @start_time = nil
+  end
+  def forward(query:)
+    @predictor.call(query: query)
   end
   private
-  def calculate_confidence(result)
-    # Add custom logic to calculate confidence
-    # For example, based on reasoning length or specificity
-    result[:confidence] || 0.8
+  # Runs before forward
+  def setup_metrics
+    @start_time = Time.now
+    Rails.logger.info "Starting prediction"
+  end
+  # Wraps forward — must call yield
+  def manage_context
+    load_user_context
+    result = yield
+    save_updated_context(result)
+    result
+  end
+  # Runs after forward completes
+  def log_completion
+    duration = Time.now - @start_time
+    Rails.logger.info "Prediction completed in #{duration}s"
+  end
+  def load_user_context = nil
+  def save_updated_context(_result) = nil
+end
+# Execution order: before → around (before yield) → forward → around (after yield) → after
+# Callbacks are inherited from parent classes and execute in registration order.
+# --- Module with Tools ---
+class SearchTool < DSPy::Tools::Base
+  tool_name "search"
+  tool_description "Search for information by query"
+  sig { params(query: String, max_results: Integer).returns(T::Array[T::Hash[Symbol, String]]) }
+  def call(query:, max_results: 5)
+    # Implementation here
+    [{ title: "Result 1", url: "https://example.com" }]
   end
 end
-# Composable module that chains multiple steps
-class MultiStepPipeline < DSPy::Module
-  def initialize
-    super
-    # Initialize multiple predictors for different steps
-    @step1 = DSPy::Predict.new(Step1Signature)
-    @step2 = DSPy::ChainOfThought.new(Step2Signature)
-    @step3 = DSPy::Predict.new(Step3Signature)
-  end
-  def forward(input)
-    # Chain predictors together
-    result1 = @step1.forward(input)
-    result2 = @step2.forward(result1)
-    result3 = @step3.forward(result2)
-    # Combine results as needed
-    {
-      step1_output: result1,
-      step2_output: result2,
-      final_result: result3
-    }
+class FinishTool < DSPy::Tools::Base
+  tool_name "finish"
+  tool_description "Submit the final answer"
+  sig { params(answer: String).returns(String) }
+  def call(answer:)
+    answer
   end
 end
-# Module with conditional logic
-class ConditionalModule < DSPy::Module
+class ResearchAgent < DSPy::Module
   def initialize
     super
-    @simple_classifier = DSPy::Predict.new(SimpleClassificationSignature)
-    @complex_analyzer = DSPy::ChainOfThought.new(ComplexAnalysisSignature)
+    tools = [SearchTool.new, FinishTool.new]
+    @agent = DSPy::ReAct.new(
+      ResearchSignature,
+      tools: tools,
+      max_iterations: 5
+    )
   end
-  def forward(text:, complexity_threshold: 100)
-    # Use different predictors based on input characteristics
-    if text.length < complexity_threshold
-      @simple_classifier.forward(text: text)
-    else
-      @complex_analyzer.forward(text: text)
-    end
+  def forward(question:)
+    @agent.call(question: question)
   end
 end
-# Module with error handling and retry logic
-class RobustModule < DSPy::Module
-  MAX_RETRIES = 3
+# --- Module with Per-Task Model Selection ---
+class SmartRouter < DSPy::Module
   def initialize
     super
-    @predictor = DSPy::Predict.new(RobustSignature)
-    @logger = Logger.new(STDOUT)
+    @classifier = DSPy::Predict.new(RouteSignature)
+    @analyzer = DSPy::ChainOfThought.new(AnalysisSignature)
   end
-  def forward(input, retry_count: 0)
-    @logger.info "Processing input: #{input.inspect}"
-    begin
-      result = @predictor.forward(input)
-      validate_result!(result)
-      result
-    rescue DSPy::ValidationError => e
-      @logger.error "Validation error: #{e.message}"
-      if retry_count < MAX_RETRIES
-        @logger.info "Retrying (#{retry_count + 1}/#{MAX_RETRIES})..."
-        sleep(2 ** retry_count) # Exponential backoff
-        forward(input, retry_count: retry_count + 1)
+  def forward(text:)
+    # Use fast model for classification
+    DSPy.with_lm(fast_model) do
+      route = @classifier.call(text: text)
+      if route.requires_deep_analysis
+        # Switch to powerful model for analysis
+        DSPy.with_lm(powerful_model) do
+          @analyzer.call(text: text)
+        end
       else
-        @logger.error "Max retries exceeded"
-        raise
+        route
       end
     end
   end
   private
-  def validate_result!(result)
-    # Add custom validation logic
-    raise DSPy::ValidationError, "Invalid result" unless result[:category]
-    raise DSPy::ValidationError, "Low confidence" if result[:confidence] && result[:confidence] < 0.5
+  def fast_model
+    @fast_model ||= DSPy::LM.new(
+      ENV.fetch("DSPY_SELECTOR_MODEL", "ruby_llm/gemini-2.5-flash-lite"),
+      structured_outputs: true
+    )
+  end
+  def powerful_model
+    @powerful_model ||= DSPy::LM.new(
+      ENV.fetch("DSPY_SYNTHESIZER_MODEL", "ruby_llm/gemini-2.5-flash"),
+      structured_outputs: true
+    )
   end
 end
-# Module with ReAct agent and tools
-class AgentModule < DSPy::Module
+# --- Module with configure_predictor ---
+class ConfiguredAgent < DSPy::Module
   def initialize
     super
-    # Define tools for the agent
-    tools = [
-      SearchTool.new,
-      CalculatorTool.new,
-      DatabaseQueryTool.new
-    ]
-    # ReAct provides iterative reasoning and tool usage
-    @agent = DSPy::ReAct.new(
-      AgentSignature,
-      tools: tools,
-      max_iterations: 5
-    )
-  end
-  def forward(task:)
-    # Agent will autonomously use tools to complete the task
-    @agent.forward(task: task)
-  end
-end
-# Tool definition example
-class SearchTool < DSPy::Tool
-  def call(query:)
-    # Implement search functionality
-    results = perform_search(query)
-    { results: results }
-  end
-  private
-  def perform_search(query)
-    # Actual search implementation
-    # Could call external API, database, etc.
-    ["result1", "result2", "result3"]
-  end
-end
-# Module with state management
-class StatefulModule < DSPy::Module
-  attr_reader :history
-  def initialize
-    super
-    @predictor = DSPy::ChainOfThought.new(StatefulSignature)
-    @history = []
-  end
-  def forward(input)
-    # Process with context from history
-    context = build_context_from_history
-    result = @predictor.forward(
-      input: input,
-      context: context
-    )
-    # Store in history
-    @history << {
-      input: input,
-      result: result,
-      timestamp: Time.now
-    }
-    result
-  end
-  def reset!
-    @history.clear
-  end
-  private
-  def build_context_from_history
-    @history.last(5).map { |h| h[:result][:summary] }.join("\n")
-  end
-end
-# Module that uses different LLMs for different tasks
-class MultiModelModule < DSPy::Module
-  def initialize
-    super
-    # Fast, cheap model for simple classification
-    @fast_predictor = create_predictor(
-      'openai/gpt-4o-mini',
-      SimpleClassificationSignature
-    )
-    # Powerful model for complex analysis
-    @powerful_predictor = create_predictor(
-      'anthropic/claude-3-5-sonnet-20241022',
-      ComplexAnalysisSignature
-    )
-  end
-  def forward(input, use_complex: false)
-    if use_complex
-      @powerful_predictor.forward(input)
-    else
-      @fast_predictor.forward(input)
+    tools = [SearchTool.new, FinishTool.new]
+    @agent = DSPy::ReAct.new(ResearchSignature, tools: tools)
+    # Set default model for all internal predictors
+    @agent.configure { |c| c.lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash', structured_outputs: true) }
+    # Override specific predictor with a more capable model
+    @agent.configure_predictor('thought_generator') do |c|
+      c.lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514', structured_outputs: true)
     end
   end
private def forward(question:)
@agent.call(question: question)
def create_predictor(model, signature)
lm = DSPy::LM.new(model, api_key: ENV["#{model.split('/').first.upcase}_API_KEY"])
DSPy::Predict.new(signature, lm: lm)
end end
end end
# Module with caching # Available internal predictors by agent type:
class CachedModule < DSPy::Module # DSPy::ReAct → thought_generator, observation_processor
# DSPy::CodeAct → code_generator, observation_processor
# DSPy::DeepSearch → seed_predictor, search_predictor, reader_predictor, reason_predictor
# --- Module with Event Subscriptions ---
class TokenTrackingModule < DSPy::Module
subscribe 'lm.tokens', :track_tokens, scope: :descendants
def initialize def initialize
super super
@predictor = DSPy::Predict.new(CachedSignature) @predictor = DSPy::Predict.new(AnalysisSignature)
@cache = {} @total_tokens = 0
end end
def forward(input) def forward(query:)
# Create cache key from input @predictor.call(query: query)
cache_key = create_cache_key(input)
# Return cached result if available
if @cache.key?(cache_key)
puts "Cache hit for #{cache_key}"
return @cache[cache_key]
end
# Compute and cache result
result = @predictor.forward(input)
@cache[cache_key] = result
result
end end
def clear_cache! def track_tokens(_event, attrs)
@cache.clear @total_tokens += attrs.fetch(:total_tokens, 0)
end end
private def token_usage
@total_tokens
def create_cache_key(input)
# Create deterministic hash from input
Digest::MD5.hexdigest(input.to_s)
end end
end end
# Usage Examples: # Module-scoped subscriptions automatically scope to the module instance and descendants.
# # Use scope: :self_only to restrict delivery to the module itself (ignoring children).
# Basic usage:
# module = BasicModule.new # --- Tool That Wraps a Prediction ---
# result = module.forward(field_name: "value")
# class RerankTool < DSPy::Tools::Base
# Chain of Thought: tool_name "rerank"
# module = ChainOfThoughtModule.new tool_description "Score and rank search results by relevance"
# result = module.forward(
# email_subject: "Can't log in", MAX_ITEMS = 200
# email_body: "I'm unable to access my account" MIN_ITEMS_FOR_LLM = 5
# )
# puts result[:reasoning] sig { params(query: String, items: T::Array[T::Hash[Symbol, T.untyped]]).returns(T::Hash[Symbol, T.untyped]) }
# def call(query:, items: [])
# Multi-step pipeline: # Short-circuit: skip LLM for small sets
# pipeline = MultiStepPipeline.new return { scored_items: items, reranked: false } if items.size < MIN_ITEMS_FOR_LLM
# result = pipeline.forward(input_data)
# # Cap to prevent token overflow
# With error handling: capped_items = items.first(MAX_ITEMS)
# module = RobustModule.new
# begin predictor = DSPy::Predict.new(RerankSignature)
# result = module.forward(input_data) predictor.configure { |c| c.lm = DSPy::LM.new("ruby_llm/gemini-2.5-flash", structured_outputs: true) }
# rescue DSPy::ValidationError => e
# puts "Failed after retries: #{e.message}" result = predictor.call(query: query, items: capped_items)
# end { scored_items: result.scored_items, reranked: true }
# rescue => e
# Agent with tools: Rails.logger.warn "[RerankTool] LLM rerank failed: #{e.message}"
# agent = AgentModule.new { error: "Rerank failed: #{e.message}", scored_items: items, reranked: false }
# result = agent.forward(task: "Find the population of Tokyo") end
# end
# Stateful processing:
# module = StatefulModule.new # Key patterns for tools wrapping predictions:
# result1 = module.forward("First input") # - Short-circuit LLM calls when unnecessary (small data, trivial cases)
# result2 = module.forward("Second input") # Has context from first # - Cap input size to prevent token overflow
# module.reset! # Clear history # - Per-tool model selection via configure
# # - Graceful error handling with fallback data
# With caching:
# module = CachedModule.new # --- Multi-Step Pipeline ---
# result1 = module.forward(input) # Computes result
# result2 = module.forward(input) # Returns cached result class AnalysisPipeline < DSPy::Module
def initialize
super
@classifier = DSPy::Predict.new(ClassifySignature)
@analyzer = DSPy::ChainOfThought.new(AnalyzeSignature)
@summarizer = DSPy::Predict.new(SummarizeSignature)
end
def forward(text:)
classification = @classifier.call(text: text)
analysis = @analyzer.call(text: text, category: classification.category)
@summarizer.call(analysis: analysis.reasoning, category: classification.category)
end
end
# --- Observability with Spans ---
class TracedModule < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(AnalysisSignature)
end
def forward(query:)
DSPy::Context.with_span(
operation: "traced_module.analyze",
"dspy.module" => self.class.name,
"query.length" => query.length.to_s
) do
@predictor.call(query: query)
end
end
end


@@ -1,143 +1,221 @@
# frozen_string_literal: true # frozen_string_literal: true
# Example DSPy Signature Template # =============================================================================
# This template demonstrates best practices for creating type-safe signatures # DSPy.rb Signature Template — v0.34.3 API
#
# Signatures define the interface between your application and LLMs.
# They specify inputs, outputs, and task descriptions using Sorbet types.
#
# Key patterns:
# - Use T::Enum classes for controlled outputs (not inline T.enum([...]))
# - Use description: kwarg on fields to guide the LLM
# - Use default values for optional fields
# - Use Date/DateTime/Time for temporal data (auto-converted)
# - Access results with result.field (not result[:field])
# - Invoke with predictor.call() (not predictor.forward())
# =============================================================================
class ExampleSignature < DSPy::Signature # --- Basic Signature ---
# Clear, specific description of what this signature does
# Good: "Classify customer support emails into Technical, Billing, or General categories"
# Avoid: "Classify emails"
description "Describe what this signature accomplishes and what output it produces"
# Input fields: Define what data the LLM receives class SentimentAnalysis < DSPy::Signature
input do description "Analyze sentiment of text"
# Basic field with description
const :field_name, String, desc: "Clear description of this input field"
# Numeric fields class Sentiment < T::Enum
const :count, Integer, desc: "Number of items to process" enums do
const :score, Float, desc: "Confidence score between 0.0 and 1.0" Positive = new('positive')
Negative = new('negative')
# Boolean fields Neutral = new('neutral')
const :is_active, T::Boolean, desc: "Whether the item is currently active" end
end
# Array fields
const :tags, T::Array[String], desc: "List of tags associated with the item" input do
const :text, String
# Optional: Enum for constrained values
const :priority, T.enum(["Low", "Medium", "High"]), desc: "Priority level"
end end
# Output fields: Define what data the LLM produces
output do output do
# Primary output const :sentiment, Sentiment
const :result, String, desc: "The main result of the operation" const :score, Float, description: "Confidence score from 0.0 to 1.0"
# Classification result with enum
const :category, T.enum(["Technical", "Billing", "General"]),
desc: "Category classification - must be one of: Technical, Billing, General"
# Confidence/metadata
const :confidence, Float, desc: "Confidence score (0.0-1.0) for this classification"
# Optional reasoning (automatically added by ChainOfThought)
# const :reasoning, String, desc: "Step-by-step reasoning for the classification"
end end
end end
# Example with multimodal input (vision) # Usage:
class VisionExampleSignature < DSPy::Signature # predictor = DSPy::Predict.new(SentimentAnalysis)
# result = predictor.call(text: "This product is amazing!")
# result.sentiment # => Sentiment::Positive
# result.score # => 0.92
# --- Signature with Date/Time Types ---
class EventScheduler < DSPy::Signature
description "Schedule events based on requirements"
input do
const :event_name, String
const :start_date, Date # ISO 8601: YYYY-MM-DD
const :end_date, T.nilable(Date) # Optional date
const :preferred_time, DateTime # ISO 8601 with timezone
const :deadline, Time # Stored as UTC
end
output do
const :scheduled_date, Date # LLM returns ISO string, auto-converted
const :event_datetime, DateTime # Preserves timezone
const :created_at, Time # Converted to UTC
end
end
# Date/Time format handling:
# Date → ISO 8601 (YYYY-MM-DD)
# DateTime → ISO 8601 with timezone (YYYY-MM-DDTHH:MM:SS+00:00)
# Time → ISO 8601, automatically converted to UTC
# --- Signature with Default Values ---
class SmartSearch < DSPy::Signature
description "Search with intelligent defaults"
input do
const :query, String
const :max_results, Integer, default: 10
const :language, String, default: "English"
const :include_metadata, T::Boolean, default: false
end
output do
const :results, T::Array[String]
const :total_found, Integer
const :search_time_ms, Float, default: 0.0 # Fallback if LLM omits
const :cached, T::Boolean, default: false
end
end
# Input defaults reduce boilerplate:
# search = DSPy::Predict.new(SmartSearch)
# result = search.call(query: "Ruby programming")
# # max_results=10, language="English", include_metadata=false are applied
# --- Signature with Nested Structs and Field Descriptions ---
class EntityExtraction < DSPy::Signature
description "Extract named entities from text"
class EntityType < T::Enum
enums do
Person = new('person')
Organization = new('organization')
Location = new('location')
DateEntity = new('date')
end
end
class Entity < T::Struct
const :name, String, description: "The entity text as it appears in the source"
const :type, EntityType
const :confidence, Float, description: "Extraction confidence from 0.0 to 1.0"
const :start_offset, Integer, default: 0
end
input do
const :text, String
const :entity_types, T::Array[EntityType], default: [],
description: "Filter to these entity types; empty means all types"
end
output do
const :entities, T::Array[Entity]
const :total_found, Integer
end
end
# --- Signature with Union Types ---
class FlexibleClassification < DSPy::Signature
description "Classify input with flexible result type"
class Category < T::Enum
enums do
Technical = new('technical')
Business = new('business')
Personal = new('personal')
end
end
input do
const :text, String
end
output do
const :category, Category
const :result, T.any(Float, String),
description: "Numeric score or text explanation depending on classification"
const :confidence, Float
end
end
# --- Signature with Recursive Types ---
class DocumentParser < DSPy::Signature
description "Parse document into tree structure"
class NodeType < T::Enum
enums do
Heading = new('heading')
Paragraph = new('paragraph')
List = new('list')
CodeBlock = new('code_block')
end
end
class TreeNode < T::Struct
const :node_type, NodeType, description: "The type of document element"
const :text, String, default: "", description: "Text content of the node"
const :level, Integer, default: 0
const :children, T::Array[TreeNode], default: [] # Self-reference → $defs in JSON Schema
end
input do
const :html, String, description: "Raw HTML to parse"
end
output do
const :root, TreeNode
const :word_count, Integer
end
end
# The schema generator creates #/$defs/TreeNode references for recursive types,
# compatible with OpenAI and Gemini structured outputs.
# Use `default: []` instead of `T.nilable(T::Array[...])` for OpenAI compatibility.
# --- Vision Signature ---
class ImageAnalysis < DSPy::Signature
description "Analyze an image and answer questions about its content" description "Analyze an image and answer questions about its content"
input do input do
const :image, DSPy::Image, desc: "The image to analyze" const :image, DSPy::Image, description: "The image to analyze"
const :question, String, desc: "Question about the image content" const :question, String, description: "Question about the image content"
end end
output do output do
const :answer, String, desc: "Detailed answer to the question about the image" const :answer, String
const :confidence, Float, desc: "Confidence in the answer (0.0-1.0)" const :confidence, Float, description: "Confidence in the answer (0.0-1.0)"
end end
end end
# Example for complex analysis task # Vision usage:
class SentimentAnalysisSignature < DSPy::Signature # predictor = DSPy::Predict.new(ImageAnalysis)
description "Analyze the sentiment of text with nuanced emotion detection" # result = predictor.call(
input do
const :text, String, desc: "The text to analyze for sentiment"
const :context, String, desc: "Additional context about the text source or situation"
end
output do
const :sentiment, T.enum(["Positive", "Negative", "Neutral", "Mixed"]),
desc: "Overall sentiment - must be Positive, Negative, Neutral, or Mixed"
const :emotions, T::Array[String],
desc: "List of specific emotions detected (e.g., joy, anger, sadness, fear)"
const :intensity, T.enum(["Low", "Medium", "High"]),
desc: "Intensity of the detected sentiment"
const :confidence, Float,
desc: "Confidence in the sentiment classification (0.0-1.0)"
end
end
# Example for code generation task
class CodeGenerationSignature < DSPy::Signature
description "Generate Ruby code based on natural language requirements"
input do
const :requirements, String,
desc: "Natural language description of what the code should do"
const :constraints, String,
desc: "Any specific requirements or constraints (e.g., libraries to use, style preferences)"
end
output do
const :code, String,
desc: "Complete, working Ruby code that fulfills the requirements"
const :explanation, String,
desc: "Brief explanation of how the code works and any important design decisions"
const :dependencies, T::Array[String],
desc: "List of required gems or dependencies"
end
end
# Usage Examples:
#
# Basic usage with Predict:
# predictor = DSPy::Predict.new(ExampleSignature)
# result = predictor.forward(
# field_name: "example value",
# count: 5,
# score: 0.85,
# is_active: true,
# tags: ["tag1", "tag2"],
# priority: "High"
# )
# puts result[:result]
# puts result[:category]
# puts result[:confidence]
#
# With Chain of Thought reasoning:
# predictor = DSPy::ChainOfThought.new(SentimentAnalysisSignature)
# result = predictor.forward(
# text: "I absolutely love this product! It exceeded all my expectations.",
# context: "Product review on e-commerce site"
# )
# puts result[:reasoning] # See the LLM's step-by-step thinking
# puts result[:sentiment]
# puts result[:emotions]
#
# With Vision:
# predictor = DSPy::Predict.new(VisionExampleSignature)
# result = predictor.forward(
# image: DSPy::Image.from_file("path/to/image.jpg"), # image: DSPy::Image.from_file("path/to/image.jpg"),
# question: "What objects are visible in this image?" # question: "What objects are visible?"
# ) # )
# puts result[:answer] # result.answer # => "The image shows..."
# --- Accessing Schemas Programmatically ---
#
# SentimentAnalysis.input_json_schema # => { type: "object", properties: { ... } }
# SentimentAnalysis.output_json_schema # => { type: "object", properties: { ... } }
#
# # Field descriptions propagate to JSON Schema
# Entity.field_descriptions[:name] # => "The entity text as it appears in the source"
# Entity.field_descriptions[:confidence] # => "Extraction confidence from 0.0 to 1.0"


@@ -1,265 +1,674 @@
# DSPy.rb Core Concepts # DSPy.rb Core Concepts
## Philosophy
DSPy.rb enables developers to **program LLMs, not prompt them**. Instead of manually crafting prompts, define application requirements through code using type-safe, composable modules.
## Signatures ## Signatures
Signatures define type-safe input/output contracts for LLM operations. They specify what data goes in and what data comes out, with runtime type checking. Signatures define the interface between application code and language models. They specify inputs, outputs, and a task description using Sorbet types for compile-time and runtime type safety.
### Basic Signature Structure ### Structure
```ruby ```ruby
class TaskSignature < DSPy::Signature class ClassifyEmail < DSPy::Signature
description "Brief description of what this signature does" description "Classify customer support emails by urgency and category"
input do input do
const :field_name, String, desc: "Description of this input field" const :subject, String
const :another_field, Integer, desc: "Another input field" const :body, String
end end
output do output do
const :result_field, String, desc: "Description of the output" const :category, String
const :confidence, Float, desc: "Confidence score (0.0-1.0)" const :urgency, String
end end
end end
``` ```
### Type Safety ### Supported Types
Signatures support Sorbet types including: | Type | JSON Schema | Notes |
- `String` - Text data |------|-------------|-------|
- `Integer`, `Float` - Numeric data | `String` | `string` | Required string |
- `T::Boolean` - Boolean values | `Integer` | `integer` | Whole numbers |
- `T::Array[Type]` - Arrays of specific types | `Float` | `number` | Decimal numbers |
- Custom enums and classes | `T::Boolean` | `boolean` | true/false |
| `T::Array[X]` | `array` | Typed arrays |
| `T::Hash[K, V]` | `object` | Typed key-value maps |
| `T.nilable(X)` | nullable | Optional fields |
| `Date` | `string` (ISO 8601) | Auto-converted |
| `DateTime` | `string` (ISO 8601) | Preserves timezone |
| `Time` | `string` (ISO 8601) | Converted to UTC |
### Date and Time Types
Date, DateTime, and Time fields serialize to ISO 8601 strings and auto-convert back to Ruby objects on output.
```ruby
class EventScheduler < DSPy::Signature
description "Schedule events based on requirements"
input do
const :start_date, Date # ISO 8601: YYYY-MM-DD
const :preferred_time, DateTime # ISO 8601 with timezone
const :deadline, Time # Converted to UTC
const :end_date, T.nilable(Date) # Optional date
end
output do
const :scheduled_date, Date # String from LLM, auto-converted to Date
const :event_datetime, DateTime # Preserves timezone info
const :created_at, Time # Converted to UTC
end
end
predictor = DSPy::Predict.new(EventScheduler)
result = predictor.call(
start_date: "2024-01-15",
preferred_time: "2024-01-15T10:30:45Z",
deadline: Time.now,
end_date: nil
)
result.scheduled_date.class # => Date
result.event_datetime.class # => DateTime
```
Timezone conventions follow ActiveRecord: Time objects convert to UTC, DateTime objects preserve timezone, Date objects are timezone-agnostic.
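The conventions above can be reproduced with Ruby's standard library alone. The sketch below is illustrative only: `coerce_temporal` is a hypothetical helper, not part of DSPy.rb, and it assumes the LLM returns well-formed ISO 8601 strings.

```ruby
require "date"
require "time"

# Hypothetical helper (not a DSPy.rb API): converts an ISO 8601 string
# into the Ruby object implied by the declared field type.
def coerce_temporal(type, iso_string)
  if type == DateTime
    DateTime.iso8601(iso_string)   # keeps the offset found in the string
  elsif type == Date
    Date.iso8601(iso_string)       # calendar day, no timezone
  elsif type == Time
    Time.iso8601(iso_string).utc   # normalized to UTC
  else
    raise ArgumentError, "unsupported temporal type: #{type}"
  end
end

puts coerce_temporal(Date, "2024-01-15").iso8601
puts coerce_temporal(DateTime, "2024-01-15T10:30:45+02:00").iso8601
puts coerce_temporal(Time, "2024-01-15T10:30:45+02:00").iso8601
```

The `DateTime` branch is checked before `Date` only for readability; both comparisons use class equality, so the subclass relationship between the two does not affect the result.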
### Enums with T::Enum
Define constrained output values using `T::Enum` classes. Do not use inline `T.enum([...])` syntax.
```ruby
class SentimentAnalysis < DSPy::Signature
description "Analyze sentiment of text"
class Sentiment < T::Enum
enums do
Positive = new('positive')
Negative = new('negative')
Neutral = new('neutral')
end
end
input do
const :text, String
end
output do
const :sentiment, Sentiment
const :confidence, Float
end
end
predictor = DSPy::Predict.new(SentimentAnalysis)
result = predictor.call(text: "This product is amazing!")
result.sentiment # => #<Sentiment::Positive>
result.sentiment.serialize # => "positive"
result.confidence # => 0.92
```
Enum matching is case-insensitive. The LLM returning `"POSITIVE"` matches `new('positive')`.
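As a rough illustration of what case-insensitive matching means here, the following plain-Ruby sketch compares a raw model response against a list of serialized enum values. `match_enum` and `SENTIMENT_VALUES` are hypothetical names used for this example; the sketch does not reproduce DSPy.rb's actual coercion code.

```ruby
# Serialized values, mirroring new('positive') etc. in the enum above.
SENTIMENT_VALUES = %w[positive negative neutral].freeze

# Hypothetical helper: returns the canonical serialized value that matches
# the raw LLM string regardless of case, or nil when nothing matches.
def match_enum(raw, allowed = SENTIMENT_VALUES)
  candidate = raw.to_s.strip
  allowed.find { |value| value.casecmp?(candidate) }
end

p match_enum("POSITIVE")
p match_enum("Neutral ")
p match_enum("mixed")
```

Returning `nil` for an unknown value leaves the decision (raise, retry, or fall back) to the caller.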
### Default Values
Default values work on both inputs and outputs. Input defaults reduce caller boilerplate. Output defaults provide fallbacks when the LLM omits optional fields.
```ruby
class SmartSearch < DSPy::Signature
description "Search with intelligent defaults"
input do
const :query, String
const :max_results, Integer, default: 10
const :language, String, default: "English"
end
output do
const :results, T::Array[String]
const :total_found, Integer
const :cached, T::Boolean, default: false
end
end
search = DSPy::Predict.new(SmartSearch)
result = search.call(query: "Ruby programming")
# max_results defaults to 10, language defaults to "English"
# If LLM omits `cached`, it defaults to false
```
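Conceptually, an output default behaves like a hash merge in which values supplied by the LLM take precedence. The sketch below shows that idea in plain Ruby; `OUTPUT_DEFAULTS` and `apply_output_defaults` are hypothetical names, and the real library applies defaults through the signature's struct definition rather than through a helper like this.

```ruby
# Defaults declared for optional output fields (assumed for illustration).
OUTPUT_DEFAULTS = { cached: false }.freeze

# Hypothetical helper: fills in missing output fields with their defaults.
# Keys present in the LLM output always win over the defaults.
def apply_output_defaults(llm_output, defaults = OUTPUT_DEFAULTS)
  defaults.merge(llm_output.transform_keys(&:to_sym))
end

p apply_output_defaults({ "results" => ["Ruby docs"], "total_found" => 1 })
p apply_output_defaults({ "results" => [], "total_found" => 0, "cached" => true })
```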
### Field Descriptions ### Field Descriptions
Always provide clear field descriptions using the `desc:` parameter. These descriptions: Add `description:` to any field to guide the LLM on expected content. These descriptions appear in the generated JSON schema sent to the model.
- Guide the LLM on expected input/output format
- Serve as documentation for developers ```ruby
- Improve prediction accuracy class ASTNode < T::Struct
const :node_type, String, description: "The type of AST node (heading, paragraph, code_block)"
const :text, String, default: "", description: "Text content of the node"
const :level, Integer, default: 0, description: "Heading level 1-6, only for heading nodes"
const :children, T::Array[ASTNode], default: []
end
ASTNode.field_descriptions[:node_type] # => "The type of AST node ..."
ASTNode.field_descriptions[:children] # => nil (no description set)
```
Field descriptions also work inside signature `input` and `output` blocks:
```ruby
class ExtractEntities < DSPy::Signature
description "Extract named entities from text"
input do
const :text, String, description: "Raw text to analyze"
const :language, String, default: "en", description: "ISO 639-1 language code"
end
output do
const :entities, T::Array[String], description: "List of extracted entity names"
const :count, Integer, description: "Total number of unique entities found"
end
end
```
### Schema Formats
DSPy.rb supports three schema formats for communicating type structure to LLMs.
#### JSON Schema (default)
Verbose but universally supported. Access via `YourSignature.output_json_schema`.
#### BAML Schema
Compact format that reduces schema tokens by 80-85%. Requires the `sorbet-baml` gem.
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
schema_format: :baml
)
end
```
BAML applies only in Enhanced Prompting mode (`structured_outputs: false`). When `structured_outputs: true`, the provider receives JSON Schema directly.
#### TOON Schema + Data Format
Table-oriented text format that shrinks both schema definitions and prompt values.
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
schema_format: :toon,
data_format: :toon
)
end
```
`schema_format: :toon` replaces the schema block in the system prompt. `data_format: :toon` renders input values and output templates inside `toon` fences. Only works with Enhanced Prompting mode. The `sorbet-toon` gem is included automatically as a dependency.
### Recursive Types
Structs that reference themselves produce `$defs` entries in the generated JSON schema, using `$ref` pointers to avoid infinite recursion.
```ruby
class ASTNode < T::Struct
const :node_type, String
const :text, String, default: ""
const :children, T::Array[ASTNode], default: []
end
```
The schema generator detects the self-reference in `T::Array[ASTNode]` and emits:
```json
{
"$defs": {
"ASTNode": { "type": "object", "properties": { ... } }
},
"properties": {
"children": {
"type": "array",
"items": { "$ref": "#/$defs/ASTNode" }
}
}
}
```
Access the schema with accumulated definitions via `YourSignature.output_json_schema_with_defs`.
### Union Types with T.any()
Specify fields that accept multiple types:
```ruby
output do
const :result, T.any(Float, String)
end
```
For struct unions, DSPy.rb automatically adds a `_type` discriminator field to each struct's JSON schema. The LLM returns `_type` in its response, and DSPy converts the hash to the correct struct instance.
```ruby
class CreateTask < T::Struct
const :title, String
const :priority, String
end
class DeleteTask < T::Struct
const :task_id, String
const :reason, T.nilable(String)
end
class TaskRouter < DSPy::Signature
description "Route user request to the appropriate task action"
input do
const :request, String
end
output do
const :action, T.any(CreateTask, DeleteTask)
end
end
result = DSPy::Predict.new(TaskRouter).call(request: "Create a task for Q4 review")
result.action.class # => CreateTask
result.action.title # => "Q4 Review"
```
Pattern matching works on the result:
```ruby
case result.action
when CreateTask then puts "Creating: #{result.action.title}"
when DeleteTask then puts "Deleting: #{result.action.task_id}"
end
```
Union types also work inside arrays for heterogeneous collections:
```ruby
output do
const :events, T::Array[T.any(LoginEvent, PurchaseEvent)]
end
```
Limit unions to 2-4 types for reliable LLM comprehension. Use clear struct names since they become the `_type` discriminator values.
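To make the `_type` discriminator concrete, here is a minimal plain-Ruby sketch of dispatching a response hash to the right struct. The `CreateTaskStub` and `DeleteTaskStub` structs, the `UNION_MEMBERS` table, and `coerce_union` are stand-ins invented for this example; DSPy.rb performs the equivalent conversion internally for `T::Struct` types.

```ruby
# Stand-ins for the T::Struct classes in the example above.
CreateTaskStub = Struct.new(:title, :priority, keyword_init: true)
DeleteTaskStub = Struct.new(:task_id, :reason, keyword_init: true)

# Maps each discriminator value to the struct it should build.
UNION_MEMBERS = {
  "CreateTask" => CreateTaskStub,
  "DeleteTask" => DeleteTaskStub
}.freeze

# Hypothetical helper: reads `_type`, picks the struct, builds an instance.
def coerce_union(payload)
  attrs = payload.transform_keys(&:to_sym)
  type_name = attrs.delete(:_type)
  struct_class = UNION_MEMBERS.fetch(type_name) do
    raise ArgumentError, "unknown _type: #{type_name.inspect}"
  end
  struct_class.new(**attrs)
end

action = coerce_union(
  { "_type" => "CreateTask", "title" => "Q4 Review", "priority" => "high" }
)
puts action.class
puts action.title
```

Because the discriminator value is looked up by name, short and unambiguous struct names make it easier for the model to pick the right one.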
---
## Modules ## Modules
Modules are composable building blocks that use signatures to perform LLM operations. They can be chained together to create complex workflows. Modules are composable building blocks that wrap predictors. Define a `forward` method; invoke the module with `.call()`.
### Basic Module Structure ### Basic Structure
```ruby ```ruby
class MyModule < DSPy::Module class SentimentAnalyzer < DSPy::Module
def initialize def initialize
super super
@predictor = DSPy::Predict.new(MySignature) @predictor = DSPy::Predict.new(SentimentSignature)
end end
def forward(input_hash) def forward(text:)
@predictor.forward(input_hash) @predictor.call(text: text)
end end
end end
analyzer = SentimentAnalyzer.new
result = analyzer.call(text: "I love this product!")
result.sentiment # => "positive"
result.confidence # => 0.9
``` ```
**API rules:**
- Invoke modules and predictors with `.call()`, not `.forward()`.
- Access result fields with `result.field`, not `result[:field]`.
### Module Composition ### Module Composition
Modules can call other modules to create pipelines: Combine multiple modules through explicit method calls in `forward`:
```ruby ```ruby
class ComplexWorkflow < DSPy::Module class DocumentProcessor < DSPy::Module
def initialize def initialize
super super
@step1 = FirstModule.new @classifier = DocumentClassifier.new
@step2 = SecondModule.new @summarizer = DocumentSummarizer.new
end end
def forward(input) def forward(document:)
result1 = @step1.forward(input) classification = @classifier.call(content: document)
result2 = @step2.forward(result1) summary = @summarizer.call(content: document)
result2
{
document_type: classification.document_type,
summary: summary.summary
}
end end
end end
``` ```
### Lifecycle Callbacks
Modules support `before`, `after`, and `around` callbacks on `forward`. Declare them as class-level macros referencing private methods.
#### Execution order
1. `before` callbacks (in registration order)
2. `around` callbacks (before `yield`)
3. `forward` method
4. `around` callbacks (after `yield`)
5. `after` callbacks (in registration order)
```ruby
class InstrumentedModule < DSPy::Module
before :setup_metrics
after :log_metrics
around :manage_context
def initialize
super
@predictor = DSPy::Predict.new(MySignature)
@metrics = {}
end
def forward(question:)
@predictor.call(question: question)
end
private
def setup_metrics
@metrics[:start_time] = Time.now
end
def manage_context
load_context
result = yield
save_context
result
end
def log_metrics
@metrics[:duration] = Time.now - @metrics[:start_time]
end
end
```
Multiple callbacks of the same type execute in registration order. Callbacks inherit from parent classes; parent callbacks run first.
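The execution order listed above can be observed with a small self-contained class. This is a sketch that hard-codes one callback of each kind; `CallbackOrderDemo` is a hypothetical class and does not use or reproduce DSPy.rb's callback machinery.

```ruby
# Hypothetical demo class: records the order in which each stage runs.
class CallbackOrderDemo
  attr_reader :log

  def initialize
    @log = []
  end

  # Mirrors the documented order: before, around (pre-yield), forward,
  # around (post-yield), after.
  def call(text)
    before_hook
    result = around_hook { forward(text) }
    after_hook
    result
  end

  private

  def forward(text)
    @log << :forward
    text.upcase
  end

  def before_hook
    @log << :before
  end

  def around_hook
    @log << :around_before
    result = yield          # runs the wrapped forward
    @log << :around_after
    result                  # the around hook must return the result
  end

  def after_hook
    @log << :after
  end
end

demo = CallbackOrderDemo.new
puts demo.call("hello")
p demo.log
```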
#### Around callbacks
Around callbacks must call `yield` to execute the wrapped method and return the result:
```ruby
def with_retry
retries = 0
begin
yield
rescue StandardError => e
retries += 1
retry if retries < 3
raise e
end
end
```
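A runnable variant of the retry pattern, outside any module, makes the attempt count easy to verify. The `max_attempts:` keyword and the simulated failure are assumptions added for this sketch.

```ruby
# Retries the block until it succeeds or max_attempts is reached,
# then re-raises the last error.
def with_retry(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

calls = 0
value = with_retry do
  calls += 1
  raise "transient failure" if calls < 3  # fails twice, then succeeds
  "ok"
end

puts value
puts calls
```

Using a bare `raise` in the rescue clause re-raises the current exception with its original backtrace.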
### Instruction Update Contract
Teleprompters (GEPA, MIPROv2) require modules to expose immutable update hooks. Include `DSPy::Mixins::InstructionUpdatable` and implement `with_instruction` and `with_examples`, each returning a new instance:
```ruby
class SentimentPredictor < DSPy::Module
include DSPy::Mixins::InstructionUpdatable
def initialize
super
@predictor = DSPy::Predict.new(SentimentSignature)
end
def with_instruction(instruction)
clone = self.class.new
clone.instance_variable_set(:@predictor, @predictor.with_instruction(instruction))
clone
end
def with_examples(examples)
clone = self.class.new
clone.instance_variable_set(:@predictor, @predictor.with_examples(examples))
clone
end
end
```
If a module omits these hooks, teleprompters raise `DSPy::InstructionUpdateError` instead of silently mutating state.
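The key property of the contract is that updates return a new object and leave the original untouched. The plain-Ruby sketch below illustrates that property in isolation; `PromptHolder` is a hypothetical class and stands in for neither `DSPy::Predict` nor the mixin.

```ruby
# Hypothetical value object showing immutable "with_*" updates.
class PromptHolder
  attr_reader :instruction, :examples

  def initialize(instruction:, examples: [])
    @instruction = instruction
    @examples = examples.dup.freeze
    freeze # any later mutation attempt raises FrozenError
  end

  # Returns a new instance; the receiver is unchanged.
  def with_instruction(new_instruction)
    self.class.new(instruction: new_instruction, examples: examples)
  end

  # Returns a new instance; the receiver is unchanged.
  def with_examples(new_examples)
    self.class.new(instruction: instruction, examples: new_examples)
  end
end

original = PromptHolder.new(instruction: "Classify sentiment")
updated = original.with_instruction("Classify sentiment as positive or negative")

puts original.instruction
puts updated.instruction
```

Freezing the instance is optional, but it turns accidental in-place mutation into an immediate error during optimization runs.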
---
## Predictors ## Predictors
Predictors are the core execution engines that take signatures and perform LLM inference. DSPy.rb provides several predictor types. Predictors are execution engines that take a signature and produce structured results from a language model. DSPy.rb provides four predictor types.
### Predict ### Predict
Basic LLM inference with type-safe inputs and outputs. Direct LLM call with typed input/output. Fastest option, lowest token usage.
```ruby ```ruby
predictor = DSPy::Predict.new(TaskSignature) classifier = DSPy::Predict.new(ClassifyText)
result = predictor.forward(field_name: "value", another_field: 42) result = classifier.call(text: "Technical document about APIs")
# Returns: { result_field: "...", confidence: 0.85 }
result.sentiment # => #<Sentiment::Positive>
result.topics # => ["APIs", "technical"]
result.confidence # => 0.92
``` ```
### ChainOfThought ### ChainOfThought
Automatically adds a reasoning field to the output, improving accuracy for complex tasks. Adds a `reasoning` field to the output automatically. The model generates step-by-step reasoning before the final answer. Do not define a `:reasoning` field in the signature output when using ChainOfThought.
```ruby ```ruby
class EmailClassificationSignature < DSPy::Signature class SolveMathProblem < DSPy::Signature
description "Classify customer support emails" description "Solve mathematical word problems step by step"
input do input do
const :email_subject, String const :problem, String
const :email_body, String
end end
output do output do
const :category, String # "Technical", "Billing", or "General" const :answer, String
const :priority, String # "High", "Medium", or "Low" # :reasoning is added automatically by ChainOfThought
end end
end end
predictor = DSPy::ChainOfThought.new(EmailClassificationSignature) solver = DSPy::ChainOfThought.new(SolveMathProblem)
result = predictor.forward( result = solver.call(problem: "Sarah has 15 apples. She gives 7 away and buys 12 more.")
email_subject: "Can't log in to my account",
email_body: "I've been trying to access my account for hours..." result.reasoning # => "Step by step: 15 - 7 = 8, then 8 + 12 = 20"
) result.answer # => "20 apples"
# Returns: {
# reasoning: "This appears to be a technical issue...",
# category: "Technical",
# priority: "High"
# }
``` ```
Use ChainOfThought for complex analysis, multi-step reasoning, or when explainability matters.
### ReAct
Reasoning + Action agent that uses tools in an iterative loop. Define tools by subclassing `DSPy::Tools::Base`. Group related tools with `DSPy::Tools::Toolset`.
```ruby
class WeatherTool < DSPy::Tools::Base
  extend T::Sig

  tool_name "weather"
  tool_description "Get weather information for a location"

  sig { params(location: String).returns(String) }
  def call(location:)
    { location: location, temperature: 72, condition: "sunny" }.to_json
  end
end

class TravelSignature < DSPy::Signature
  description "Help users plan travel"

  input do
    const :destination, String
  end

  output do
    const :recommendations, String
  end
end

agent = DSPy::ReAct.new(
  TravelSignature,
  tools: [WeatherTool.new],
  max_iterations: 5
)

result = agent.call(destination: "Tokyo, Japan")
result.recommendations  # => "Visit Senso-ji Temple early morning..."
result.history          # => Array of reasoning steps, actions, observations
result.iterations       # => 3
result.tools_used       # => ["weather"]
```
Use toolsets to expose multiple tool methods from a single class:
```ruby
text_tools = DSPy::Tools::TextProcessingToolset.to_tools
agent = DSPy::ReAct.new(MySignature, tools: text_tools)
```
### CodeAct
Think-Code-Observe agent that synthesizes and executes Ruby code. Ships as a separate gem.
```ruby
# Gemfile
gem 'dspy-code_act', '~> 0.29'
```
## Multimodal Support ```ruby
programmer = DSPy::CodeAct.new(ProgrammingSignature, max_iterations: 10)
result = programmer.call(task: "Calculate the factorial of 20")
```
DSPy.rb supports vision capabilities across compatible models using the unified `DSPy::Image` interface. ### Predictor Comparison
| Predictor | Speed | Token Usage | Best For |
|-----------|-------|-------------|----------|
| Predict | Fastest | Low | Classification, extraction |
| ChainOfThought | Moderate | Medium-High | Complex reasoning, analysis |
| ReAct | Slower | High | Multi-step tasks with tools |
| CodeAct | Slowest | Very High | Dynamic programming, calculations |
### Concurrent Predictions
Process multiple independent predictions simultaneously using `Async::Barrier`:
```ruby ```ruby
class VisionSignature < DSPy::Signature require 'async'
description "Describe what's in an image" require 'async/barrier'
input do analyzer = DSPy::Predict.new(ContentAnalyzer)
const :image, DSPy::Image documents = ["Text one", "Text two", "Text three"]
const :question, String
Async do
barrier = Async::Barrier.new
tasks = documents.map do |doc|
barrier.async { analyzer.call(content: doc) }
end end
output do barrier.wait
const :description, String predictions = tasks.map(&:wait)
end
end
predictor = DSPy::Predict.new(VisionSignature) predictions.each { |p| puts p.sentiment }
result = predictor.forward(
image: DSPy::Image.from_file("path/to/image.jpg"),
question: "What objects are visible in this image?"
)
```
### Image Input Methods
```ruby
# From file path
DSPy::Image.from_file("path/to/image.jpg")
# From URL (OpenAI only)
DSPy::Image.from_url("https://example.com/image.jpg")
# From base64-encoded data
DSPy::Image.from_base64(base64_string, mime_type: "image/jpeg")
```
## Best Practices
### 1. Clear Signature Descriptions
Always provide clear, specific descriptions for signatures and fields:
```ruby
# Good
description "Classify customer support emails into Technical, Billing, or General categories"
# Avoid
description "Classify emails"
```
### 2. Type Safety
Use specific types rather than generic String when possible:
```ruby
# Good - Use enums for constrained outputs
output do
const :category, T.enum(["Technical", "Billing", "General"])
end
# Less ideal - Generic string
output do
const :category, String, desc: "Must be Technical, Billing, or General"
end end
``` ```
### 3. Composable Architecture Add `gem 'async', '~> 2.29'` to the Gemfile. Handle errors within each `barrier.async` block to prevent one failure from cancelling others:
Build complex workflows from simple, reusable modules:
```ruby ```ruby
class EmailPipeline < DSPy::Module barrier.async do
def initialize begin
super analyzer.call(content: doc)
@classifier = EmailClassifier.new rescue StandardError => e
@prioritizer = EmailPrioritizer.new nil
@responder = EmailResponder.new
end
def forward(email)
classification = @classifier.forward(email)
priority = @prioritizer.forward(classification)
@responder.forward(classification.merge(priority))
end end
end end
``` ```
### 4. Error Handling ### Few-Shot Examples and Instruction Tuning
Always handle potential type validation errors:
```ruby ```ruby
begin classifier = DSPy::Predict.new(SentimentAnalysis)
result = predictor.forward(input_data)
rescue DSPy::ValidationError => e examples = [
# Handle validation error DSPy::FewShotExample.new(
logger.error "Invalid output from LLM: #{e.message}" input: { text: "Love it!" },
output: { sentiment: "positive", confidence: 0.95 }
)
]
optimized = classifier.with_examples(examples)
tuned = classifier.with_instruction("Be precise and confident.")
```
---
## Type System
### Automatic Type Conversion
DSPy.rb v0.9.0+ automatically converts LLM JSON responses to typed Ruby objects:
- **Enums**: String values become `T::Enum` instances (case-insensitive)
- **Structs**: Nested hashes become `T::Struct` objects
- **Arrays**: Elements convert recursively
- **Defaults**: Missing fields use declared defaults
### Discriminators for Union Types
When a field uses `T.any()` with struct types, DSPy adds a `_type` field to each struct's schema. On deserialization, `_type` selects the correct struct class:
```json
{
"action": {
"_type": "CreateTask",
"title": "Review Q4 Report"
}
}
```
DSPy matches `"CreateTask"` against the union members and instantiates the correct struct. No manual discriminator field is needed.
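The `_type` lookup described above can be illustrated with a small sketch. It is not DSPy.rb's deserializer: plain `Struct` classes stand in for `T::Struct`, and `build_union_member`, `CreateTask`, and `DeleteTask` are hypothetical names introduced only for this example.

```ruby
# Plain Structs stand in for T::Struct classes; all names here are hypothetical.
CreateTask = Struct.new(:title, keyword_init: true)
DeleteTask = Struct.new(:task_id, keyword_init: true)

# Picks the union member whose class name equals the "_type" value,
# then builds it from the remaining keys.
def build_union_member(payload, members)
  type_name = payload.fetch("_type")
  klass = members.find { |member| member.name == type_name }
  raise ArgumentError, "unknown _type: #{type_name}" unless klass

  fields = payload.reject { |key, _| key == "_type" }
  klass.new(**fields.transform_keys(&:to_sym))
end

action = build_union_member(
  { "_type" => "CreateTask", "title" => "Review Q4 Report" },
  [CreateTask, DeleteTask]
)
puts action.title
```

Because the class name doubles as the discriminator value in this sketch, renaming a struct changes the value the model must emit, which is one reason to choose struct names deliberately.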
### Recursive Types
Structs referencing themselves are supported. The schema generator tracks visited types and produces `$ref` pointers under `$defs`:
```ruby
class TreeNode < T::Struct
const :label, String
const :children, T::Array[TreeNode], default: []
end end
``` ```
## Limitations The generated schema uses `"$ref": "#/$defs/TreeNode"` for the children array items, preventing infinite schema expansion.
Current constraints to be aware of: ### Nesting Depth
- No streaming support (single-request processing only)
- Limited multimodal support through Ollama for local deployments - 1-2 levels: reliable across all providers.
- Vision capabilities vary by provider (see providers.md for compatibility matrix) - 3-4 levels: works but increases schema complexity.
- 5+ levels: may trigger OpenAI depth validation warnings and reduce LLM accuracy. Flatten deeply nested structures or split into multiple signatures.
### Tips
- Prefer `T::Array[X], default: []` over `T.nilable(T::Array[X])` -- the nilable form causes schema issues with OpenAI structured outputs.
- Use clear struct names for union types since they become `_type` discriminator values.
- Limit union types to 2-4 members for reliable model comprehension.
- Check schema compatibility with `DSPy::OpenAI::LM::SchemaConverter.validate_compatibility(schema)`.


@@ -0,0 +1,366 @@
# DSPy.rb Observability
DSPy.rb provides an event-driven observability system built on OpenTelemetry. The system replaces monkey-patching with structured event emission, pluggable listeners, automatic span creation, and non-blocking Langfuse export.
## Event System
### Emitting Events
Emit structured events with `DSPy.event`:
```ruby
DSPy.event('lm.tokens', {
  'gen_ai.system' => 'openai',
  'gen_ai.request.model' => 'gpt-4',
  input_tokens: 150,
  output_tokens: 50,
  total_tokens: 200
})
```
Event names are **strings** with dot-separated namespaces (e.g., `'llm.generate'`, `'react.iteration_complete'`, `'chain_of_thought.reasoning_complete'`). Do not use symbols for event names.
Attributes must be JSON-serializable. DSPy automatically merges context (trace ID, module stack) and creates OpenTelemetry spans.
### Global Subscriptions
Subscribe to events across the entire application with `DSPy.events.subscribe`:
```ruby
# Exact event name
subscription_id = DSPy.events.subscribe('lm.tokens') do |event_name, attrs|
  puts "Tokens used: #{attrs[:total_tokens]}"
end

# Wildcard pattern -- matches llm.generate, llm.stream, etc.
DSPy.events.subscribe('llm.*') do |event_name, attrs|
  track_llm_usage(attrs)
end

# Catch-all wildcard
DSPy.events.subscribe('*') do |event_name, attrs|
  log_everything(event_name, attrs)
end
```
Use global subscriptions for cross-cutting concerns: observability exporters (Langfuse, Datadog), centralized logging, metrics collection.
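To make the exact-name, wildcard, and catch-all matching concrete, here is a minimal stand-in for an event bus. It is a sketch only: `MiniEventBus` is a hypothetical class, and using `File.fnmatch` for pattern matching is an assumption made for illustration, not a description of how `DSPy.events` matches patterns internally.

```ruby
# Hypothetical stand-in for an event bus; not DSPy.rb's implementation.
class MiniEventBus
  def initialize
    @listeners = {} # subscription id => [pattern, block]
    @next_id = 0
  end

  # Registers a block for an exact name or a glob such as 'llm.*' or '*'.
  def subscribe(pattern, &block)
    id = (@next_id += 1)
    @listeners[id] = [pattern, block]
    id
  end

  # Returns true when a listener was actually removed.
  def unsubscribe(id)
    !@listeners.delete(id).nil?
  end

  # Calls every listener whose pattern matches the event name.
  def emit(event_name, attrs = {})
    @listeners.each_value do |pattern, block|
      block.call(event_name, attrs) if File.fnmatch(pattern, event_name)
    end
  end
end

bus = MiniEventBus.new
seen = []
bus.subscribe('llm.*') { |name, _attrs| seen << [:llm, name] }
all_id = bus.subscribe('*') { |name, _attrs| seen << [:all, name] }

bus.emit('llm.generate', total_tokens: 200)
bus.emit('lm.tokens', total_tokens: 50)
bus.unsubscribe(all_id)
bus.emit('llm.stream')
```

In this sketch `'llm.*'` does not match `'lm.tokens'`, so a listener meant to cover both namespaces needs two subscriptions or the catch-all pattern.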
### Module-Scoped Subscriptions
Declare listeners inside a `DSPy::Module` subclass. Subscriptions automatically scope to the module instance and its descendants:
```ruby
class ResearchReport < DSPy::Module
  subscribe 'lm.tokens', :track_tokens, scope: :descendants

  def initialize
    super
    @outliner = DSPy::Predict.new(OutlineSignature)
    @writer = DSPy::Predict.new(SectionWriterSignature)
    @token_count = 0
  end

  def forward(question:)
    outline = @outliner.call(question: question)
    outline.sections.map do |title|
      draft = @writer.call(question: question, section_title: title)
      { title: title, body: draft.paragraph }
    end
  end

  def track_tokens(_event, attrs)
    @token_count += attrs.fetch(:total_tokens, 0)
  end
end
```
The `scope:` parameter accepts:
- `:descendants` (default) -- receives events from the module **and** every nested module invoked inside it.
- `DSPy::Module::SubcriptionScope::SelfOnly` -- restricts delivery to events emitted by the module instance itself; ignores descendants.
Inspect active subscriptions with `registered_module_subscriptions`. Tear down with `unsubscribe_module_events`.
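The difference between the two scopes can be sketched as a predicate over an event's module path. This is an illustration under stated assumptions, not DSPy.rb's delivery code: `deliver_to?` and the `:self_only` symbol are hypothetical, and the path entries reuse the `{ id:, class: }` shape that events carry in their module stack metadata.

```ruby
# Hypothetical helper: decides whether a listener owned by `owner_id`
# should receive an event, given the event's module path (root to leaf).
def deliver_to?(scope, owner_id, module_path)
  ids = module_path.map { |entry| entry[:id] }
  case scope
  when :descendants then ids.include?(owner_id) # owner, or anything nested inside it
  when :self_only   then ids.last == owner_id   # only events the owner itself emitted
  else raise ArgumentError, "unknown scope: #{scope.inspect}"
  end
end

# An event emitted by a predictor running inside a report module.
path = [
  { id: "report", class: "ResearchReport" },
  { id: "writer", class: "DSPy::Predict" }
]

puts deliver_to?(:descendants, "report", path)
puts deliver_to?(:self_only, "report", path)
```

Under these assumptions the report module hears its nested predictor's events only with the descendants scope.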
### Unsubscribe and Cleanup
Remove a global listener by subscription ID:
```ruby
id = DSPy.events.subscribe('llm.*') { |name, attrs| }
DSPy.events.unsubscribe(id)
```
Build tracker classes that manage their own subscription lifecycle:
```ruby
class TokenBudgetTracker
  def initialize(budget:)
    @budget = budget
    @usage = 0
    @subscriptions = []
    @subscriptions << DSPy.events.subscribe('lm.tokens') do |_event, attrs|
      @usage += attrs.fetch(:total_tokens, 0)
      warn("Budget hit") if @usage >= @budget
    end
  end

  def unsubscribe
    @subscriptions.each { |id| DSPy.events.unsubscribe(id) }
    @subscriptions.clear
  end
end
```
### Clearing Listeners in Tests
Call `DSPy.events.clear_listeners` in `before`/`after` blocks to prevent cross-contamination between test cases:
```ruby
RSpec.configure do |config|
  config.after(:each) { DSPy.events.clear_listeners }
end
```
## dspy-o11y Gems
Three gems compose the observability stack:
| Gem | Purpose |
|---|---|
| `dspy` | Core event bus (`DSPy.event`, `DSPy.events`) -- always available |
| `dspy-o11y` | OpenTelemetry spans, `AsyncSpanProcessor`, `DSPy::Context.with_span` helpers |
| `dspy-o11y-langfuse` | Langfuse adapter -- configures OTLP exporter targeting Langfuse endpoints |
### Installation
```ruby
# Gemfile
gem 'dspy'
gem 'dspy-o11y' # core spans + helpers
gem 'dspy-o11y-langfuse' # Langfuse/OpenTelemetry adapter (optional)
```
If the optional gems are absent, DSPy falls back to logging-only mode with no errors.
## Langfuse Integration
### Environment Variables
```bash
# Required
export LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
export LANGFUSE_SECRET_KEY=sk-lf-your-secret-key
# Optional (defaults to https://cloud.langfuse.com)
export LANGFUSE_HOST=https://us.cloud.langfuse.com
# Tuning (optional)
export DSPY_TELEMETRY_BATCH_SIZE=100 # spans per export batch (default 100)
export DSPY_TELEMETRY_QUEUE_SIZE=1000 # max queued spans (default 1000)
export DSPY_TELEMETRY_EXPORT_INTERVAL=60 # seconds between timed exports (default 60)
export DSPY_TELEMETRY_SHUTDOWN_TIMEOUT=10 # seconds to drain on shutdown (default 10)
```
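As a rough illustration of how the tuning variables and their documented defaults combine, the sketch below resolves each setting from an environment-like hash. `telemetry_settings` and `TELEMETRY_DEFAULTS` are hypothetical names, and parsing every value as an integer is an assumption; the gem's own parsing may differ.

```ruby
# Defaults copied from the comments in the variable list above.
TELEMETRY_DEFAULTS = {
  "DSPY_TELEMETRY_BATCH_SIZE" => 100,
  "DSPY_TELEMETRY_QUEUE_SIZE" => 1000,
  "DSPY_TELEMETRY_EXPORT_INTERVAL" => 60,
  "DSPY_TELEMETRY_SHUTDOWN_TIMEOUT" => 10
}.freeze

# Hypothetical helper: resolves each setting from `env`, falling back to its default.
# Assumption: all four values are integers.
def telemetry_settings(env = ENV)
  TELEMETRY_DEFAULTS.to_h do |name, default|
    [name, Integer(env.fetch(name, default))]
  end
end

settings = telemetry_settings("DSPY_TELEMETRY_BATCH_SIZE" => "250")
puts settings["DSPY_TELEMETRY_BATCH_SIZE"]
puts settings["DSPY_TELEMETRY_QUEUE_SIZE"]
```

Passing the environment as an argument keeps the helper easy to exercise in tests without mutating `ENV`.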
### Automatic Configuration
`DSPy::Observability.configure!` must run once at boot. When the Langfuse environment variables are present, `require 'dspy'` already calls it, so no explicit call is needed:
```ruby
require 'dspy'
# If LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set,
# DSPy::Observability.configure! runs automatically and:
# 1. Configures the OpenTelemetry SDK with an OTLP exporter
# 2. Creates dual output: structured logs AND OpenTelemetry spans
# 3. Exports spans to Langfuse using proper authentication
# 4. Falls back gracefully if gems are missing
```
Verify status with `DSPy::Observability.enabled?`.
### Automatic Tracing
With observability enabled, every `DSPy::Module#forward` call, LM request, and tool invocation creates properly nested spans. Langfuse receives hierarchical traces:
```
Trace: abc-123-def
+-- ChainOfThought.forward [2000ms] (observation type: chain)
    +-- llm.generate [1000ms] (observation type: generation)
        Model: gpt-4-0613
        Tokens: 100 in / 50 out / 150 total
```
DSPy maps module classes to Langfuse observation types automatically via `DSPy::ObservationType.for_module_class`:
| Module | Observation Type |
|---|---|
| `DSPy::LM` (raw chat) | `generation` |
| `DSPy::ChainOfThought` | `chain` |
| `DSPy::ReAct` | `agent` |
| Tool invocations | `tool` |
| Memory/retrieval | `retriever` |
| Embedding engines | `embedding` |
| Evaluation modules | `evaluator` |
| Generic operations | `span` |
## Score Reporting
### DSPy.score API
Report evaluation scores with `DSPy.score`:
```ruby
# Numeric (default)
DSPy.score('accuracy', 0.95)
# With comment
DSPy.score('relevance', 0.87, comment: 'High semantic similarity')
# Boolean
DSPy.score('is_valid', 1, data_type: DSPy::Scores::DataType::Boolean)
# Categorical
DSPy.score('sentiment', 'positive', data_type: DSPy::Scores::DataType::Categorical)
# Explicit trace binding
DSPy.score('accuracy', 0.95, trace_id: 'custom-trace-id')
```
Available data types: `DSPy::Scores::DataType::Numeric`, `::Boolean`, `::Categorical`.
### score.create Events
Every `DSPy.score` call emits a `'score.create'` event. Subscribe to react:
```ruby
DSPy.events.subscribe('score.create') do |event_name, attrs|
  puts "#{attrs[:score_name]} = #{attrs[:score_value]}"
  # Also available: attrs[:score_id], attrs[:score_data_type],
  # attrs[:score_comment], attrs[:trace_id], attrs[:observation_id],
  # attrs[:timestamp]
end
```
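A listener like the one above can feed a small in-memory aggregate. The following is a sketch, not part of DSPy.rb: `ScoreAggregator` is a hypothetical class, and only the `:score_name` and `:score_value` attribute keys are taken from the event description.

```ruby
# Hypothetical aggregator; only the attribute keys come from the 'score.create' event.
class ScoreAggregator
  def initialize
    @values = Hash.new { |hash, name| hash[name] = [] }
  end

  # Accepts the same (event_name, attrs) pair a 'score.create' listener receives.
  # Non-numeric values (for example categorical scores) are skipped.
  def record(_event_name, attrs)
    value = attrs[:score_value]
    @values[attrs[:score_name]] << value if value.is_a?(Numeric)
  end

  # Mean of the recorded numeric values, or nil when nothing was recorded.
  def mean(name)
    values = @values.fetch(name, [])
    return nil if values.empty?

    values.sum.to_f / values.size
  end
end

aggregator = ScoreAggregator.new
aggregator.record('score.create', score_name: 'accuracy', score_value: 1.0)
aggregator.record('score.create', score_name: 'accuracy', score_value: 0.5)
aggregator.record('score.create', score_name: 'sentiment', score_value: 'positive')
puts aggregator.mean('accuracy')

# Possible wiring (requires the dspy gem, so it is left as a comment):
# DSPy.events.subscribe('score.create', &aggregator.method(:record))
```

Keeping `record` free of any DSPy dependency means the aggregation logic can be unit-tested with hand-built attribute hashes.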
### Async Langfuse Export with DSPy::Scores::Exporter
Configure the exporter to send scores to Langfuse in the background:
```ruby
exporter = DSPy::Scores::Exporter.configure(
  public_key: ENV['LANGFUSE_PUBLIC_KEY'],
  secret_key: ENV['LANGFUSE_SECRET_KEY'],
  host: 'https://cloud.langfuse.com'
)
# Scores are now exported automatically via a background Thread::Queue
DSPy.score('accuracy', 0.95)
# Shut down gracefully (waits up to 5 seconds by default)
exporter.shutdown
```
The exporter subscribes to `'score.create'` events internally, queues them for async processing, and retries with exponential backoff on failure.
### Automatic Export with DSPy::Evals
Pass `export_scores: true` to `DSPy::Evals` to export per-example scores and an aggregate batch score automatically:
```ruby
evaluator = DSPy::Evals.new(
  program,
  metric: my_metric,
  export_scores: true,
  score_name: 'qa_accuracy'
)
result = evaluator.evaluate(test_examples)
```
## DSPy::Context.with_span
Create manual spans for custom operations. Requires `dspy-o11y`.
```ruby
DSPy::Context.with_span(operation: 'custom.retrieval', 'retrieval.source' => 'pinecone') do |span|
  results = pinecone_client.query(embedding)
  span&.set_attribute('retrieval.count', results.size) # span is nil when observability is disabled
  results
end
```
Pass semantic attributes as keyword arguments alongside `operation:`. The block receives an OpenTelemetry span object (or `nil` when observability is disabled). The span automatically nests under the current parent span and records `duration.ms`, `langfuse.observation.startTime`, and `langfuse.observation.endTime`.
Assign a Langfuse observation type to custom spans:
```ruby
DSPy::Context.with_span(
  operation: 'evaluate.batch',
  **DSPy::ObservationType::Evaluator.langfuse_attributes,
  'batch.size' => examples.length
) do |span|
  run_evaluation(examples)
end
```
Scores reported inside a `with_span` block automatically inherit the current trace context.
## Module Stack Metadata
When `DSPy::Module#forward` runs, the context layer maintains a module stack. Every event includes:
```ruby
{
  module_path: [
    { id: "root_uuid", class: "DeepSearch", label: nil },
    { id: "planner_uuid", class: "DSPy::Predict", label: "planner" }
  ],
  module_root: { id: "root_uuid", class: "DeepSearch", label: nil },
  module_leaf: { id: "planner_uuid", class: "DSPy::Predict", label: "planner" },
  module_scope: {
    ancestry_token: "root_uuid>planner_uuid",
    depth: 2
  }
}
```
| Key | Meaning |
|---|---|
| `module_path` | Ordered array of `{id, class, label}` entries from root to leaf |
| `module_root` | The outermost module in the current call chain |
| `module_leaf` | The innermost (currently executing) module |
| `module_scope.ancestry_token` | Stable string of joined UUIDs representing the nesting path |
| `module_scope.depth` | Integer depth of the current module in the stack |
Labels are set via `module_scope_label=` on a module instance or derived automatically from named predictors. Use this metadata to power Langfuse filters, scoped metrics, or custom event routing.
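The relationship between `module_path` and `module_scope` can be sketched with two small helpers. Both `module_scope_for` and `under_root?` are hypothetical names; they rely only on the metadata shape documented above and do not reproduce DSPy.rb's context layer.

```ruby
# Hypothetical helper: derives the scope summary from a root-to-leaf module path.
def module_scope_for(module_path)
  {
    ancestry_token: module_path.map { |entry| entry[:id] }.join(">"),
    depth: module_path.length
  }
end

# Hypothetical routing predicate: true when the event's outermost module has the given id.
def under_root?(attrs, root_id)
  attrs.dig(:module_root, :id) == root_id
end

module_path = [
  { id: "root_uuid", class: "DeepSearch", label: nil },
  { id: "planner_uuid", class: "DSPy::Predict", label: "planner" }
]
scope = module_scope_for(module_path)
puts scope[:ancestry_token]

attrs = { module_root: module_path.first, module_leaf: module_path.last }
puts under_root?(attrs, "root_uuid")
```

A predicate like `under_root?` is one way a global listener could route events to per-agent metrics without subscribing inside each module.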
## Dedicated Export Worker
The `DSPy::Observability::AsyncSpanProcessor` (from `dspy-o11y`) keeps telemetry export off the hot path:
- Runs on a `Concurrent::SingleThreadExecutor` -- LLM workflows never compete with OTLP networking.
- Buffers finished spans in a `Thread::Queue` (max size configurable via `DSPY_TELEMETRY_QUEUE_SIZE`).
- Drains spans in batches of `DSPY_TELEMETRY_BATCH_SIZE` (default 100). When the queue reaches batch size, an immediate async export fires.
- A background timer thread triggers periodic export every `DSPY_TELEMETRY_EXPORT_INTERVAL` seconds (default 60).
- Applies exponential backoff (`0.1 * 2^attempt` seconds) on export failures, up to `DEFAULT_MAX_RETRIES` (3).
- On shutdown, flushes all remaining spans within `DSPY_TELEMETRY_SHUTDOWN_TIMEOUT` seconds, then terminates the executor.
- Drops the oldest span when the queue is full, logging `'observability.span_dropped'`.
No application code interacts with the processor directly. Configure it entirely through environment variables.
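Two of the behaviours listed above, the backoff schedule and the drop-oldest policy, can be sketched in a few lines. This is an illustration only: `backoff_delays` and `DropOldestBuffer` are hypothetical, attempts are assumed to be numbered from 0, and the buffer is deliberately not thread-safe, unlike a real `Thread::Queue`.

```ruby
# Delay before each retry, following the 0.1 * 2^attempt formula.
# Assumption: attempts are numbered from 0.
def backoff_delays(max_retries: 3, base: 0.1)
  (0...max_retries).map { |attempt| base * (2**attempt) }
end

# Hypothetical bounded buffer that discards the oldest entry when full.
# Single-threaded sketch; a real processor would need synchronization.
class DropOldestBuffer
  attr_reader :dropped

  def initialize(max_size)
    @max_size = max_size
    @items = []
    @dropped = 0
  end

  def push(item)
    if @items.size >= @max_size
      @items.shift
      @dropped += 1
    end
    @items << item
  end

  # Removes and returns up to `batch_size` items, oldest first.
  def drain(batch_size)
    @items.shift(batch_size)
  end
end

buffer = DropOldestBuffer.new(2)
%w[span-1 span-2 span-3].each { |span| buffer.push(span) }
batch = buffer.drain(100)
puts batch.inspect
puts backoff_delays.inspect
```

Dropping the oldest entry favours recent telemetry when the exporter falls behind, at the cost of gaps in older traces.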
## Built-in Events Reference
| Event Name | Emitted By | Key Attributes |
|---|---|---|
| `lm.tokens` | `DSPy::LM` | `gen_ai.system`, `gen_ai.request.model`, `input_tokens`, `output_tokens`, `total_tokens` |
| `chain_of_thought.reasoning_complete` | `DSPy::ChainOfThought` | `dspy.signature`, `cot.reasoning_steps`, `cot.reasoning_length`, `cot.has_reasoning` |
| `react.iteration_complete` | `DSPy::ReAct` | `iteration`, `thought`, `action`, `observation` |
| `codeact.iteration_complete` | `dspy-code_act` gem | `iteration`, `code_executed`, `execution_result` |
| `optimization.trial_complete` | Teleprompters (MIPROv2) | `trial_number`, `score` |
| `score.create` | `DSPy.score` | `score_name`, `score_value`, `score_data_type`, `trace_id` |
| `span.start` | `DSPy::Context.with_span` | `trace_id`, `span_id`, `parent_span_id`, `operation` |
## Best Practices
- Use dot-separated string names for events. Follow OpenTelemetry `gen_ai.*` conventions for LLM attributes.
- Always call `unsubscribe` (or `unsubscribe_module_events` for scoped subscriptions) when a tracker is no longer needed to prevent memory leaks.
- Call `DSPy.events.clear_listeners` in test teardown to avoid cross-contamination.
- Wrap risky listener logic in a rescue block. The event system isolates listener failures, but explicit rescue prevents silent swallowing of domain errors.
- Prefer module-scoped `subscribe` for agent internals. Reserve global `DSPy.events.subscribe` for infrastructure-level concerns.


@@ -1,338 +1,418 @@
# DSPy.rb LLM Providers
## Supported Providers ## Adapter Architecture
DSPy.rb provides unified support across multiple LLM providers through adapter gems that automatically load when installed. DSPy.rb ships provider SDKs as separate adapter gems. Install only the adapters the project needs. Each adapter gem depends on the official SDK for its provider and auto-loads when present -- no explicit `require` necessary.
### Provider Overview
- **OpenAI**: GPT-4, GPT-4o, GPT-4o-mini, GPT-3.5-turbo
- **Anthropic**: Claude 3 family (Sonnet, Opus, Haiku), Claude 3.5 Sonnet
- **Google Gemini**: Gemini 1.5 Pro, Gemini 1.5 Flash, other versions
- **Ollama**: Local model support via OpenAI compatibility layer
- **OpenRouter**: Unified multi-provider API for 200+ models
## Configuration
### Basic Setup
```ruby ```ruby
require 'dspy' # Gemfile
gem 'dspy' # core framework (no provider SDKs)
DSPy.configure do |c| gem 'dspy-openai' # OpenAI, OpenRouter, Ollama
c.lm = DSPy::LM.new('provider/model-name', api_key: ENV['API_KEY']) gem 'dspy-anthropic' # Claude
end gem 'dspy-gemini' # Gemini
gem 'dspy-ruby_llm' # RubyLLM unified adapter (12+ providers)
``` ```
### OpenAI Configuration ---
**Required gem**: `dspy-openai` ## Per-Provider Adapters
### dspy-openai
Covers any endpoint that speaks the OpenAI chat-completions protocol: OpenAI itself, OpenRouter, and Ollama.
**SDK dependency:** `openai ~> 0.17`
```ruby ```ruby
DSPy.configure do |c| # OpenAI
# GPT-4o Mini (recommended for development) lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# GPT-4o (more capable) # OpenRouter -- access 200+ models behind a single key
c.lm = DSPy::LM.new('openai/gpt-4o', api_key: ENV['OPENAI_API_KEY']) lm = DSPy::LM.new('openrouter/x-ai/grok-4-fast:free',
api_key: ENV['OPENROUTER_API_KEY']
)
# GPT-4 Turbo # Ollama -- local models, no API key required
c.lm = DSPy::LM.new('openai/gpt-4-turbo', api_key: ENV['OPENAI_API_KEY']) lm = DSPy::LM.new('ollama/llama3.2')
end
```
**Environment variable**: `OPENAI_API_KEY` # Remote Ollama instance
lm = DSPy::LM.new('ollama/llama3.2',
### Anthropic Configuration base_url: 'https://my-ollama.example.com/v1',
api_key: 'optional-auth-token'
**Required gem**: `dspy-anthropic`
```ruby
DSPy.configure do |c|
# Claude 3.5 Sonnet (latest, most capable)
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Opus (most capable in Claude 3 family)
c.lm = DSPy::LM.new('anthropic/claude-3-opus-20240229',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Sonnet (balanced)
c.lm = DSPy::LM.new('anthropic/claude-3-sonnet-20240229',
api_key: ENV['ANTHROPIC_API_KEY'])
# Claude 3 Haiku (fast, cost-effective)
c.lm = DSPy::LM.new('anthropic/claude-3-haiku-20240307',
api_key: ENV['ANTHROPIC_API_KEY'])
end
```
**Environment variable**: `ANTHROPIC_API_KEY`
### Google Gemini Configuration
**Required gem**: `dspy-gemini`
```ruby
DSPy.configure do |c|
# Gemini 1.5 Pro (most capable)
c.lm = DSPy::LM.new('gemini/gemini-1.5-pro',
api_key: ENV['GOOGLE_API_KEY'])
# Gemini 1.5 Flash (faster, cost-effective)
c.lm = DSPy::LM.new('gemini/gemini-1.5-flash',
api_key: ENV['GOOGLE_API_KEY'])
end
```
**Environment variable**: `GOOGLE_API_KEY` or `GEMINI_API_KEY`
### Ollama Configuration
**Required gem**: None (uses OpenAI compatibility layer)
```ruby
DSPy.configure do |c|
# Local Ollama instance
c.lm = DSPy::LM.new('ollama/llama3.1',
base_url: 'http://localhost:11434')
# Other Ollama models
c.lm = DSPy::LM.new('ollama/mistral')
c.lm = DSPy::LM.new('ollama/codellama')
end
```
**Note**: Ensure Ollama is running locally: `ollama serve`
### OpenRouter Configuration
**Required gem**: `dspy-openai` (uses OpenAI adapter)
```ruby
DSPy.configure do |c|
# Access 200+ models through OpenRouter
c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
api_key: ENV['OPENROUTER_API_KEY'],
base_url: 'https://openrouter.ai/api/v1')
# Other examples
c.lm = DSPy::LM.new('openrouter/google/gemini-pro')
c.lm = DSPy::LM.new('openrouter/meta-llama/llama-3.1-70b-instruct')
end
```
**Environment variable**: `OPENROUTER_API_KEY`
## Provider Compatibility Matrix
### Feature Support
| Feature | OpenAI | Anthropic | Gemini | Ollama |
|---------|--------|-----------|--------|--------|
| Structured Output | ✅ | ✅ | ✅ | ✅ |
| Vision (Images) | ✅ | ✅ | ✅ | ⚠️ Limited |
| Image URLs | ✅ | ❌ | ❌ | ❌ |
| Tool Calling | ✅ | ✅ | ✅ | Varies |
| Streaming | ❌ | ❌ | ❌ | ❌ |
| Function Calling | ✅ | ✅ | ✅ | Varies |
**Legend**: ✅ Full support | ⚠️ Partial support | ❌ Not supported
### Vision Capabilities
**Image URLs**: Only OpenAI supports direct URL references. For other providers, load images as base64 or from files.
```ruby
# OpenAI - supports URLs
DSPy::Image.from_url("https://example.com/image.jpg")
# Anthropic, Gemini - use file or base64
DSPy::Image.from_file("path/to/image.jpg")
DSPy::Image.from_base64(base64_data, mime_type: "image/jpeg")
```
**Ollama**: Limited multimodal functionality. Check specific model capabilities.
## Advanced Configuration
### Custom Parameters
Pass provider-specific parameters during configuration:
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o',
api_key: ENV['OPENAI_API_KEY'],
temperature: 0.7,
max_tokens: 2000,
top_p: 0.9
)
end
```
### Multiple Providers
Use different models for different tasks:
```ruby
# Fast model for simple tasks
fast_lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# Powerful model for complex tasks
powerful_lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022',
api_key: ENV['ANTHROPIC_API_KEY'])
# Use different models in different modules
class SimpleClassifier < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = fast_lm }
@predictor = DSPy::Predict.new(SimpleSignature)
end
end
class ComplexAnalyzer < DSPy::Module
def initialize
super
DSPy.configure { |c| c.lm = powerful_lm }
@predictor = DSPy::ChainOfThought.new(ComplexSignature)
end
end
```
### Per-Request Configuration
Override configuration for specific predictions:
```ruby
predictor = DSPy::Predict.new(MySignature)
# Use default configuration
result1 = predictor.forward(input: "data")
# Override temperature for this request
result2 = predictor.forward(
input: "data",
config: { temperature: 0.2 } # More deterministic
) )
``` ```
## Cost Optimization All three sub-adapters share the same request handling, structured-output support, and error reporting. Swap providers without changing higher-level DSPy code.
### Model Selection Strategy For OpenRouter models that lack native structured-output support, disable it explicitly:
1. **Development**: Use cheaper, faster models (gpt-4o-mini, claude-3-haiku, gemini-1.5-flash)
2. **Production Simple Tasks**: Continue with cheaper models if quality is sufficient
3. **Production Complex Tasks**: Upgrade to more capable models (gpt-4o, claude-3.5-sonnet, gemini-1.5-pro)
4. **Local Development**: Use Ollama for privacy and zero API costs
### Example Cost-Conscious Setup
```ruby ```ruby
# Development environment lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
if Rails.env.development? api_key: ENV['OPENROUTER_API_KEY'],
DSPy.configure do |c| structured_outputs: false
c.lm = DSPy::LM.new('ollama/llama3.1') # Free, local )
end ```
elsif Rails.env.test?
DSPy.configure do |c| ### dspy-anthropic
c.lm = DSPy::LM.new('openai/gpt-4o-mini', # Cheap for testing
api_key: ENV['OPENAI_API_KEY']) Provides the Claude adapter. Install it for any `anthropic/*` model id.
end
else # production **SDK dependency:** `anthropic ~> 1.12`
DSPy.configure do |c|
c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022', ```ruby
api_key: ENV['ANTHROPIC_API_KEY']) lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
end api_key: ENV['ANTHROPIC_API_KEY']
)
```
Structured outputs default to tool-based JSON extraction (`structured_outputs: true`). Set `structured_outputs: false` to use enhanced-prompting extraction instead.
```ruby
# Tool-based extraction (default, most reliable)
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
api_key: ENV['ANTHROPIC_API_KEY'],
structured_outputs: true
)
# Enhanced prompting extraction
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
api_key: ENV['ANTHROPIC_API_KEY'],
structured_outputs: false
)
```
### dspy-gemini
Provides the Gemini adapter. Install it for any `gemini/*` model id.
**SDK dependency:** `gemini-ai ~> 4.3`
```ruby
lm = DSPy::LM.new('gemini/gemini-2.5-flash',
api_key: ENV['GEMINI_API_KEY']
)
```
**Environment variable:** `GEMINI_API_KEY` (also accepts `GOOGLE_API_KEY`).
---
## RubyLLM Unified Adapter
The `dspy-ruby_llm` gem provides a single adapter that routes to 12+ providers through [RubyLLM](https://rubyllm.com). Use it when a project talks to multiple providers or needs access to Bedrock, VertexAI, DeepSeek, or Mistral without dedicated adapter gems.
**SDK dependency:** `ruby_llm ~> 1.3`
### Model ID Format
Prefix every model id with `ruby_llm/`:
```ruby
lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514')
lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash')
```
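The model ids in this file share an `adapter/model` shape, where everything after the first slash belongs to the model name. The helper below sketches that split; `split_model_id` is hypothetical and is not how `DSPy::LM` is documented to parse ids.

```ruby
# Hypothetical helper: splits "adapter/model" on the first slash only,
# so model names that contain slashes stay intact.
def split_model_id(model_id)
  adapter, model = model_id.split("/", 2)
  if model.nil? || model.empty?
    raise ArgumentError, "expected 'adapter/model', got #{model_id.inspect}"
  end

  [adapter, model]
end

puts split_model_id("ruby_llm/gpt-4o-mini").inspect
puts split_model_id("openrouter/x-ai/grok-4-fast:free").inspect
```

Splitting with a limit of 2 matters for ids such as `openrouter/x-ai/grok-4-fast:free`, where the remainder itself contains a slash.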
The adapter detects the provider from RubyLLM's model registry automatically. For models not in the registry, pass `provider:` explicitly:
```ruby
lm = DSPy::LM.new('ruby_llm/llama3.2', provider: 'ollama')
lm = DSPy::LM.new('ruby_llm/anthropic/claude-3-opus',
api_key: ENV['OPENROUTER_API_KEY'],
provider: 'openrouter'
)
```
### Using Existing RubyLLM Configuration
When RubyLLM is already configured globally, omit the `api_key:` argument. DSPy reuses the global config automatically:
```ruby
RubyLLM.configure do |config|
config.openai_api_key = ENV['OPENAI_API_KEY']
config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
end
# No api_key needed -- picks up the global config
DSPy.configure do |c|
c.lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
end end
``` ```
## Provider-Specific Best Practices When an `api_key:` (or any of `base_url:`, `timeout:`, `max_retries:`) is passed, DSPy creates a **scoped context** instead of reusing the global config.
### OpenAI
- Use `gpt-4o-mini` for development and simple tasks
- Use `gpt-4o` for production complex tasks
- Best vision support including URL loading
- Excellent function calling capabilities
### Anthropic
- Claude 3.5 Sonnet is currently the most capable model
- Excellent for complex reasoning and analysis
- Strong safety features and helpful outputs
- Requires base64 for images (no URL support)
### Google Gemini
- Gemini 1.5 Pro for complex tasks, Flash for speed
- Strong multimodal capabilities
- Good balance of cost and performance
- Requires base64 for images
### Ollama
- Best for privacy-sensitive applications
- Zero API costs
- Requires local hardware resources
- Limited multimodal support depending on model
- Good for development and testing
## Troubleshooting
### API Key Issues
```ruby
# Verify API key is set
if ENV['OPENAI_API_KEY'].nil?
  raise "OPENAI_API_KEY environment variable not set"
end
# Test connection
begin
  DSPy.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini',
    api_key: ENV['OPENAI_API_KEY']) }
  predictor = DSPy::Predict.new(TestSignature)
  predictor.forward(test: "data")
  puts "✅ Connection successful"
rescue => e
  puts "❌ Connection failed: #{e.message}"
end
```
### Rate Limiting
Handle rate limits gracefully:
```ruby
def call_with_retry(predictor, input, max_retries: 3)
  retries = 0
  begin
    predictor.forward(input)
  rescue RateLimitError => e
    retries += 1
    if retries < max_retries
      sleep(2 ** retries) # Exponential backoff
      retry
    else
      raise
    end
  end
end
```
### Model Not Found
Ensure the correct gem is installed:
```bash
# For OpenAI
gem install dspy-openai
# For Anthropic
gem install dspy-anthropic
# For Gemini
gem install dspy-gemini
```
### Cloud-Hosted Providers (Bedrock, VertexAI)
Configure RubyLLM globally first, then reference the model:
```ruby
# AWS Bedrock
RubyLLM.configure do |c|
  c.bedrock_api_key = ENV['AWS_ACCESS_KEY_ID']
  c.bedrock_secret_key = ENV['AWS_SECRET_ACCESS_KEY']
  c.bedrock_region = 'us-east-1'
end
lm = DSPy::LM.new('ruby_llm/anthropic.claude-3-5-sonnet', provider: 'bedrock')
# Google VertexAI
RubyLLM.configure do |c|
  c.vertexai_project_id = 'your-project-id'
  c.vertexai_location = 'us-central1'
end
lm = DSPy::LM.new('ruby_llm/gemini-pro', provider: 'vertexai')
```
### Supported Providers Table
| Provider | Example Model ID | Notes |
|-------------|--------------------------------------------|---------------------------------|
| OpenAI | `ruby_llm/gpt-4o-mini` | Auto-detected from registry |
| Anthropic | `ruby_llm/claude-sonnet-4-20250514` | Auto-detected from registry |
| Gemini | `ruby_llm/gemini-2.5-flash` | Auto-detected from registry |
| DeepSeek | `ruby_llm/deepseek-chat` | Auto-detected from registry |
| Mistral | `ruby_llm/mistral-large` | Auto-detected from registry |
| Ollama | `ruby_llm/llama3.2` | Use `provider: 'ollama'` |
| AWS Bedrock | `ruby_llm/anthropic.claude-3-5-sonnet` | Configure RubyLLM globally |
| VertexAI | `ruby_llm/gemini-pro` | Configure RubyLLM globally |
| OpenRouter | `ruby_llm/anthropic/claude-3-opus` | Use `provider: 'openrouter'` |
| Perplexity | `ruby_llm/llama-3.1-sonar-large` | Use `provider: 'perplexity'` |
| GPUStack | `ruby_llm/model-name` | Use `provider: 'gpustack'` |
---
## Rails Initializer Pattern
Configure DSPy inside an `after_initialize` block so Rails credentials and environment are fully loaded:
```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
  return if Rails.env.test? # skip in test -- use VCR cassettes instead
  DSPy.configure do |config|
    config.lm = DSPy::LM.new(
      'openai/gpt-4o-mini',
      api_key: Rails.application.credentials.openai_api_key,
      structured_outputs: true
    )
    config.logger = if Rails.env.production?
      Dry.Logger(:dspy, formatter: :json) do |logger|
        logger.add_backend(stream: Rails.root.join("log/dspy.log"))
      end
    else
      Dry.Logger(:dspy) do |logger|
        logger.add_backend(level: :debug, stream: $stdout)
      end
    end
  end
end
```
Key points:
- Wrap in `after_initialize` so `Rails.application.credentials` is available.
- Return early in the test environment. Rely on VCR cassettes for deterministic LLM responses.
- Set `structured_outputs: true` (the default) for provider-native JSON extraction.
- Use `Dry.Logger` with `:json` formatter in production for structured log parsing.
---
## Fiber-Local LM Context
`DSPy.with_lm` sets a temporary language-model override scoped to the current Fiber. Every predictor call inside the block uses the override; outside the block the previous LM takes effect again.
```ruby
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
powerful = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
classifier = Classifier.new
# Uses the global LM
result = classifier.call(text: "Hello")
# Temporarily switch to the fast model
DSPy.with_lm(fast) do
result = classifier.call(text: "Hello") # uses gpt-4o-mini
end
# Temporarily switch to the powerful model
DSPy.with_lm(powerful) do
result = classifier.call(text: "Hello") # uses claude-sonnet-4
end
```
### LM Resolution Hierarchy
DSPy resolves the active language model in this order:
1. **Instance-level LM** -- set directly on a module instance via `configure`
2. **Fiber-local LM** -- set via `DSPy.with_lm`
3. **Global LM** -- set via `DSPy.configure`
Instance-level configuration always wins, even inside a `DSPy.with_lm` block:
```ruby
classifier = Classifier.new
classifier.configure { |c| c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY']) }
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
DSPy.with_lm(fast) do
classifier.call(text: "Test") # still uses claude-sonnet-4 (instance-level wins)
end
```
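The resolution order above can be illustrated without the gem. The following is a minimal sketch in plain Ruby, assuming a hypothetical `LMResolutionSketch` module and string placeholders instead of real `DSPy::LM` objects; it is not DSPy.rb's actual implementation, only the lookup order it describes.

```ruby
# Hypothetical stand-in for the resolution order; not DSPy.rb's real internals.
module LMResolutionSketch
  class << self
    attr_accessor :global_lm

    # Scope an override to the current Fiber (Thread.current[] is fiber-local in Ruby)
    # and restore the previous value when the block exits.
    def with_lm(lm)
      previous = Thread.current[:sketch_lm]
      Thread.current[:sketch_lm] = lm
      yield
    ensure
      Thread.current[:sketch_lm] = previous
    end

    # Instance-level LM wins, then the fiber-local override, then the global default.
    def resolve(instance_lm = nil)
      instance_lm || Thread.current[:sketch_lm] || global_lm
    end
  end
end

LMResolutionSketch.global_lm = "global-model"

inside  = LMResolutionSketch.with_lm("fast-model") { LMResolutionSketch.resolve }
pinned  = LMResolutionSketch.with_lm("fast-model") { LMResolutionSketch.resolve("instance-model") }
outside = LMResolutionSketch.resolve
```

Here `inside` resolves to the block override, `pinned` keeps its instance-level value, and `outside` falls back to the global default once the block has exited.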
### configure_predictor for Fine-Grained Agent Control
Complex agents (`ReAct`, `CodeAct`, `DeepResearch`, `DeepSearch`) contain internal predictors. Use `configure` for a blanket override and `configure_predictor` to target a specific sub-predictor:
```ruby
agent = DSPy::ReAct.new(MySignature, tools: tools)
# Set a default LM for the agent and all its children
agent.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) }
# Override just the reasoning predictor with a more capable model
agent.configure_predictor('thought_generator') do |c|
c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
end
result = agent.call(question: "Summarize the report")
```
Both methods support chaining:
```ruby
agent
.configure { |c| c.lm = cheap_model }
.configure_predictor('thought_generator') { |c| c.lm = expensive_model }
```
#### Available Predictors by Agent Type
| Agent | Internal Predictors |
|----------------------|------------------------------------------------------------------|
| `DSPy::ReAct` | `thought_generator`, `observation_processor` |
| `DSPy::CodeAct` | `code_generator`, `observation_processor` |
| `DSPy::DeepResearch` | `planner`, `synthesizer`, `qa_reviewer`, `reporter` |
| `DSPy::DeepSearch` | `seed_predictor`, `search_predictor`, `reader_predictor`, `reason_predictor` |
#### Propagation Rules
- Configuration propagates recursively to children and grandchildren.
- Children with an already-configured LM are **not** overwritten by a later parent `configure` call.
- Configure the parent first, then override specific children.
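The propagation rules can be sketched with a small tree of plain Ruby objects. `SketchModule` and the string model names below are hypothetical; the sketch assumes that a child counts as "already configured" as soon as it holds any LM, which is one reading of the rules above rather than a confirmed detail of DSPy.rb.

```ruby
# Hypothetical node type; DSPy.rb modules are only assumed to behave along these lines.
class SketchModule
  attr_reader :name, :lm, :children

  def initialize(name, children = [])
    @name = name
    @children = children
    @lm = nil
  end

  # Set this node's LM, then push it down to descendants that have none yet.
  def configure(lm)
    @lm = lm
    children.each { |child| child.inherit(lm) }
    self # returning self is what makes chaining possible
  end

  # Target one direct child by name.
  def configure_predictor(child_name, lm)
    children.find { |child| child.name == child_name }&.configure(lm)
    self
  end

  protected

  def inherit(lm)
    return if @lm # a child that already has an LM keeps it

    @lm = lm
    children.each { |child| child.inherit(lm) }
  end
end

thought  = SketchModule.new("thought_generator")
observer = SketchModule.new("observation_processor")
reporter = SketchModule.new("reporter")
agent    = SketchModule.new("agent", [thought, observer, reporter])

thought.configure("expensive-model") # configured before the parent
agent.configure("cheap-model").configure_predictor("observation_processor", "mid-model")
```

After these calls `thought` still holds its own model, `reporter` has inherited the parent's, and `observer` carries the targeted override.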
---
## Feature-Flagged Model Selection
Use a `FeatureFlags` module backed by ENV vars to centralize model selection. Each tool or agent reads its model from the flags, falling back to a global default.
```ruby
module FeatureFlags
  module_function

  def default_model
    ENV.fetch('DSPY_DEFAULT_MODEL', 'openai/gpt-4o-mini')
  end

  def default_api_key
    ENV.fetch('DSPY_DEFAULT_API_KEY') { ENV.fetch('OPENAI_API_KEY', nil) }
  end

  def model_for(tool_name)
    env_key = "DSPY_MODEL_#{tool_name.upcase}"
    ENV.fetch(env_key, default_model)
  end

  def api_key_for(tool_name)
    env_key = "DSPY_API_KEY_#{tool_name.upcase}"
    ENV.fetch(env_key, default_api_key)
  end
end
```
### Per-Tool Model Override
Override an individual tool's model without touching application code:
```bash
# .env
DSPY_DEFAULT_MODEL=openai/gpt-4o-mini
DSPY_DEFAULT_API_KEY=sk-...
# Override the classifier to use Claude
DSPY_MODEL_CLASSIFIER=anthropic/claude-sonnet-4-20250514
DSPY_API_KEY_CLASSIFIER=sk-ant-...
# Override the summarizer to use Gemini
DSPY_MODEL_SUMMARIZER=gemini/gemini-2.5-flash
DSPY_API_KEY_SUMMARIZER=...
```
Wire each agent to its flag at initialization:
```ruby
class ClassifierAgent < DSPy::Module
  def initialize
    super
    model = FeatureFlags.model_for('classifier')
    api_key = FeatureFlags.api_key_for('classifier')
    @predictor = DSPy::Predict.new(ClassifySignature)
    configure { |c| c.lm = DSPy::LM.new(model, api_key: api_key) }
  end

  def forward(text:)
    @predictor.call(text: text)
  end
end
```
This pattern keeps model routing declarative and avoids scattering `DSPy::LM.new` calls across the codebase.
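To see the fallback chain in action without touching the real environment, the same lookup rules can be run against an injected hash. `FlagResolver` is a hypothetical test helper, not part of the skill or of DSPy.rb, and the key values are placeholders.

```ruby
# Same lookup rules as the FeatureFlags module, but reading from an injected hash
# so the fallbacks can be exercised without mutating ENV.
class FlagResolver
  def initialize(env)
    @env = env
  end

  def model_for(tool_name)
    @env.fetch("DSPY_MODEL_#{tool_name.upcase}") do
      @env.fetch("DSPY_DEFAULT_MODEL", "openai/gpt-4o-mini")
    end
  end

  def api_key_for(tool_name)
    @env.fetch("DSPY_API_KEY_#{tool_name.upcase}") do
      @env.fetch("DSPY_DEFAULT_API_KEY") { @env["OPENAI_API_KEY"] }
    end
  end
end

flags = FlagResolver.new({
  "DSPY_DEFAULT_API_KEY"    => "sk-default",
  "DSPY_MODEL_CLASSIFIER"   => "anthropic/claude-sonnet-4-20250514",
  "DSPY_API_KEY_CLASSIFIER" => "sk-ant-example"
})

classifier_model = flags.model_for("classifier") # per-tool override
summarizer_model = flags.model_for("summarizer") # falls back to the built-in default
summarizer_key   = flags.api_key_for("summarizer") # falls back to the default key
```

Injecting the hash keeps specs deterministic; in application code the module reading from `ENV` stays the single entry point.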
---
## Compatibility Matrix
Feature support across direct adapter gems. All features listed assume `structured_outputs: true` (the default).
| Feature | OpenAI | Anthropic | Gemini | Ollama | OpenRouter | RubyLLM |
|----------------------|--------|-----------|--------|----------|------------|-------------|
| Structured Output | Native JSON mode | Tool-based extraction | Native JSON schema | OpenAI-compatible JSON | Varies by model | Via `with_schema` |
| Vision (Images) | File + URL | File + Base64 | File + Base64 | Limited | Varies | Delegates to underlying provider |
| Image URLs | Yes | No | No | No | Varies | Depends on provider |
| Tool Calling | Yes | Yes | Yes | Varies | Varies | Yes |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes |
**Notes:**
- **Structured Output** is enabled by default on every adapter. Set `structured_outputs: false` to fall back to enhanced-prompting extraction.
- **Vision / Image URLs:** Only OpenAI supports passing a URL directly. For Anthropic and Gemini, load images from file or Base64:
```ruby
DSPy::Image.from_url("https://example.com/img.jpg") # OpenAI only
DSPy::Image.from_file("path/to/image.jpg") # all providers
DSPy::Image.from_base64(data, mime_type: "image/jpeg") # all providers
```
- **RubyLLM** delegates to the underlying provider, so feature support matches the provider column in the table.
### Choosing an Adapter Strategy
| Scenario | Recommended Adapter |
|-------------------------------------------|--------------------------------|
| Single provider (OpenAI, Claude, or Gemini) | Dedicated gem (`dspy-openai`, `dspy-anthropic`, `dspy-gemini`) |
| Multi-provider with per-agent model routing | `dspy-ruby_llm` |
| AWS Bedrock or Google VertexAI | `dspy-ruby_llm` |
| Local development with Ollama | `dspy-openai` (Ollama sub-adapter) or `dspy-ruby_llm` |
| OpenRouter for cost optimization | `dspy-openai` (OpenRouter sub-adapter) |
### Current Recommended Models
| Provider | Model ID | Use Case |
|-----------|---------------------------------------|-----------------------|
| OpenAI | `openai/gpt-4o-mini` | Fast, cost-effective |
| Anthropic | `anthropic/claude-sonnet-4-20250514` | Balanced reasoning |
| Gemini | `gemini/gemini-2.5-flash` | Fast, cost-effective |
| Ollama | `ollama/llama3.2` | Local, zero API cost |

@@ -0,0 +1,502 @@
# DSPy.rb Toolsets
## Tools::Base
`DSPy::Tools::Base` is the base class for single-purpose tools. Each subclass exposes one operation to an LLM agent through a `call` method.
### Defining a Tool
Set the tool's identity with the `tool_name` and `tool_description` class-level DSL methods. Define the `call` instance method with a Sorbet `sig` declaration so DSPy.rb can generate the JSON schema the LLM uses to invoke the tool.
```ruby
class WeatherLookup < DSPy::Tools::Base
  extend T::Sig

  tool_name "weather_lookup"
  tool_description "Look up current weather for a given city"

  sig { params(city: String, units: T.nilable(String)).returns(String) }
  def call(city:, units: nil)
    # Fetch weather data and return a string summary
    "72F and sunny in #{city}"
  end
end
```
Key points:
- Inherit from `DSPy::Tools::Base`, not `DSPy::Tool`.
- Use `tool_name` (class method) to set the name the LLM sees. Without it, the class name is lowercased as a fallback.
- Use `tool_description` (class method) to set the human-readable description surfaced in the tool schema.
- Define `call` with **keyword arguments**. Positional arguments are supported, but keyword arguments produce better schemas.
- Always attach a Sorbet `sig` to `call`. Without a signature, the generated schema has empty properties and the LLM cannot determine parameter types.
### Schema Generation
`call_schema_object` introspects the Sorbet signature on `call` and returns a hash representing the JSON Schema `parameters` object:
```ruby
WeatherLookup.call_schema_object
# => {
# type: "object",
# properties: {
# city: { type: "string", description: "Parameter city" },
# units: { type: "string", description: "Parameter units (optional)" }
# },
# required: ["city"]
# }
```
`call_schema` wraps this in the full LLM tool-calling format:
```ruby
WeatherLookup.call_schema
# => {
# type: "function",
# function: {
# name: "call",
# description: "Call the WeatherLookup tool",
# parameters: { ... }
# }
# }
```
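The shape of the generated `parameters` object can be reproduced with plain Ruby reflection. This is only an illustrative sketch: DSPy.rb reads types from the Sorbet signature, whereas the hypothetical `sketch_schema` helper below takes them from an explicit hash and uses `Method#parameters` to tell required keywords from optional ones.

```ruby
# Hypothetical stand-in for signature-driven schema generation.
TYPE_MAP = { String => "string", Integer => "integer", Float => "number" }.freeze

def sketch_schema(method_obj, types)
  properties = {}
  required = []

  method_obj.parameters.each do |kind, name|
    next unless %i[keyreq key].include?(kind) # keyword parameters only

    optional = kind == :key # keywords with a default value are optional
    description = "Parameter #{name}"
    description += " (optional)" if optional

    properties[name] = { type: TYPE_MAP.fetch(types.fetch(name)), description: description }
    required << name.to_s unless optional
  end

  { type: "object", properties: properties, required: required }
end

class WeatherSketch
  def call(city:, units: nil)
    "72F and sunny in #{city}"
  end
end

schema = sketch_schema(WeatherSketch.instance_method(:call), { city: String, units: String })
```

Because `units` has a default value it is left out of `required`, which mirrors how nilable parameters with a `nil` default are treated.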
### Using Tools with ReAct
Pass tool instances in an array to `DSPy::ReAct`:
```ruby
agent = DSPy::ReAct.new(
MySignature,
tools: [WeatherLookup.new, AnotherTool.new]
)
result = agent.call(question: "What is the weather in Berlin?")
puts result.answer
```
Access output fields with dot notation (`result.answer`), not hash access (`result[:answer]`).
---
## Tools::Toolset
`DSPy::Tools::Toolset` groups multiple related methods into a single class. Each exposed method becomes an independent tool from the LLM's perspective.
### Defining a Toolset
```ruby
class DatabaseToolset < DSPy::Tools::Toolset
  extend T::Sig

  toolset_name "db"

  tool :query, description: "Run a read-only SQL query"
  tool :insert, description: "Insert a record into a table"
  tool :delete, description: "Delete a record by ID"

  sig { params(sql: String).returns(String) }
  def query(sql:)
    # Execute read query
  end

  sig { params(table: String, data: T::Hash[String, String]).returns(String) }
  def insert(table:, data:)
    # Insert record
  end

  sig { params(table: String, id: Integer).returns(String) }
  def delete(table:, id:)
    # Delete record
  end
end
```
### DSL Methods
**`toolset_name(name)`** -- Set the prefix for all generated tool names. If omitted, the class name minus `Toolset` suffix is lowercased (e.g., `DatabaseToolset` becomes `database`).
```ruby
toolset_name "db"
# tool :query produces a tool named "db_query"
```
**`tool(method_name, tool_name:, description:)`** -- Expose a method as a tool.
- `method_name` (Symbol, required) -- the instance method to expose.
- `tool_name:` (String, optional) -- override the default `<toolset_name>_<method_name>` naming.
- `description:` (String, optional) -- description shown to the LLM. Defaults to a humanized version of the method name.
```ruby
tool :word_count, tool_name: "text_wc", description: "Count lines, words, and characters"
# Produces a tool named "text_wc" instead of "text_word_count"
```
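The naming rules described for `toolset_name` and `tool` reduce to two small string operations. The helpers below are hypothetical and only restate the documented defaults (class name minus the `Toolset` suffix, lowercased, then `<prefix>_<method>` unless `tool_name:` is given); the real implementation may differ in edge cases.

```ruby
# Hypothetical helpers restating the documented naming defaults.
def default_toolset_prefix(class_name)
  class_name.sub(/Toolset\z/, "").downcase
end

def exposed_tool_name(prefix, method_name, tool_name: nil)
  tool_name || "#{prefix}_#{method_name}"
end

implicit_prefix = default_toolset_prefix("DatabaseToolset")
default_name    = exposed_tool_name("db", :query)
overridden_name = exposed_tool_name("text", :word_count, tool_name: "text_wc")
```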
### Converting to a Tool Array
Call `to_tools` on the class (not an instance) to get an array of `ToolProxy` objects compatible with `DSPy::Tools::Base`:
```ruby
agent = DSPy::ReAct.new(
AnalyzeText,
tools: DatabaseToolset.to_tools
)
```
Each `ToolProxy` wraps one method, delegates `call` to the underlying toolset instance, and generates its own JSON schema from the method's Sorbet signature.
### Shared State
All tool proxies from a single `to_tools` call share one toolset instance. Store shared state (connections, caches, configuration) in the toolset's `initialize`:
```ruby
class ApiToolset < DSPy::Tools::Toolset
  extend T::Sig

  toolset_name "api"

  tool :get, description: "Make a GET request"
  tool :post, description: "Make a POST request"

  sig { params(base_url: String).void }
  def initialize(base_url:)
    @base_url = base_url
    @client = HTTP.persistent(base_url)
  end

  sig { params(path: String).returns(String) }
  def get(path:)
    @client.get("#{@base_url}#{path}").body.to_s
  end

  sig { params(path: String, body: String).returns(String) }
  def post(path:, body:)
    @client.post("#{@base_url}#{path}", body: body).body.to_s
  end
end
```
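The sharing behavior can be demonstrated with a toy toolset. `CounterToolsetSketch` and `ProxySketch` are hypothetical stand-ins written in plain Ruby; the sketch assumes `to_tools` builds one instance with a no-argument `new` and hands it to every proxy, which is how the text describes the real classes.

```ruby
# Hypothetical stand-ins for a toolset and its tool proxies.
class CounterToolsetSketch
  ProxySketch = Struct.new(:target, :meth) do
    # Delegate the call to the shared toolset instance.
    def call(**kwargs)
      target.public_send(meth, **kwargs)
    end
  end

  def self.to_tools
    instance = new # one instance per to_tools call
    %i[increment current].map { |meth| ProxySketch.new(instance, meth) }
  end

  def initialize
    @count = 0
  end

  def increment(by:)
    @count += by
  end

  def current
    @count
  end
end

increment_tool, current_tool = CounterToolsetSketch.to_tools
increment_tool.call(by: 2)
increment_tool.call(by: 3)
shared_total = current_tool.call # sees both increments through the shared instance

_, fresh_current = CounterToolsetSketch.to_tools
fresh_total = fresh_current.call # a second to_tools call starts from a new instance
```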
---
## Type Safety
Sorbet signatures on tool methods drive both JSON schema generation and automatic type coercion of LLM responses.
### Basic Types
```ruby
sig { params(
text: String,
count: Integer,
score: Float,
enabled: T::Boolean,
threshold: Numeric
).returns(String) }
def analyze(text:, count:, score:, enabled:, threshold:)
# ...
end
```
| Sorbet Type | JSON Schema |
|------------------|----------------------------------------------------|
| `String` | `{"type": "string"}` |
| `Integer` | `{"type": "integer"}` |
| `Float` | `{"type": "number"}` |
| `Numeric` | `{"type": "number"}` |
| `T::Boolean` | `{"type": "boolean"}` |
| `T::Enum` | `{"type": "string", "enum": [...]}` |
| `T::Struct` | `{"type": "object", "properties": {...}}` |
| `T::Array[Type]` | `{"type": "array", "items": {...}}` |
| `T::Hash[K, V]` | `{"type": "object", "additionalProperties": {...}}`|
| `T.nilable(Type)`| `{"type": [original, "null"]}` |
| `T.any(T1, T2)` | `{"oneOf": [{...}, {...}]}` |
| `T.class_of(X)` | `{"type": "string"}` |
### T::Enum Parameters
Define a `T::Enum` and reference it in a tool signature. DSPy.rb generates a JSON Schema `enum` constraint and automatically deserializes the LLM's string response into the correct enum instance.
```ruby
class Priority < T::Enum
enums do
Low = new('low')
Medium = new('medium')
High = new('high')
Critical = new('critical')
end
end
class Status < T::Enum
enums do
Pending = new('pending')
InProgress = new('in-progress')
Completed = new('completed')
end
end
sig { params(priority: Priority, status: Status).returns(String) }
def update_task(priority:, status:)
"Updated to #{priority.serialize} / #{status.serialize}"
end
```
The generated schema constrains the parameter to valid values:
```json
{
"priority": {
"type": "string",
"enum": ["low", "medium", "high", "critical"]
}
}
```
**Case-insensitive matching**: When the LLM returns `"HIGH"` or `"High"` instead of `"high"`, DSPy.rb first tries an exact `try_deserialize`, then falls back to a case-insensitive lookup. This prevents failures caused by LLM casing variations.
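The two-step lookup can be sketched without Sorbet. The example below uses a frozen list of strings in place of a `T::Enum` and a hypothetical `coerce_priority` helper; it shows only the exact-then-case-insensitive order described above, not DSPy.rb's own coercion code.

```ruby
# Plain-string stand-in for the serialized values of a Priority enum.
PRIORITY_VALUES = %w[low medium high critical].freeze

def coerce_priority(raw)
  # Step 1: exact match, the equivalent of try_deserialize.
  exact = PRIORITY_VALUES.find { |value| value == raw }
  return exact if exact

  # Step 2: case-insensitive fallback for casing drift in LLM output.
  PRIORITY_VALUES.find { |value| value.casecmp?(raw) }
end

exact_match   = coerce_priority("high")
shouted_match = coerce_priority("HIGH")
no_match      = coerce_priority("urgent")
```

A value outside the enum still resolves to `nil` here, so callers need to decide whether that should raise or fall through.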
### T::Struct Parameters
Use `T::Struct` for complex nested objects. DSPy.rb generates nested JSON Schema properties and recursively coerces the LLM's hash response into struct instances.
```ruby
class TaskMetadata < T::Struct
prop :id, String
prop :priority, Priority
prop :tags, T::Array[String]
prop :estimated_hours, T.nilable(Float), default: nil
end
class TaskRequest < T::Struct
prop :title, String
prop :description, String
prop :status, Status
prop :metadata, TaskMetadata
prop :assignees, T::Array[String]
end
sig { params(task: TaskRequest).returns(String) }
def create_task(task:)
"Created: #{task.title} (#{task.status.serialize})"
end
```
The LLM sees the full nested object schema and DSPy.rb reconstructs the struct tree from the JSON response, including enum fields inside nested structs.
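Recursive coercion of a nested hash can be illustrated with core `Struct` classes. Everything named below (`MetadataSketch`, `TaskSketch`, `NESTED_FIELDS`, `coerce_struct`) is hypothetical, and the explicit nesting table stands in for the type information DSPy.rb would read from `T::Struct` props.

```ruby
MetadataSketch = Struct.new(:id, :tags, keyword_init: true)
TaskSketch     = Struct.new(:title, :metadata, keyword_init: true)

# Which fields of which struct hold another struct (assumed to come from prop types).
NESTED_FIELDS = { TaskSketch => { metadata: MetadataSketch } }.freeze

def coerce_struct(klass, hash)
  nested = NESTED_FIELDS.fetch(klass, {})
  attrs = hash.each_with_object({}) do |(key, value), acc|
    name = key.to_sym # JSON keys arrive as strings
    acc[name] = nested.key?(name) ? coerce_struct(nested[name], value) : value
  end
  klass.new(**attrs)
end

payload = { "title" => "Ship release", "metadata" => { "id" => "t-1", "tags" => ["ops"] } }
task = coerce_struct(TaskSketch, payload)
```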
### Nilable Parameters
Mark optional parameters with `T.nilable(...)` and provide a default value of `nil` in the method signature. These parameters are excluded from the JSON Schema `required` array.
```ruby
sig { params(
query: String,
max_results: T.nilable(Integer),
filter: T.nilable(String)
).returns(String) }
def search(query:, max_results: nil, filter: nil)
# query is required; max_results and filter are optional
end
```
### Collections
Typed arrays and hashes generate precise item/value schemas:
```ruby
sig { params(
tags: T::Array[String],
priorities: T::Array[Priority],
config: T::Hash[String, T.any(String, Integer, Float)]
).returns(String) }
def configure(tags:, priorities:, config:)
# Array elements and hash values are validated and coerced
end
```
### Union Types
`T.any(...)` generates a `oneOf` JSON Schema. When one of the union members is a `T::Struct`, DSPy.rb uses the `_type` discriminator field to select the correct struct class during coercion.
```ruby
sig { params(value: T.any(String, Integer, Float)).returns(String) }
def handle_flexible(value:)
# Accepts multiple types
end
```
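Discriminator-based selection for struct unions can be sketched as a lookup table. The struct names, the `ACTION_TYPES` registry, and the assumption that `_type` carries the class name are all hypothetical; only the existence of a `_type` field comes from the description above.

```ruby
SearchAction = Struct.new(:query, keyword_init: true)
FinishAction = Struct.new(:answer, keyword_init: true)

# Assumed mapping from the _type value to the struct class.
ACTION_TYPES = { "SearchAction" => SearchAction, "FinishAction" => FinishAction }.freeze

def coerce_union(hash)
  type_name = hash["_type"]
  klass = ACTION_TYPES.fetch(type_name) do
    raise ArgumentError, "unknown _type: #{type_name.inspect}"
  end
  # Drop the discriminator and symbolize the remaining keys before building the struct.
  attrs = hash.reject { |key, _| key == "_type" }.transform_keys(&:to_sym)
  klass.new(**attrs)
end

finish = coerce_union({ "_type" => "FinishAction", "answer" => "42" })
search = coerce_union({ "_type" => "SearchAction", "query" => "weather in Berlin" })
```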
---
## Built-in Toolsets
### TextProcessingToolset
`DSPy::Tools::TextProcessingToolset` provides Unix-style text analysis and manipulation operations. Toolset name prefix: `text`.
| Tool Name | Method | Description |
|-----------------------------------|-------------------|--------------------------------------------|
| `text_grep` | `grep` | Search for patterns with optional case-insensitive and count-only modes |
| `text_wc` | `word_count` | Count lines, words, and characters |
| `text_rg` | `ripgrep` | Fast pattern search with context lines |
| `text_extract_lines` | `extract_lines` | Extract a range of lines by number |
| `text_filter_lines` | `filter_lines` | Keep or reject lines matching a regex |
| `text_unique_lines` | `unique_lines` | Deduplicate lines, optionally preserving order |
| `text_sort_lines` | `sort_lines` | Sort lines alphabetically or numerically |
| `text_summarize_text` | `summarize_text` | Produce a statistical summary (counts, averages, frequent words) |
Usage:
```ruby
agent = DSPy::ReAct.new(
AnalyzeText,
tools: DSPy::Tools::TextProcessingToolset.to_tools
)
result = agent.call(text: log_contents, question: "How many error lines are there?")
puts result.answer
```
### GitHubCLIToolset
`DSPy::Tools::GitHubCLIToolset` wraps the `gh` CLI for read-oriented GitHub operations. Toolset name prefix: `github`.
| Tool Name | Method | Description |
|------------------------|-------------------|---------------------------------------------------|
| `github_list_issues` | `list_issues` | List issues filtered by state, labels, assignee |
| `github_list_prs` | `list_prs` | List pull requests filtered by state, author, base|
| `github_get_issue` | `get_issue` | Retrieve details of a single issue |
| `github_get_pr` | `get_pr` | Retrieve details of a single pull request |
| `github_api_request` | `api_request` | Make an arbitrary GET request to the GitHub API |
| `github_traffic_views` | `traffic_views` | Fetch repository traffic view counts |
| `github_traffic_clones`| `traffic_clones` | Fetch repository traffic clone counts |
This toolset uses `T::Enum` parameters (`IssueState`, `PRState`, `ReviewState`) for state filters, demonstrating enum-based tool signatures in practice.
```ruby
agent = DSPy::ReAct.new(
RepoAnalysis,
tools: DSPy::Tools::GitHubCLIToolset.to_tools
)
```
---
## Testing
### Unit Testing Individual Tools
Test `DSPy::Tools::Base` subclasses by instantiating and calling `call` directly:
```ruby
RSpec.describe WeatherLookup do
subject(:tool) { described_class.new }
it "returns weather for a city" do
result = tool.call(city: "Berlin")
expect(result).to include("Berlin")
end
it "exposes the correct tool name" do
expect(tool.name).to eq("weather_lookup")
end
it "generates a valid schema" do
schema = described_class.call_schema_object
expect(schema[:required]).to include("city")
expect(schema[:properties]).to have_key(:city)
end
end
```
### Unit Testing Toolsets
Test toolset methods directly on an instance. Verify tool generation with `to_tools`:
```ruby
RSpec.describe DatabaseToolset do
subject(:toolset) { described_class.new }
it "executes a query" do
result = toolset.query(sql: "SELECT 1")
expect(result).to be_a(String)
end
it "generates tools with correct names" do
tools = described_class.to_tools
names = tools.map(&:name)
expect(names).to contain_exactly("db_query", "db_insert", "db_delete")
end
it "generates tool descriptions" do
tools = described_class.to_tools
query_tool = tools.find { |t| t.name == "db_query" }
expect(query_tool.description).to eq("Run a read-only SQL query")
end
end
```
### Mocking Predictions Inside Tools
When a tool calls a DSPy predictor internally, stub the predictor to isolate tool logic from LLM calls:
```ruby
class SmartSearchTool < DSPy::Tools::Base
  extend T::Sig

  tool_name "smart_search"
  tool_description "Search with query expansion"

  sig { void }
  def initialize
    @expander = DSPy::Predict.new(QueryExpansionSignature)
  end

  sig { params(query: String).returns(String) }
  def call(query:)
    expanded = @expander.call(query: query)
    perform_search(expanded.expanded_query)
  end

  private

  def perform_search(query)
    # actual search logic
  end
end

RSpec.describe SmartSearchTool do
  subject(:tool) { described_class.new }

  before do
    expansion_result = double("result", expanded_query: "expanded test query")
    allow_any_instance_of(DSPy::Predict).to receive(:call).and_return(expansion_result)
  end

  it "expands the query before searching" do
    allow(tool).to receive(:perform_search).with("expanded test query").and_return("found 3 results")
    result = tool.call(query: "test")
    expect(result).to eq("found 3 results")
  end
end
```
### Testing Enum Coercion
Verify that string values from LLM responses deserialize into the correct enum instances:
```ruby
RSpec.describe "enum coercion" do
  it "handles case-insensitive enum values" do
    toolset = GitHubCLIToolset.new
    # The LLM may return "OPEN" instead of "open". Coercion happens before the
    # method runs, so the method itself always receives an enum instance.
    result = toolset.list_issues(state: IssueState::Open)
    expect(result).to be_a(String)
  end
end
```
---
## Constraints
- Exposed tool methods should use **keyword arguments**. Positional-only parameters still generate schemas, but keyword arguments produce more reliable LLM interactions.
- Each exposed method becomes a **separate, independent tool**. Method chaining or multi-step sequences within a single tool call are not supported.
- Shared state across tool proxies is scoped to a single `to_tools` call. Separate `to_tools` invocations create separate toolset instances.
- Methods without a Sorbet `sig` produce an empty parameter schema. The LLM will not know what arguments to pass.