# DSPy.rb LLM Providers

## Adapter Architecture

DSPy.rb ships provider SDKs as separate adapter gems. Install only the adapters the project needs. Each adapter gem depends on the official SDK for its provider and auto-loads when present -- no explicit `require` necessary.

```ruby
# Gemfile
gem 'dspy'           # core framework (no provider SDKs)
gem 'dspy-openai'    # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic' # Claude
gem 'dspy-gemini'    # Gemini
gem 'dspy-ruby_llm'  # RubyLLM unified adapter (12+ providers)
```

---
## Per-Provider Adapters

### dspy-openai

Covers any endpoint that speaks the OpenAI chat-completions protocol: OpenAI itself, OpenRouter, and Ollama.

**SDK dependency:** `openai ~> 0.17`

```ruby
# OpenAI
lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

# OpenRouter -- access 200+ models behind a single key
lm = DSPy::LM.new('openrouter/x-ai/grok-4-fast:free',
  api_key: ENV['OPENROUTER_API_KEY']
)

# Ollama -- local models, no API key required
lm = DSPy::LM.new('ollama/llama3.2')

# Remote Ollama instance
lm = DSPy::LM.new('ollama/llama3.2',
  base_url: 'https://my-ollama.example.com/v1',
  api_key: 'optional-auth-token'
)
```

All three sub-adapters share the same request handling, structured-output support, and error reporting. Swap providers without changing higher-level DSPy code.
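For example, a signature-driven predictor stays identical across backends -- only the model id changes. (The `Summarize` signature below is illustrative, not part of the adapter API.)

```ruby
class Summarize < DSPy::Signature
  description "Summarize the input text in one sentence."

  input do
    const :text, String
  end

  output do
    const :summary, String
  end
end

predictor = DSPy::Predict.new(Summarize)

# Same predictor against OpenAI...
DSPy.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) }
predictor.call(text: "DSPy.rb ships adapters as gems.").summary

# ...and against a local Ollama model, with no other code changes
DSPy.configure { |c| c.lm = DSPy::LM.new('ollama/llama3.2') }
predictor.call(text: "DSPy.rb ships adapters as gems.").summary
```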
For OpenRouter models that lack native structured-output support, disable it explicitly:

```ruby
lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
  api_key: ENV['OPENROUTER_API_KEY'],
  structured_outputs: false
)
```
### dspy-anthropic

Provides the Claude adapter. Install it for any `anthropic/*` model id.

**SDK dependency:** `anthropic ~> 1.12`

```ruby
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY']
)
```

Structured outputs default to tool-based JSON extraction (`structured_outputs: true`). Set `structured_outputs: false` to use enhanced-prompting extraction instead.

```ruby
# Tool-based extraction (default, most reliable)
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY'],
  structured_outputs: true
)

# Enhanced prompting extraction
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY'],
  structured_outputs: false
)
```
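Tool-based extraction returns typed output fields, so enum outputs work as well. A minimal sketch (the `Sentiment` enum and `ClassifySentiment` signature are illustrative, not part of the adapter):

```ruby
class Sentiment < T::Enum
  enums do
    Positive = new('positive')
    Negative = new('negative')
    Neutral  = new('neutral')
  end
end

class ClassifySentiment < DSPy::Signature
  description "Classify the sentiment of the text."

  input do
    const :text, String
  end

  output do
    const :sentiment, Sentiment
  end
end

DSPy.configure do |c|
  c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
    api_key: ENV['ANTHROPIC_API_KEY']
  )
end

result = DSPy::Predict.new(ClassifySentiment).call(text: "I love this!")
result.sentiment # => Sentiment::Positive
```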
### dspy-gemini

Provides the Gemini adapter. Install it for any `gemini/*` model id.

**SDK dependency:** `gemini-ai ~> 4.3`

```ruby
lm = DSPy::LM.new('gemini/gemini-2.5-flash',
  api_key: ENV['GEMINI_API_KEY']
)
```

**Environment variable:** `GEMINI_API_KEY` (also accepts `GOOGLE_API_KEY`).

---
## RubyLLM Unified Adapter

The `dspy-ruby_llm` gem provides a single adapter that routes to 12+ providers through [RubyLLM](https://rubyllm.com). Use it when a project talks to multiple providers or needs access to Bedrock, VertexAI, DeepSeek, or Mistral without dedicated adapter gems.

**SDK dependency:** `ruby_llm ~> 1.3`

### Model ID Format

Prefix every model id with `ruby_llm/`:

```ruby
lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514')
lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash')
```

The adapter detects the provider from RubyLLM's model registry automatically. For models not in the registry, pass `provider:` explicitly:

```ruby
lm = DSPy::LM.new('ruby_llm/llama3.2', provider: 'ollama')
lm = DSPy::LM.new('ruby_llm/anthropic/claude-3-opus',
  api_key: ENV['OPENROUTER_API_KEY'],
  provider: 'openrouter'
)
```
### Using Existing RubyLLM Configuration

When RubyLLM is already configured globally, omit the `api_key:` argument. DSPy reuses the global config automatically:

```ruby
RubyLLM.configure do |config|
  config.openai_api_key = ENV['OPENAI_API_KEY']
  config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
end

# No api_key needed -- picks up the global config
DSPy.configure do |c|
  c.lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
end
```

When an `api_key:` (or any of `base_url:`, `timeout:`, `max_retries:`) is passed, DSPy creates a **scoped context** instead of reusing the global config.
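For example, the following LM gets its own scoped context while the global RubyLLM configuration stays untouched (the second API key is hypothetical):

```ruby
lm = DSPy::LM.new('ruby_llm/gpt-4o-mini',
  api_key: ENV['OPENAI_TEAM_B_KEY'], # hypothetical per-team key
  timeout: 30,
  max_retries: 2
)
```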
### Cloud-Hosted Providers (Bedrock, VertexAI)

Configure RubyLLM globally first, then reference the model:

```ruby
# AWS Bedrock
RubyLLM.configure do |c|
  c.bedrock_api_key = ENV['AWS_ACCESS_KEY_ID']
  c.bedrock_secret_key = ENV['AWS_SECRET_ACCESS_KEY']
  c.bedrock_region = 'us-east-1'
end

lm = DSPy::LM.new('ruby_llm/anthropic.claude-3-5-sonnet', provider: 'bedrock')

# Google VertexAI
RubyLLM.configure do |c|
  c.vertexai_project_id = 'your-project-id'
  c.vertexai_location = 'us-central1'
end

lm = DSPy::LM.new('ruby_llm/gemini-pro', provider: 'vertexai')
```
### Supported Providers Table

| Provider    | Example Model ID                       | Notes                        |
|-------------|----------------------------------------|------------------------------|
| OpenAI      | `ruby_llm/gpt-4o-mini`                 | Auto-detected from registry  |
| Anthropic   | `ruby_llm/claude-sonnet-4-20250514`    | Auto-detected from registry  |
| Gemini      | `ruby_llm/gemini-2.5-flash`            | Auto-detected from registry  |
| DeepSeek    | `ruby_llm/deepseek-chat`               | Auto-detected from registry  |
| Mistral     | `ruby_llm/mistral-large`               | Auto-detected from registry  |
| Ollama      | `ruby_llm/llama3.2`                    | Use `provider: 'ollama'`     |
| AWS Bedrock | `ruby_llm/anthropic.claude-3-5-sonnet` | Configure RubyLLM globally   |
| VertexAI    | `ruby_llm/gemini-pro`                  | Configure RubyLLM globally   |
| OpenRouter  | `ruby_llm/anthropic/claude-3-opus`     | Use `provider: 'openrouter'` |
| Perplexity  | `ruby_llm/llama-3.1-sonar-large`       | Use `provider: 'perplexity'` |
| GPUStack    | `ruby_llm/model-name`                  | Use `provider: 'gpustack'`   |

---
## Rails Initializer Pattern

Configure DSPy inside an `after_initialize` block so Rails credentials and environment are fully loaded:

```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
  next if Rails.env.test? # skip in test -- use VCR cassettes instead

  DSPy.configure do |config|
    config.lm = DSPy::LM.new(
      'openai/gpt-4o-mini',
      api_key: Rails.application.credentials.openai_api_key,
      structured_outputs: true
    )

    config.logger = if Rails.env.production?
      Dry.Logger(:dspy, formatter: :json) do |logger|
        logger.add_backend(stream: Rails.root.join("log/dspy.log"))
      end
    else
      Dry.Logger(:dspy) do |logger|
        logger.add_backend(level: :debug, stream: $stdout)
      end
    end
  end
end
```

Key points:

- Wrap in `after_initialize` so `Rails.application.credentials` is available.
- Skip the test environment with `next`, not `return` -- `return` inside the block raises `LocalJumpError` when Rails invokes it after the initializer file has finished. Rely on VCR cassettes for deterministic LLM responses, as sketched below.
- Set `structured_outputs: true` (the default) for provider-native JSON extraction.
- Use `Dry.Logger` with the `:json` formatter in production for structured log parsing.

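A minimal VCR setup for the test environment might look like this (assuming the `vcr` and `webmock` gems; the file path and filtered key are illustrative):

```ruby
# spec/support/vcr.rb
VCR.configure do |config|
  config.cassette_library_dir = 'spec/cassettes'
  config.hook_into :webmock
  config.configure_rspec_metadata!
  # Keep real API keys out of recorded cassettes
  config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
end
```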
---

## Fiber-Local LM Context

`DSPy.with_lm` sets a temporary language-model override scoped to the current Fiber. Every predictor call inside the block uses the override; outside the block the previous LM takes effect again.

```ruby
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
powerful = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])

classifier = Classifier.new

# Uses the global LM
result = classifier.call(text: "Hello")

# Temporarily switch to the fast model
DSPy.with_lm(fast) do
  result = classifier.call(text: "Hello") # uses gpt-4o-mini
end

# Temporarily switch to the powerful model
DSPy.with_lm(powerful) do
  result = classifier.call(text: "Hello") # uses claude-sonnet-4
end
```
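Because the override is Fiber-local, concurrent requests running on separate fibers can each pin their own model without interfering. Building on the `fast`, `powerful`, and `classifier` objects above, a per-request routing helper might look like this (the tier check is illustrative):

```ruby
# A lambda captures the surrounding locals, so the example stays runnable.
classify_for = lambda do |text, tier|
  lm = tier == :premium ? powerful : fast
  DSPy.with_lm(lm) do
    classifier.call(text: text) # each fiber sees only its own override
  end
end

classify_for.call("Hello", :premium) # uses claude-sonnet-4
classify_for.call("Hello", :basic)   # uses gpt-4o-mini
```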
### LM Resolution Hierarchy

DSPy resolves the active language model in this order:

1. **Instance-level LM** -- set directly on a module instance via `configure`
2. **Fiber-local LM** -- set via `DSPy.with_lm`
3. **Global LM** -- set via `DSPy.configure`

Instance-level configuration always wins, even inside a `DSPy.with_lm` block:

```ruby
classifier = Classifier.new
classifier.configure { |c| c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY']) }

fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

DSPy.with_lm(fast) do
  classifier.call(text: "Test") # still uses claude-sonnet-4 (instance-level wins)
end
```
### configure_predictor for Fine-Grained Agent Control

Complex agents (`ReAct`, `CodeAct`, `DeepResearch`, `DeepSearch`) contain internal predictors. Use `configure` for a blanket override and `configure_predictor` to target a specific sub-predictor:

```ruby
agent = DSPy::ReAct.new(MySignature, tools: tools)

# Set a default LM for the agent and all its children
agent.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) }

# Override just the reasoning predictor with a more capable model
agent.configure_predictor('thought_generator') do |c|
  c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
end

result = agent.call(question: "Summarize the report")
```

Both methods support chaining:

```ruby
agent
  .configure { |c| c.lm = cheap_model }
  .configure_predictor('thought_generator') { |c| c.lm = expensive_model }
```
#### Available Predictors by Agent Type

| Agent                | Internal Predictors                                                          |
|----------------------|------------------------------------------------------------------------------|
| `DSPy::ReAct`        | `thought_generator`, `observation_processor`                                 |
| `DSPy::CodeAct`      | `code_generator`, `observation_processor`                                    |
| `DSPy::DeepResearch` | `planner`, `synthesizer`, `qa_reviewer`, `reporter`                          |
| `DSPy::DeepSearch`   | `seed_predictor`, `search_predictor`, `reader_predictor`, `reason_predictor` |

#### Propagation Rules

- Configuration propagates recursively to children and grandchildren.
- Children with an already-configured LM are **not** overwritten by a later parent `configure` call (see the sketch below).
- Configure the parent first, then override specific children.

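A minimal illustration of the second rule, reusing `agent`, `cheap_model`, and `expensive_model` from the chaining example above:

```ruby
# The child is configured first, so the later parent-level
# configure does not clobber its LM.
agent.configure_predictor('thought_generator') { |c| c.lm = expensive_model }
agent.configure { |c| c.lm = cheap_model }
# => thought_generator keeps expensive_model; all other predictors get cheap_model
```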
---

## Feature-Flagged Model Selection

Use a `FeatureFlags` module backed by ENV vars to centralize model selection. Each tool or agent reads its model from the flags, falling back to a global default.

```ruby
module FeatureFlags
  module_function

  def default_model
    ENV.fetch('DSPY_DEFAULT_MODEL', 'openai/gpt-4o-mini')
  end

  def default_api_key
    ENV.fetch('DSPY_DEFAULT_API_KEY') { ENV.fetch('OPENAI_API_KEY', nil) }
  end

  def model_for(tool_name)
    env_key = "DSPY_MODEL_#{tool_name.upcase}"
    ENV.fetch(env_key, default_model)
  end

  def api_key_for(tool_name)
    env_key = "DSPY_API_KEY_#{tool_name.upcase}"
    ENV.fetch(env_key, default_api_key)
  end
end
```
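A hypothetical helper then turns a flag lookup into a ready-to-use LM:

```ruby
# Illustrative convenience wrapper, not part of DSPy.rb
def lm_for(tool_name)
  DSPy::LM.new(
    FeatureFlags.model_for(tool_name),
    api_key: FeatureFlags.api_key_for(tool_name)
  )
end

lm_for('classifier') # honors DSPY_MODEL_CLASSIFIER, falls back to the default
```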
### Per-Tool Model Override

Override an individual tool's model without touching application code:

```bash
# .env
DSPY_DEFAULT_MODEL=openai/gpt-4o-mini
DSPY_DEFAULT_API_KEY=sk-...

# Override the classifier to use Claude
DSPY_MODEL_CLASSIFIER=anthropic/claude-sonnet-4-20250514
DSPY_API_KEY_CLASSIFIER=sk-ant-...

# Override the summarizer to use Gemini
DSPY_MODEL_SUMMARIZER=gemini/gemini-2.5-flash
DSPY_API_KEY_SUMMARIZER=...
```

Wire each agent to its flag at initialization:

```ruby
class ClassifierAgent < DSPy::Module
  def initialize
    super
    model = FeatureFlags.model_for('classifier')
    api_key = FeatureFlags.api_key_for('classifier')

    @predictor = DSPy::Predict.new(ClassifySignature)
    configure { |c| c.lm = DSPy::LM.new(model, api_key: api_key) }
  end

  def forward(text:)
    @predictor.call(text: text)
  end
end
```

This pattern keeps model routing declarative and avoids scattering `DSPy::LM.new` calls across the codebase.

---
## Compatibility Matrix

Feature support across the adapter gems. All features listed assume `structured_outputs: true` (the default).

| Feature           | OpenAI           | Anthropic             | Gemini             | Ollama                 | OpenRouter      | RubyLLM                          |
|-------------------|------------------|-----------------------|--------------------|------------------------|-----------------|----------------------------------|
| Structured Output | Native JSON mode | Tool-based extraction | Native JSON schema | OpenAI-compatible JSON | Varies by model | Via `with_schema`                |
| Vision (Images)   | File + URL       | File + Base64         | File + Base64      | Limited                | Varies          | Delegates to underlying provider |
| Image URLs        | Yes              | No                    | No                 | No                     | Varies          | Depends on provider              |
| Tool Calling      | Yes              | Yes                   | Yes                | Varies                 | Varies          | Yes                              |
| Streaming         | Yes              | Yes                   | Yes                | Yes                    | Yes             | Yes                              |

**Notes:**

- **Structured Output** is enabled by default on every adapter. Set `structured_outputs: false` to fall back to enhanced-prompting extraction.
- **Vision / Image URLs:** Only OpenAI supports passing a URL directly. For Anthropic and Gemini, load images from file or Base64:

  ```ruby
  DSPy::Image.from_url("https://example.com/img.jpg")     # OpenAI only
  DSPy::Image.from_file("path/to/image.jpg")              # all providers
  DSPy::Image.from_base64(data, mime_type: "image/jpeg")  # all providers
  ```

- **RubyLLM** delegates to the underlying provider, so feature support matches the provider column in the table.
### Choosing an Adapter Strategy

| Scenario                                    | Recommended Adapter                                            |
|---------------------------------------------|----------------------------------------------------------------|
| Single provider (OpenAI, Claude, or Gemini) | Dedicated gem (`dspy-openai`, `dspy-anthropic`, `dspy-gemini`) |
| Multi-provider with per-agent model routing | `dspy-ruby_llm`                                                |
| AWS Bedrock or Google VertexAI              | `dspy-ruby_llm`                                                |
| Local development with Ollama               | `dspy-openai` (Ollama sub-adapter) or `dspy-ruby_llm`          |
| OpenRouter for cost optimization            | `dspy-openai` (OpenRouter sub-adapter)                         |

### Current Recommended Models

| Provider  | Model ID                             | Use Case             |
|-----------|--------------------------------------|----------------------|
| OpenAI    | `openai/gpt-4o-mini`                 | Fast, cost-effective |
| Anthropic | `anthropic/claude-sonnet-4-20250514` | Balanced reasoning   |
| Gemini    | `gemini/gemini-2.5-flash`            | Fast, cost-effective |
| Ollama    | `ollama/llama3.2`                    | Local, zero API cost |