refactor(skills): update dspy-ruby skill to DSPy.rb v0.34.3 API (#162)
Rewrite all reference files, asset templates, and SKILL.md to use current API patterns (.call(), result.field, T::Enum classes, Tools::Base). Add two new reference files (toolsets, observability) covering tools DSL, event system, and Langfuse integration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Committed via GitHub. Parent: f3b7d111f1, commit: e8f3bbcb35.
# DSPy.rb Core Concepts

## Philosophy

DSPy.rb enables developers to **program LLMs, not prompt them**. Instead of manually crafting prompts, define application requirements through code using type-safe, composable modules.

## Signatures

Signatures define the interface between application code and language models. They specify inputs, outputs, and a task description using Sorbet types for compile-time and runtime type safety.

### Structure

```ruby
class ClassifyEmail < DSPy::Signature
  description "Classify customer support emails by urgency and category"

  input do
    const :subject, String
    const :body, String
  end

  output do
    const :category, String
    const :urgency, String
  end
end
```

### Supported Types

| Type | JSON Schema | Notes |
|------|-------------|-------|
| `String` | `string` | Required string |
| `Integer` | `integer` | Whole numbers |
| `Float` | `number` | Decimal numbers |
| `T::Boolean` | `boolean` | true/false |
| `T::Array[X]` | `array` | Typed arrays |
| `T::Hash[K, V]` | `object` | Typed key-value maps |
| `T.nilable(X)` | nullable | Optional fields |
| `Date` | `string` (ISO 8601) | Auto-converted |
| `DateTime` | `string` (ISO 8601) | Preserves timezone |
| `Time` | `string` (ISO 8601) | Converted to UTC |

### Date and Time Types

Date, DateTime, and Time fields serialize to ISO 8601 strings and auto-convert back to Ruby objects on output.

```ruby
class EventScheduler < DSPy::Signature
  description "Schedule events based on requirements"

  input do
    const :start_date, Date            # ISO 8601: YYYY-MM-DD
    const :preferred_time, DateTime    # ISO 8601 with timezone
    const :deadline, Time              # Converted to UTC
    const :end_date, T.nilable(Date)   # Optional date
  end

  output do
    const :scheduled_date, Date        # String from LLM, auto-converted to Date
    const :event_datetime, DateTime    # Preserves timezone info
    const :created_at, Time            # Converted to UTC
  end
end

predictor = DSPy::Predict.new(EventScheduler)
result = predictor.call(
  start_date: "2024-01-15",
  preferred_time: "2024-01-15T10:30:45Z",
  deadline: Time.now,
  end_date: nil
)

result.scheduled_date.class  # => Date
result.event_datetime.class  # => DateTime
```

Timezone conventions follow ActiveRecord: Time objects convert to UTC, DateTime objects preserve timezone, and Date objects are timezone-agnostic.
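
These conventions can be sketched with the Ruby standard library. The helper below is illustrative only — it mirrors the documented behavior but is not DSPy.rb's actual serializer:

```ruby
require 'date'
require 'time'

# Illustrative helper mirroring the documented conventions;
# not DSPy.rb's actual serialization code.
def serialize_temporal(value)
  case value
  when DateTime then value.iso8601      # preserves timezone offset
  when Date     then value.iso8601      # YYYY-MM-DD
  when Time     then value.utc.iso8601  # normalized to UTC
  end
end

puts serialize_temporal(Date.new(2024, 1, 15))
# => 2024-01-15
puts serialize_temporal(DateTime.new(2024, 1, 15, 10, 30, 45, '+09:00'))
# => 2024-01-15T10:30:45+09:00
```

Note that `DateTime` must be matched before `Date` in the `case`, since `DateTime` subclasses `Date`.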

### Enums with T::Enum

Define constrained output values using `T::Enum` classes. Do not use the inline `T.enum([...])` syntax.

```ruby
class SentimentAnalysis < DSPy::Signature
  description "Analyze sentiment of text"

  class Sentiment < T::Enum
    enums do
      Positive = new('positive')
      Negative = new('negative')
      Neutral = new('neutral')
    end
  end

  input do
    const :text, String
  end

  output do
    const :sentiment, Sentiment
    const :confidence, Float
  end
end

predictor = DSPy::Predict.new(SentimentAnalysis)
result = predictor.call(text: "This product is amazing!")

result.sentiment            # => #<Sentiment::Positive>
result.sentiment.serialize  # => "positive"
result.confidence           # => 0.92
```

Enum matching is case-insensitive: an LLM returning `"POSITIVE"` matches `new('positive')`.
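
A minimal sketch of that case-insensitive matching, with a plain hash standing in for Sorbet's `T::Enum` (illustrative only, not DSPy.rb internals):

```ruby
# Illustrative sketch of case-insensitive enum matching; DSPy.rb performs
# this against T::Enum serialized values, simulated here with a hash.
SENTIMENTS = {
  'positive' => :Positive,
  'negative' => :Negative,
  'neutral'  => :Neutral
}.freeze

def deserialize_sentiment(raw)
  SENTIMENTS.fetch(raw.to_s.strip.downcase) do
    raise ArgumentError, "unknown sentiment: #{raw.inspect}"
  end
end

puts deserialize_sentiment('POSITIVE')   # => Positive
puts deserialize_sentiment(' Neutral ')  # => Neutral
```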

### Default Values

Default values work on both inputs and outputs. Input defaults reduce caller boilerplate; output defaults provide fallbacks when the LLM omits optional fields.

```ruby
class SmartSearch < DSPy::Signature
  description "Search with intelligent defaults"

  input do
    const :query, String
    const :max_results, Integer, default: 10
    const :language, String, default: "English"
  end

  output do
    const :results, T::Array[String]
    const :total_found, Integer
    const :cached, T::Boolean, default: false
  end
end

search = DSPy::Predict.new(SmartSearch)
result = search.call(query: "Ruby programming")
# max_results defaults to 10, language defaults to "English"
# If the LLM omits `cached`, it defaults to false
```

### Field Descriptions

Add `description:` to any field to guide the LLM on expected content. These descriptions appear in the generated JSON schema sent to the model.

```ruby
class ASTNode < T::Struct
  const :node_type, String, description: "The type of AST node (heading, paragraph, code_block)"
  const :text, String, default: "", description: "Text content of the node"
  const :level, Integer, default: 0, description: "Heading level 1-6, only for heading nodes"
  const :children, T::Array[ASTNode], default: []
end

ASTNode.field_descriptions[:node_type]  # => "The type of AST node ..."
ASTNode.field_descriptions[:children]   # => nil (no description set)
```

Field descriptions also work inside signature `input` and `output` blocks:

```ruby
class ExtractEntities < DSPy::Signature
  description "Extract named entities from text"

  input do
    const :text, String, description: "Raw text to analyze"
    const :language, String, default: "en", description: "ISO 639-1 language code"
  end

  output do
    const :entities, T::Array[String], description: "List of extracted entity names"
    const :count, Integer, description: "Total number of unique entities found"
  end
end
```

### Schema Formats

DSPy.rb supports three schema formats for communicating type structure to LLMs.

#### JSON Schema (default)

Verbose but universally supported. Access via `YourSignature.output_json_schema`.

#### BAML Schema

Compact format that reduces schema tokens by 80-85%. Requires the `sorbet-baml` gem.

```ruby
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o-mini',
    api_key: ENV['OPENAI_API_KEY'],
    schema_format: :baml
  )
end
```

BAML applies only in Enhanced Prompting mode (`structured_outputs: false`). When `structured_outputs: true`, the provider receives JSON Schema directly.

#### TOON Schema + Data Format

Table-oriented text format that shrinks both schema definitions and prompt values.

```ruby
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o-mini',
    api_key: ENV['OPENAI_API_KEY'],
    schema_format: :toon,
    data_format: :toon
  )
end
```

`schema_format: :toon` replaces the schema block in the system prompt; `data_format: :toon` renders input values and output templates inside `toon` fences. TOON only works in Enhanced Prompting mode. The `sorbet-toon` gem is included automatically as a dependency.

### Recursive Types

Structs that reference themselves produce `$defs` entries in the generated JSON schema, with `$ref` pointers to avoid infinite recursion.

```ruby
class ASTNode < T::Struct
  const :node_type, String
  const :text, String, default: ""
  const :children, T::Array[ASTNode], default: []
end
```

The schema generator detects the self-reference in `T::Array[ASTNode]` and emits:

```json
{
  "$defs": {
    "ASTNode": { "type": "object", "properties": { ... } }
  },
  "properties": {
    "children": {
      "type": "array",
      "items": { "$ref": "#/$defs/ASTNode" }
    }
  }
}
```

Access the schema with accumulated definitions via `YourSignature.output_json_schema_with_defs`.
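
The visited-type tracking behind `$defs`/`$ref` can be sketched with a toy generator. This is illustrative only — DSPy.rb's real generator works from Sorbet type metadata, and the struct registry below is a stand-in:

```ruby
# Toy schema generator: a struct is a { field => type } hash, where a type
# is either a JSON type string or { array_of: 'StructName' } for recursion.
STRUCTS = {
  'ASTNode' => {
    'node_type' => 'string',
    'text'      => 'string',
    'children'  => { array_of: 'ASTNode' }
  }
}.freeze

def schema_for(name, defs)
  unless defs.key?(name)
    defs[name] = 'pending' # mark as visited before recursing
    props = STRUCTS[name].transform_values do |t|
      if t.is_a?(Hash)
        { 'type' => 'array', 'items' => schema_for(t[:array_of], defs) }
      else
        { 'type' => t }
      end
    end
    defs[name] = { 'type' => 'object', 'properties' => props }
  end
  { '$ref' => "#/$defs/#{name}" }
end

defs = {}
p schema_for('ASTNode', defs)  # => {"$ref"=>"#/$defs/ASTNode"}
```

Marking the struct as visited *before* recursing into its fields is what turns the self-reference into a `$ref` instead of an infinite loop.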

### Union Types with T.any()

Specify fields that accept multiple types:

```ruby
output do
  const :result, T.any(Float, String)
end
```

For struct unions, DSPy.rb automatically adds a `_type` discriminator field to each struct's JSON schema. The LLM returns `_type` in its response, and DSPy converts the hash to the correct struct instance.

```ruby
class CreateTask < T::Struct
  const :title, String
  const :priority, String
end

class DeleteTask < T::Struct
  const :task_id, String
  const :reason, T.nilable(String)
end

class TaskRouter < DSPy::Signature
  description "Route user request to the appropriate task action"

  input do
    const :request, String
  end

  output do
    const :action, T.any(CreateTask, DeleteTask)
  end
end

result = DSPy::Predict.new(TaskRouter).call(request: "Create a task for Q4 review")
result.action.class  # => CreateTask
result.action.title  # => "Q4 Review"
```

Pattern matching works on the result:

```ruby
case result.action
when CreateTask then puts "Creating: #{result.action.title}"
when DeleteTask then puts "Deleting: #{result.action.task_id}"
end
```

Union types also work inside arrays for heterogeneous collections:

```ruby
output do
  const :events, T::Array[T.any(LoginEvent, PurchaseEvent)]
end
```

Limit unions to 2-4 types for reliable LLM comprehension, and use clear struct names, since they become the `_type` discriminator values.

---

## Modules

Modules are composable building blocks that wrap predictors. Define a `forward` method; invoke the module with `.call()`.

### Basic Structure

```ruby
class SentimentAnalyzer < DSPy::Module
  def initialize
    super
    @predictor = DSPy::Predict.new(SentimentSignature)
  end

  def forward(text:)
    @predictor.call(text: text)
  end
end

analyzer = SentimentAnalyzer.new
result = analyzer.call(text: "I love this product!")

result.sentiment  # => "positive"
result.confidence # => 0.9
```

**API rules:**

- Invoke modules and predictors with `.call()`, not `.forward()`.
- Access result fields with `result.field`, not `result[:field]`.

### Module Composition

Combine multiple modules through explicit method calls in `forward`:

```ruby
class DocumentProcessor < DSPy::Module
  def initialize
    super
    @classifier = DocumentClassifier.new
    @summarizer = DocumentSummarizer.new
  end

  def forward(document:)
    classification = @classifier.call(content: document)
    summary = @summarizer.call(content: document)

    {
      document_type: classification.document_type,
      summary: summary.summary
    }
  end
end
```

### Lifecycle Callbacks

Modules support `before`, `after`, and `around` callbacks on `forward`. Declare them as class-level macros referencing private methods.

#### Execution order

1. `before` callbacks (in registration order)
2. `around` callbacks (before `yield`)
3. The `forward` method
4. `around` callbacks (after `yield`)
5. `after` callbacks (in registration order)

```ruby
class InstrumentedModule < DSPy::Module
  before :setup_metrics
  after :log_metrics
  around :manage_context

  def initialize
    super
    @predictor = DSPy::Predict.new(MySignature)
    @metrics = {}
  end

  def forward(question:)
    @predictor.call(question: question)
  end

  private

  def setup_metrics
    @metrics[:start_time] = Time.now
  end

  def manage_context
    load_context
    result = yield
    save_context
    result
  end

  def log_metrics
    @metrics[:duration] = Time.now - @metrics[:start_time]
  end
end
```

Multiple callbacks of the same type execute in registration order. Callbacks are inherited from parent classes, and parent callbacks run first.

#### Around callbacks

Around callbacks must call `yield` to execute the wrapped method and return its result:

```ruby
def with_retry
  retries = 0
  begin
    yield
  rescue StandardError => e
    retries += 1
    retry if retries < 3
    raise e
  end
end
```

### Instruction Update Contract

Teleprompters (GEPA, MIPROv2) require modules to expose immutable update hooks. Include `DSPy::Mixins::InstructionUpdatable` and implement `with_instruction` and `with_examples`, each returning a new instance:

```ruby
class SentimentPredictor < DSPy::Module
  include DSPy::Mixins::InstructionUpdatable

  def initialize
    super
    @predictor = DSPy::Predict.new(SentimentSignature)
  end

  def with_instruction(instruction)
    clone = self.class.new
    clone.instance_variable_set(:@predictor, @predictor.with_instruction(instruction))
    clone
  end

  def with_examples(examples)
    clone = self.class.new
    clone.instance_variable_set(:@predictor, @predictor.with_examples(examples))
    clone
  end
end
```

If a module omits these hooks, teleprompters raise `DSPy::InstructionUpdateError` instead of silently mutating state.
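
The immutable-update contract is plain Ruby underneath. A minimal standalone sketch of the `with_*` shape, using a hypothetical `Prompted` class rather than DSPy.rb's API:

```ruby
# Minimal immutable-update sketch; `Prompted` is a hypothetical class
# illustrating the with_* contract, not part of DSPy.rb.
class Prompted
  attr_reader :instruction, :examples

  def initialize(instruction: "", examples: [])
    @instruction = instruction
    @examples = examples
  end

  def with_instruction(instruction)
    self.class.new(instruction: instruction, examples: @examples)
  end

  def with_examples(examples)
    self.class.new(instruction: @instruction, examples: examples)
  end
end

base  = Prompted.new(instruction: "Classify text")
tuned = base.with_instruction("Classify text precisely")

puts base.instruction   # => Classify text
puts tuned.instruction  # => Classify text precisely
```

The original instance is untouched, which is what lets teleprompters explore candidate instructions without corrupting shared state.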

---

## Predictors

Predictors are execution engines that take a signature and produce structured results from a language model. DSPy.rb provides four predictor types.

### Predict

Direct LLM call with typed input/output. Fastest option, lowest token usage.

```ruby
classifier = DSPy::Predict.new(ClassifyText)
result = classifier.call(text: "Technical document about APIs")

result.sentiment  # => #<Sentiment::Positive>
result.topics     # => ["APIs", "technical"]
result.confidence # => 0.92
```

### ChainOfThought

Adds a `reasoning` field to the output automatically: the model generates step-by-step reasoning before the final answer. Do not define a `:reasoning` field in the signature output when using ChainOfThought.

```ruby
class SolveMathProblem < DSPy::Signature
  description "Solve mathematical word problems step by step"

  input do
    const :problem, String
  end

  output do
    const :answer, String
    # :reasoning is added automatically by ChainOfThought
  end
end

solver = DSPy::ChainOfThought.new(SolveMathProblem)
result = solver.call(problem: "Sarah has 15 apples. She gives 7 away and buys 12 more.")

result.reasoning # => "Step by step: 15 - 7 = 8, then 8 + 12 = 20"
result.answer    # => "20 apples"
```

Use ChainOfThought for complex analysis, multi-step reasoning, or when explainability matters.

### ReAct

Reasoning + Action agent that uses tools in an iterative loop. Define tools by subclassing `DSPy::Tools::Base`; group related tools with `DSPy::Tools::Toolset`.

```ruby
class WeatherTool < DSPy::Tools::Base
  extend T::Sig

  tool_name "weather"
  tool_description "Get weather information for a location"

  sig { params(location: String).returns(String) }
  def call(location:)
    { location: location, temperature: 72, condition: "sunny" }.to_json
  end
end

class TravelSignature < DSPy::Signature
  description "Help users plan travel"

  input do
    const :destination, String
  end

  output do
    const :recommendations, String
  end
end

agent = DSPy::ReAct.new(
  TravelSignature,
  tools: [WeatherTool.new],
  max_iterations: 5
)

result = agent.call(destination: "Tokyo, Japan")
result.recommendations # => "Visit Senso-ji Temple early morning..."
result.history         # => Array of reasoning steps, actions, observations
result.iterations      # => 3
result.tools_used      # => ["weather"]
```

Use toolsets to expose multiple tool methods from a single class:

```ruby
text_tools = DSPy::Tools::TextProcessingToolset.to_tools
agent = DSPy::ReAct.new(MySignature, tools: text_tools)
```

### CodeAct

Think-Code-Observe agent that synthesizes and executes Ruby code. Ships as a separate gem.

```ruby
# Gemfile
gem 'dspy-code_act', '~> 0.29'
```

```ruby
programmer = DSPy::CodeAct.new(ProgrammingSignature, max_iterations: 10)
result = programmer.call(task: "Calculate the factorial of 20")
```

### Predictor Comparison

| Predictor | Speed | Token Usage | Best For |
|-----------|-------|-------------|----------|
| Predict | Fastest | Low | Classification, extraction |
| ChainOfThought | Moderate | Medium-High | Complex reasoning, analysis |
| ReAct | Slower | High | Multi-step tasks with tools |
| CodeAct | Slowest | Very High | Dynamic programming, calculations |

### Concurrent Predictions

Process multiple independent predictions simultaneously using `Async::Barrier`:

```ruby
require 'async'
require 'async/barrier'

analyzer = DSPy::Predict.new(ContentAnalyzer)
documents = ["Text one", "Text two", "Text three"]

Async do
  barrier = Async::Barrier.new

  tasks = documents.map do |doc|
    barrier.async { analyzer.call(content: doc) }
  end

  barrier.wait
  predictions = tasks.map(&:wait)

  predictions.each { |p| puts p.sentiment }
end
```

Add `gem 'async', '~> 2.29'` to the Gemfile. Handle errors within each `barrier.async` block to prevent one failure from cancelling the others:

```ruby
barrier.async do
  begin
    analyzer.call(content: doc)
  rescue StandardError => e
    nil
  end
end
```

### Few-Shot Examples and Instruction Tuning

Predictors expose immutable tuning hooks: `with_examples` attaches few-shot demonstrations and `with_instruction` replaces the task instruction, each returning a new predictor.

```ruby
classifier = DSPy::Predict.new(SentimentAnalysis)

examples = [
  DSPy::FewShotExample.new(
    input: { text: "Love it!" },
    output: { sentiment: "positive", confidence: 0.95 }
  )
]

optimized = classifier.with_examples(examples)
tuned = classifier.with_instruction("Be precise and confident.")
```

---

## Type System

### Automatic Type Conversion

DSPy.rb v0.9.0+ automatically converts LLM JSON responses to typed Ruby objects:

- **Enums**: String values become `T::Enum` instances (case-insensitive)
- **Structs**: Nested hashes become `T::Struct` objects
- **Arrays**: Elements convert recursively
- **Defaults**: Missing fields use declared defaults

### Discriminators for Union Types

When a field uses `T.any()` with struct types, DSPy adds a `_type` field to each struct's schema. On deserialization, `_type` selects the correct struct class:

```json
{
  "action": {
    "_type": "CreateTask",
    "title": "Review Q4 Report"
  }
}
```

DSPy matches `"CreateTask"` against the union members and instantiates the correct struct. No manual discriminator field is needed.
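
The dispatch logic can be sketched in plain Ruby. This is illustrative only — DSPy.rb builds real `T::Struct` instances from Sorbet metadata, whereas the sketch uses stdlib `Struct` stand-ins:

```ruby
# Illustrative `_type` dispatch; the Struct classes stand in for T::Structs.
CreateTask = Struct.new(:title, keyword_init: true)
DeleteTask = Struct.new(:task_id, keyword_init: true)

UNION_MEMBERS = { 'CreateTask' => CreateTask, 'DeleteTask' => DeleteTask }.freeze

def build_action(hash)
  type_name = hash.fetch('_type')
  klass = UNION_MEMBERS.fetch(type_name) do
    raise ArgumentError, "unknown _type: #{type_name}"
  end
  attrs = hash.reject { |k, _| k == '_type' }.transform_keys(&:to_sym)
  klass.new(**attrs)
end

action = build_action({ '_type' => 'CreateTask', 'title' => 'Review Q4 Report' })
puts action.class  # => CreateTask
puts action.title  # => Review Q4 Report
```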

### Recursive Types

Structs referencing themselves are supported. The schema generator tracks visited types and produces `$ref` pointers under `$defs`:

```ruby
class TreeNode < T::Struct
  const :label, String
  const :children, T::Array[TreeNode], default: []
end
```

The generated schema uses `"$ref": "#/$defs/TreeNode"` for the children array items, preventing infinite schema expansion.

### Nesting Depth

- 1-2 levels: reliable across all providers.
- 3-4 levels: works, but increases schema complexity.
- 5+ levels: may trigger OpenAI depth validation warnings and reduce LLM accuracy. Flatten deeply nested structures or split them into multiple signatures.

### Tips

- Prefer `T::Array[X], default: []` over `T.nilable(T::Array[X])` -- the nilable form causes schema issues with OpenAI structured outputs.
- Use clear struct names for union types, since they become `_type` discriminator values.
- Limit union types to 2-4 members for reliable model comprehension.
- Check schema compatibility with `DSPy::OpenAI::LM::SchemaConverter.validate_compatibility(schema)`.

---

# DSPy.rb Observability

DSPy.rb provides an event-driven observability system built on OpenTelemetry. The system replaces monkey-patching with structured event emission, pluggable listeners, automatic span creation, and non-blocking Langfuse export.

## Event System

### Emitting Events

Emit structured events with `DSPy.event`:

```ruby
DSPy.event('lm.tokens', {
  'gen_ai.system' => 'openai',
  'gen_ai.request.model' => 'gpt-4',
  input_tokens: 150,
  output_tokens: 50,
  total_tokens: 200
})
```

Event names are **strings** with dot-separated namespaces (e.g., `'llm.generate'`, `'react.iteration_complete'`, `'chain_of_thought.reasoning_complete'`). Do not use symbols for event names.

Attributes must be JSON-serializable. DSPy automatically merges context (trace ID, module stack) and creates OpenTelemetry spans.

### Global Subscriptions

Subscribe to events across the entire application with `DSPy.events.subscribe`:

```ruby
# Exact event name
subscription_id = DSPy.events.subscribe('lm.tokens') do |event_name, attrs|
  puts "Tokens used: #{attrs[:total_tokens]}"
end

# Wildcard pattern -- matches llm.generate, llm.stream, etc.
DSPy.events.subscribe('llm.*') do |event_name, attrs|
  track_llm_usage(attrs)
end

# Catch-all wildcard
DSPy.events.subscribe('*') do |event_name, attrs|
  log_everything(event_name, attrs)
end
```

Use global subscriptions for cross-cutting concerns: observability exporters (Langfuse, Datadog), centralized logging, and metrics collection.
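
Wildcard matching of this kind can be sketched as a simple prefix check. This is illustrative — DSPy.rb's internal matcher may differ:

```ruby
# Illustrative wildcard matcher for dot-namespaced event names;
# not DSPy.rb's internal implementation.
def pattern_match?(pattern, event_name)
  return true if pattern == '*'
  if pattern.end_with?('.*')
    event_name.start_with?(pattern[0..-2]) # keep the trailing dot of the prefix
  else
    pattern == event_name
  end
end

puts pattern_match?('llm.*', 'llm.generate')  # => true
puts pattern_match?('llm.*', 'lm.tokens')     # => false
```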

### Module-Scoped Subscriptions

Declare listeners inside a `DSPy::Module` subclass. Subscriptions automatically scope to the module instance and its descendants:

```ruby
class ResearchReport < DSPy::Module
  subscribe 'lm.tokens', :track_tokens, scope: :descendants

  def initialize
    super
    @outliner = DSPy::Predict.new(OutlineSignature)
    @writer = DSPy::Predict.new(SectionWriterSignature)
    @token_count = 0
  end

  def forward(question:)
    outline = @outliner.call(question: question)
    outline.sections.map do |title|
      draft = @writer.call(question: question, section_title: title)
      { title: title, body: draft.paragraph }
    end
  end

  def track_tokens(_event, attrs)
    @token_count += attrs.fetch(:total_tokens, 0)
  end
end
```

The `scope:` parameter accepts:

- `:descendants` (default) -- receives events from the module **and** every nested module invoked inside it.
- `DSPy::Module::SubcriptionScope::SelfOnly` -- restricts delivery to events emitted by the module instance itself, ignoring descendants.

Inspect active subscriptions with `registered_module_subscriptions`; tear them down with `unsubscribe_module_events`.
|
||||
|
||||
### Unsubscribe and Cleanup
|
||||
|
||||
Remove a global listener by subscription ID:
|
||||
|
||||
```ruby
|
||||
id = DSPy.events.subscribe('llm.*') { |name, attrs| }
|
||||
DSPy.events.unsubscribe(id)
|
||||
```
|
||||
|
||||
Build tracker classes that manage their own subscription lifecycle:
|
||||
|
||||
```ruby
|
||||
class TokenBudgetTracker
|
||||
def initialize(budget:)
|
||||
@budget = budget
|
||||
@usage = 0
|
||||
@subscriptions = []
|
||||
@subscriptions << DSPy.events.subscribe('lm.tokens') do |_event, attrs|
|
||||
@usage += attrs.fetch(:total_tokens, 0)
|
||||
warn("Budget hit") if @usage >= @budget
|
||||
end
|
||||
end
|
||||
|
||||
def unsubscribe
|
||||
@subscriptions.each { |id| DSPy.events.unsubscribe(id) }
|
||||
@subscriptions.clear
|
||||
end
|
||||
end
|
||||
```
|
||||
|
||||
### Clearing Listeners in Tests
|
||||
|
||||
Call `DSPy.events.clear_listeners` in `before`/`after` blocks to prevent cross-contamination between test cases:
|
||||
|
||||
```ruby
|
||||
RSpec.configure do |config|
|
||||
config.after(:each) { DSPy.events.clear_listeners }
|
||||
end
|
||||
```
|
||||
|
||||
## dspy-o11y Gems

Three gems compose the observability stack:

| Gem | Purpose |
|---|---|
| `dspy` | Core event bus (`DSPy.event`, `DSPy.events`) -- always available |
| `dspy-o11y` | OpenTelemetry spans, `AsyncSpanProcessor`, `DSPy::Context.with_span` helpers |
| `dspy-o11y-langfuse` | Langfuse adapter -- configures OTLP exporter targeting Langfuse endpoints |

### Installation

```ruby
# Gemfile
gem 'dspy'
gem 'dspy-o11y'          # core spans + helpers
gem 'dspy-o11y-langfuse' # Langfuse/OpenTelemetry adapter (optional)
```

If the optional gems are absent, DSPy falls back to logging-only mode with no errors.
## Langfuse Integration

### Environment Variables

```bash
# Required
export LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
export LANGFUSE_SECRET_KEY=sk-lf-your-secret-key

# Optional (defaults to https://cloud.langfuse.com)
export LANGFUSE_HOST=https://us.cloud.langfuse.com

# Tuning (optional)
export DSPY_TELEMETRY_BATCH_SIZE=100      # spans per export batch (default 100)
export DSPY_TELEMETRY_QUEUE_SIZE=1000     # max queued spans (default 1000)
export DSPY_TELEMETRY_EXPORT_INTERVAL=60  # seconds between timed exports (default 60)
export DSPY_TELEMETRY_SHUTDOWN_TIMEOUT=10 # seconds to drain on shutdown (default 10)
```

### Automatic Configuration

Call `DSPy::Observability.configure!` once at boot (it is already called automatically when `require 'dspy'` runs and Langfuse env vars are present):

```ruby
require 'dspy'

# If LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set,
# DSPy::Observability.configure! runs automatically and:
# 1. Configures the OpenTelemetry SDK with an OTLP exporter
# 2. Creates dual output: structured logs AND OpenTelemetry spans
# 3. Exports spans to Langfuse using proper authentication
# 4. Falls back gracefully if gems are missing
```

Verify status with `DSPy::Observability.enabled?`.
### Automatic Tracing

With observability enabled, every `DSPy::Module#forward` call, LM request, and tool invocation creates properly nested spans. Langfuse receives hierarchical traces:

```
Trace: abc-123-def
+-- ChainOfThought.forward [2000ms] (observation type: chain)
    +-- llm.generate [1000ms] (observation type: generation)
        Model: gpt-4-0613
        Tokens: 100 in / 50 out / 150 total
```

DSPy maps module classes to Langfuse observation types automatically via `DSPy::ObservationType.for_module_class`:

| Module | Observation Type |
|---|---|
| `DSPy::LM` (raw chat) | `generation` |
| `DSPy::ChainOfThought` | `chain` |
| `DSPy::ReAct` | `agent` |
| Tool invocations | `tool` |
| Memory/retrieval | `retriever` |
| Embedding engines | `embedding` |
| Evaluation modules | `evaluator` |
| Generic operations | `span` |
## Score Reporting

### DSPy.score API

Report evaluation scores with `DSPy.score`:

```ruby
# Numeric (default)
DSPy.score('accuracy', 0.95)

# With comment
DSPy.score('relevance', 0.87, comment: 'High semantic similarity')

# Boolean
DSPy.score('is_valid', 1, data_type: DSPy::Scores::DataType::Boolean)

# Categorical
DSPy.score('sentiment', 'positive', data_type: DSPy::Scores::DataType::Categorical)

# Explicit trace binding
DSPy.score('accuracy', 0.95, trace_id: 'custom-trace-id')
```

Available data types: `DSPy::Scores::DataType::Numeric`, `::Boolean`, `::Categorical`.

### score.create Events

Every `DSPy.score` call emits a `'score.create'` event. Subscribe to react:

```ruby
DSPy.events.subscribe('score.create') do |event_name, attrs|
  puts "#{attrs[:score_name]} = #{attrs[:score_value]}"
  # Also available: attrs[:score_id], attrs[:score_data_type],
  # attrs[:score_comment], attrs[:trace_id], attrs[:observation_id],
  # attrs[:timestamp]
end
```

### Async Langfuse Export with DSPy::Scores::Exporter

Configure the exporter to send scores to Langfuse in the background:

```ruby
exporter = DSPy::Scores::Exporter.configure(
  public_key: ENV['LANGFUSE_PUBLIC_KEY'],
  secret_key: ENV['LANGFUSE_SECRET_KEY'],
  host: 'https://cloud.langfuse.com'
)

# Scores are now exported automatically via a background Thread::Queue
DSPy.score('accuracy', 0.95)

# Shut down gracefully (waits up to 5 seconds by default)
exporter.shutdown
```

The exporter subscribes to `'score.create'` events internally, queues them for async processing, and retries with exponential backoff on failure.

### Automatic Export with DSPy::Evals

Pass `export_scores: true` to `DSPy::Evals` to export per-example scores and an aggregate batch score automatically:

```ruby
evaluator = DSPy::Evals.new(
  program,
  metric: my_metric,
  export_scores: true,
  score_name: 'qa_accuracy'
)

result = evaluator.evaluate(test_examples)
```
## DSPy::Context.with_span

Create manual spans for custom operations. Requires `dspy-o11y`.

```ruby
DSPy::Context.with_span(operation: 'custom.retrieval', 'retrieval.source' => 'pinecone') do |span|
  results = pinecone_client.query(embedding)
  span&.set_attribute('retrieval.count', results.size)
  results
end
```

Pass semantic attributes as keyword arguments alongside `operation:`. The block receives an OpenTelemetry span object (or `nil` when observability is disabled). The span automatically nests under the current parent span and records `duration.ms`, `langfuse.observation.startTime`, and `langfuse.observation.endTime`.

Assign a Langfuse observation type to custom spans:

```ruby
DSPy::Context.with_span(
  operation: 'evaluate.batch',
  **DSPy::ObservationType::Evaluator.langfuse_attributes,
  'batch.size' => examples.length
) do |span|
  run_evaluation(examples)
end
```

Scores reported inside a `with_span` block automatically inherit the current trace context.
## Module Stack Metadata

When `DSPy::Module#forward` runs, the context layer maintains a module stack. Every event includes:

```ruby
{
  module_path: [
    { id: "root_uuid", class: "DeepSearch", label: nil },
    { id: "planner_uuid", class: "DSPy::Predict", label: "planner" }
  ],
  module_root: { id: "root_uuid", class: "DeepSearch", label: nil },
  module_leaf: { id: "planner_uuid", class: "DSPy::Predict", label: "planner" },
  module_scope: {
    ancestry_token: "root_uuid>planner_uuid",
    depth: 2
  }
}
```

| Key | Meaning |
|---|---|
| `module_path` | Ordered array of `{id, class, label}` entries from root to leaf |
| `module_root` | The outermost module in the current call chain |
| `module_leaf` | The innermost (currently executing) module |
| `module_scope.ancestry_token` | Stable string of joined UUIDs representing the nesting path |
| `module_scope.depth` | Integer depth of the current module in the stack |

Labels are set via `module_scope_label=` on a module instance or derived automatically from named predictors. Use this metadata to power Langfuse filters, scoped metrics, or custom event routing.
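The two `module_scope` fields derive mechanically from `module_path`. A pure-Ruby sketch of that derivation (the `module_scope_for` helper is illustrative, not part of DSPy.rb's API):

```ruby
# Derive module_scope from a module_path array of {id:, class:, label:} hashes:
# join the ids with '>' for the ancestry token; depth is the path length.
def module_scope_for(module_path)
  {
    ancestry_token: module_path.map { |entry| entry[:id] }.join('>'),
    depth: module_path.length
  }
end

path = [
  { id: 'root_uuid', class: 'DeepSearch', label: nil },
  { id: 'planner_uuid', class: 'DSPy::Predict', label: 'planner' }
]

module_scope_for(path)
# => { ancestry_token: "root_uuid>planner_uuid", depth: 2 }
```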
## Dedicated Export Worker

The `DSPy::Observability::AsyncSpanProcessor` (from `dspy-o11y`) keeps telemetry export off the hot path:

- Runs on a `Concurrent::SingleThreadExecutor` -- LLM workflows never compete with OTLP networking.
- Buffers finished spans in a `Thread::Queue` (max size configurable via `DSPY_TELEMETRY_QUEUE_SIZE`).
- Drains spans in batches of `DSPY_TELEMETRY_BATCH_SIZE` (default 100). When the queue reaches batch size, an immediate async export fires.
- A background timer thread triggers periodic export every `DSPY_TELEMETRY_EXPORT_INTERVAL` seconds (default 60).
- Applies exponential backoff (`0.1 * 2^attempt` seconds) on export failures, up to `DEFAULT_MAX_RETRIES` (3).
- On shutdown, flushes all remaining spans within `DSPY_TELEMETRY_SHUTDOWN_TIMEOUT` seconds, then terminates the executor.
- Drops the oldest span when the queue is full, logging `'observability.span_dropped'`.

No application code interacts with the processor directly. Configure it entirely through environment variables.
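The retry schedule is easy to verify in isolation. A quick sketch of the delays produced by the `0.1 * 2^attempt` formula, assuming attempts are numbered from zero (the helper name is illustrative, not the processor's internals):

```ruby
# Delay in seconds before each retry, per the 0.1 * 2**attempt formula.
def backoff_delays(max_retries)
  (0...max_retries).map { |attempt| 0.1 * (2**attempt) }
end

backoff_delays(3)
# => [0.1, 0.2, 0.4]
```

So with the default of three retries, a failing export waits at most 0.7 seconds in total before the spans are dropped.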
## Built-in Events Reference

| Event Name | Emitted By | Key Attributes |
|---|---|---|
| `lm.tokens` | `DSPy::LM` | `gen_ai.system`, `gen_ai.request.model`, `input_tokens`, `output_tokens`, `total_tokens` |
| `chain_of_thought.reasoning_complete` | `DSPy::ChainOfThought` | `dspy.signature`, `cot.reasoning_steps`, `cot.reasoning_length`, `cot.has_reasoning` |
| `react.iteration_complete` | `DSPy::ReAct` | `iteration`, `thought`, `action`, `observation` |
| `codeact.iteration_complete` | `dspy-code_act` gem | `iteration`, `code_executed`, `execution_result` |
| `optimization.trial_complete` | Teleprompters (MIPROv2) | `trial_number`, `score` |
| `score.create` | `DSPy.score` | `score_name`, `score_value`, `score_data_type`, `trace_id` |
| `span.start` | `DSPy::Context.with_span` | `trace_id`, `span_id`, `parent_span_id`, `operation` |

## Best Practices

- Use dot-separated string names for events. Follow OpenTelemetry `gen_ai.*` conventions for LLM attributes.
- Always call `unsubscribe` (or `unsubscribe_module_events` for scoped subscriptions) when a tracker is no longer needed to prevent memory leaks.
- Call `DSPy.events.clear_listeners` in test teardown to avoid cross-contamination.
- Wrap risky listener logic in a rescue block. The event system isolates listener failures, but explicit rescue prevents silent swallowing of domain errors.
- Prefer module-scoped `subscribe` for agent internals. Reserve global `DSPy.events.subscribe` for infrastructure-level concerns.
# DSPy.rb LLM Providers

## Adapter Architecture

DSPy.rb provides unified support across multiple LLM providers through adapter gems that automatically load when installed.

### Provider Overview

- **OpenAI**: GPT-4, GPT-4o, GPT-4o-mini, GPT-3.5-turbo
- **Anthropic**: Claude 3 family (Sonnet, Opus, Haiku), Claude 3.5 Sonnet
- **Google Gemini**: Gemini 1.5 Pro, Gemini 1.5 Flash, other versions
- **Ollama**: Local model support via OpenAI compatibility layer
- **OpenRouter**: Unified multi-provider API for 200+ models

DSPy.rb ships provider SDKs as separate adapter gems. Install only the adapters the project needs. Each adapter gem depends on the official SDK for its provider and auto-loads when present -- no explicit `require` necessary.

```ruby
# Gemfile
gem 'dspy'            # core framework (no provider SDKs)
gem 'dspy-openai'     # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic'  # Claude
gem 'dspy-gemini'     # Gemini
gem 'dspy-ruby_llm'   # RubyLLM unified adapter (12+ providers)
```
---

## Per-Provider Adapters

### dspy-openai

Covers any endpoint that speaks the OpenAI chat-completions protocol: OpenAI itself, OpenRouter, and Ollama.

**SDK dependency:** `openai ~> 0.17`

```ruby
# OpenAI
lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

# OpenRouter -- access 200+ models behind a single key
lm = DSPy::LM.new('openrouter/x-ai/grok-4-fast:free',
  api_key: ENV['OPENROUTER_API_KEY']
)

# Ollama -- local models, no API key required
lm = DSPy::LM.new('ollama/llama3.2')

# Remote Ollama instance
lm = DSPy::LM.new('ollama/llama3.2',
  base_url: 'https://my-ollama.example.com/v1',
  api_key: 'optional-auth-token'
)
```

All three sub-adapters share the same request handling, structured-output support, and error reporting. Swap providers without changing higher-level DSPy code.

For OpenRouter models that lack native structured-output support, disable it explicitly:

```ruby
lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
  api_key: ENV['OPENROUTER_API_KEY'],
  structured_outputs: false
)
```
### dspy-anthropic

Provides the Claude adapter. Install it for any `anthropic/*` model id.

**SDK dependency:** `anthropic ~> 1.12`

```ruby
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY']
)
```

Structured outputs default to tool-based JSON extraction (`structured_outputs: true`). Set `structured_outputs: false` to use enhanced-prompting extraction instead.

```ruby
# Tool-based extraction (default, most reliable)
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY'],
  structured_outputs: true
)

# Enhanced prompting extraction
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
  api_key: ENV['ANTHROPIC_API_KEY'],
  structured_outputs: false
)
```

### dspy-gemini

Provides the Gemini adapter. Install it for any `gemini/*` model id.

**SDK dependency:** `gemini-ai ~> 4.3`

```ruby
lm = DSPy::LM.new('gemini/gemini-2.5-flash',
  api_key: ENV['GEMINI_API_KEY']
)
```

**Environment variable:** `GEMINI_API_KEY` (also accepts `GOOGLE_API_KEY`).

---

## RubyLLM Unified Adapter

The `dspy-ruby_llm` gem provides a single adapter that routes to 12+ providers through [RubyLLM](https://rubyllm.com). Use it when a project talks to multiple providers or needs access to Bedrock, VertexAI, DeepSeek, or Mistral without dedicated adapter gems.

**SDK dependency:** `ruby_llm ~> 1.3`

### Model ID Format

Prefix every model id with `ruby_llm/`:

```ruby
lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514')
lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash')
```

The adapter detects the provider from RubyLLM's model registry automatically. For models not in the registry, pass `provider:` explicitly:

```ruby
lm = DSPy::LM.new('ruby_llm/llama3.2', provider: 'ollama')
lm = DSPy::LM.new('ruby_llm/anthropic/claude-3-opus',
  api_key: ENV['OPENROUTER_API_KEY'],
  provider: 'openrouter'
)
```

### Using Existing RubyLLM Configuration

When RubyLLM is already configured globally, omit the `api_key:` argument. DSPy reuses the global config automatically:

```ruby
RubyLLM.configure do |config|
  config.openai_api_key = ENV['OPENAI_API_KEY']
  config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
end

# No api_key needed -- picks up the global config
DSPy.configure do |c|
  c.lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
end
```
When an `api_key:` (or any of `base_url:`, `timeout:`, `max_retries:`) is passed, DSPy creates a **scoped context** instead of reusing the global config.

### Cloud-Hosted Providers (Bedrock, VertexAI)

Configure RubyLLM globally first, then reference the model:

```ruby
# AWS Bedrock
RubyLLM.configure do |c|
  c.bedrock_api_key = ENV['AWS_ACCESS_KEY_ID']
  c.bedrock_secret_key = ENV['AWS_SECRET_ACCESS_KEY']
  c.bedrock_region = 'us-east-1'
end
lm = DSPy::LM.new('ruby_llm/anthropic.claude-3-5-sonnet', provider: 'bedrock')

# Google VertexAI
RubyLLM.configure do |c|
  c.vertexai_project_id = 'your-project-id'
  c.vertexai_location = 'us-central1'
end
lm = DSPy::LM.new('ruby_llm/gemini-pro', provider: 'vertexai')
```
### Supported Providers Table

| Provider | Example Model ID | Notes |
|-------------|--------------------------------------------|---------------------------------|
| OpenAI | `ruby_llm/gpt-4o-mini` | Auto-detected from registry |
| Anthropic | `ruby_llm/claude-sonnet-4-20250514` | Auto-detected from registry |
| Gemini | `ruby_llm/gemini-2.5-flash` | Auto-detected from registry |
| DeepSeek | `ruby_llm/deepseek-chat` | Auto-detected from registry |
| Mistral | `ruby_llm/mistral-large` | Auto-detected from registry |
| Ollama | `ruby_llm/llama3.2` | Use `provider: 'ollama'` |
| AWS Bedrock | `ruby_llm/anthropic.claude-3-5-sonnet` | Configure RubyLLM globally |
| VertexAI | `ruby_llm/gemini-pro` | Configure RubyLLM globally |
| OpenRouter | `ruby_llm/anthropic/claude-3-opus` | Use `provider: 'openrouter'` |
| Perplexity | `ruby_llm/llama-3.1-sonar-large` | Use `provider: 'perplexity'` |
| GPUStack | `ruby_llm/model-name` | Use `provider: 'gpustack'` |
---

## Rails Initializer Pattern

Configure DSPy inside an `after_initialize` block so Rails credentials and environment are fully loaded:

```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
  next if Rails.env.test? # skip in test -- use VCR cassettes instead

  DSPy.configure do |config|
    config.lm = DSPy::LM.new(
      'openai/gpt-4o-mini',
      api_key: Rails.application.credentials.openai_api_key,
      structured_outputs: true
    )

    config.logger = if Rails.env.production?
      Dry.Logger(:dspy, formatter: :json) do |logger|
        logger.add_backend(stream: Rails.root.join("log/dspy.log"))
      end
    else
      Dry.Logger(:dspy) do |logger|
        logger.add_backend(level: :debug, stream: $stdout)
      end
    end
  end
end
```
Key points:

- Wrap in `after_initialize` so `Rails.application.credentials` is available.
- Skip configuration early in the test environment. Rely on VCR cassettes for deterministic LLM responses.
- Set `structured_outputs: true` (the default) for provider-native JSON extraction.
- Use `Dry.Logger` with `:json` formatter in production for structured log parsing.

---
## Fiber-Local LM Context
|
||||
|
||||
`DSPy.with_lm` sets a temporary language-model override scoped to the current Fiber. Every predictor call inside the block uses the override; outside the block the previous LM takes effect again.
|
||||
|
||||
```ruby
|
||||
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
|
||||
powerful = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
|
||||
|
||||
classifier = Classifier.new
|
||||
|
||||
# Uses the global LM
|
||||
result = classifier.call(text: "Hello")
|
||||
|
||||
# Temporarily switch to the fast model
|
||||
DSPy.with_lm(fast) do
|
||||
result = classifier.call(text: "Hello") # uses gpt-4o-mini
|
||||
end
|
||||
|
||||
# Temporarily switch to the powerful model
|
||||
DSPy.with_lm(powerful) do
|
||||
result = classifier.call(text: "Hello") # uses claude-sonnet-4
|
||||
end
|
||||
```
|
||||
|
||||
### LM Resolution Hierarchy
|
||||
|
||||
DSPy resolves the active language model in this order:
|
||||
|
||||
1. **Instance-level LM** -- set directly on a module instance via `configure`
|
||||
2. **Fiber-local LM** -- set via `DSPy.with_lm`
|
||||
3. **Global LM** -- set via `DSPy.configure`
|
||||
|
||||
Instance-level configuration always wins, even inside a `DSPy.with_lm` block:
|
||||
|
||||
```ruby
|
||||
classifier = Classifier.new
|
||||
classifier.configure { |c| c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY']) }
|
||||
|
||||
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
|
||||
|
||||
DSPy.with_lm(fast) do
|
||||
classifier.call(text: "Test") # still uses claude-sonnet-4 (instance-level wins)
|
||||
end
|
||||
```

### configure_predictor for Fine-Grained Agent Control

Complex agents (`ReAct`, `CodeAct`, `DeepResearch`, `DeepSearch`) contain internal predictors. Use `configure` for a blanket override and `configure_predictor` to target a specific sub-predictor:

```ruby
agent = DSPy::ReAct.new(MySignature, tools: tools)

# Set a default LM for the agent and all its children
agent.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) }

# Override just the reasoning predictor with a more capable model
agent.configure_predictor('thought_generator') do |c|
  c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
end

result = agent.call(question: "Summarize the report")
```

Both methods support chaining:

```ruby
agent
  .configure { |c| c.lm = cheap_model }
  .configure_predictor('thought_generator') { |c| c.lm = expensive_model }
```

#### Available Predictors by Agent Type

| Agent | Internal Predictors |
|----------------------|------------------------------------------------------------------|
| `DSPy::ReAct` | `thought_generator`, `observation_processor` |
| `DSPy::CodeAct` | `code_generator`, `observation_processor` |
| `DSPy::DeepResearch` | `planner`, `synthesizer`, `qa_reviewer`, `reporter` |
| `DSPy::DeepSearch` | `seed_predictor`, `search_predictor`, `reader_predictor`, `reason_predictor` |

#### Propagation Rules

- Configuration propagates recursively to children and grandchildren.
- Children with an already-configured LM are **not** overwritten by a later parent `configure` call.
- Configure the parent first, then override specific children.
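
A minimal sketch of that "do not overwrite configured children" rule, as an illustration of the stated behavior rather than DSPy.rb's implementation (`FakePredictor` is a hypothetical stand-in):

```ruby
# Illustrative sketch: a parent's configure only fills in predictors
# that do not already have their own LM, recursing into grandchildren.
class FakePredictor
  attr_accessor :lm
  attr_reader :children

  def initialize(children: [])
    @children = children
  end

  def configure(lm)
    @lm ||= lm                             # keep an already-configured LM
    children.each { |c| c.configure(lm) }  # propagate recursively
    self
  end
end

child_a = FakePredictor.new
child_b = FakePredictor.new
child_b.lm = :expensive_model              # configured ahead of the parent

parent = FakePredictor.new(children: [child_a, child_b])
parent.configure(:cheap_model)

child_a.lm # => :cheap_model
child_b.lm # => :expensive_model (kept)
```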

---

## Feature-Flagged Model Selection

Use a `FeatureFlags` module backed by ENV vars to centralize model selection. Each tool or agent reads its model from the flags, falling back to a global default.

```ruby
module FeatureFlags
  module_function

  def default_model
    ENV.fetch('DSPY_DEFAULT_MODEL', 'openai/gpt-4o-mini')
  end

  def default_api_key
    ENV.fetch('DSPY_DEFAULT_API_KEY') { ENV.fetch('OPENAI_API_KEY', nil) }
  end

  def model_for(tool_name)
    env_key = "DSPY_MODEL_#{tool_name.upcase}"
    ENV.fetch(env_key, default_model)
  end

  def api_key_for(tool_name)
    env_key = "DSPY_API_KEY_#{tool_name.upcase}"
    ENV.fetch(env_key, default_api_key)
  end
end
```
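
The fallback chain can be exercised with a quick script (a condensed copy of the module above, assuming `DSPY_DEFAULT_MODEL` is not otherwise set in the environment):

```ruby
# Condensed copy of FeatureFlags, exercised against ENV overrides.
module FeatureFlags
  module_function

  def default_model
    ENV.fetch('DSPY_DEFAULT_MODEL', 'openai/gpt-4o-mini')
  end

  def model_for(tool_name)
    ENV.fetch("DSPY_MODEL_#{tool_name.upcase}", default_model)
  end
end

ENV['DSPY_MODEL_CLASSIFIER'] = 'anthropic/claude-sonnet-4-20250514'

FeatureFlags.model_for('classifier') # => "anthropic/claude-sonnet-4-20250514"
FeatureFlags.model_for('summarizer') # => "openai/gpt-4o-mini" (falls back to the default)
```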

### Per-Tool Model Override

Override an individual tool's model without touching application code:

```bash
# For OpenAI
gem install dspy-openai
# .env
DSPY_DEFAULT_MODEL=openai/gpt-4o-mini
DSPY_DEFAULT_API_KEY=sk-...

# For Anthropic
gem install dspy-anthropic
# Override the classifier to use Claude
DSPY_MODEL_CLASSIFIER=anthropic/claude-sonnet-4-20250514
DSPY_API_KEY_CLASSIFIER=sk-ant-...

# For Gemini
gem install dspy-gemini
# Override the summarizer to use Gemini
DSPY_MODEL_SUMMARIZER=gemini/gemini-2.5-flash
DSPY_API_KEY_SUMMARIZER=...
```

Wire each agent to its flag at initialization:

```ruby
class ClassifierAgent < DSPy::Module
  def initialize
    super
    model = FeatureFlags.model_for('classifier')
    api_key = FeatureFlags.api_key_for('classifier')

    @predictor = DSPy::Predict.new(ClassifySignature)
    configure { |c| c.lm = DSPy::LM.new(model, api_key: api_key) }
  end

  def forward(text:)
    @predictor.call(text: text)
  end
end
```

This pattern keeps model routing declarative and avoids scattering `DSPy::LM.new` calls across the codebase.

---

## Compatibility Matrix

Feature support across direct adapter gems. All features listed assume `structured_outputs: true` (the default).

| Feature | OpenAI | Anthropic | Gemini | Ollama | OpenRouter | RubyLLM |
|----------------------|--------|-----------|--------|----------|------------|-------------|
| Structured Output | Native JSON mode | Tool-based extraction | Native JSON schema | OpenAI-compatible JSON | Varies by model | Via `with_schema` |
| Vision (Images) | File + URL | File + Base64 | File + Base64 | Limited | Varies | Delegates to underlying provider |
| Image URLs | Yes | No | No | No | Varies | Depends on provider |
| Tool Calling | Yes | Yes | Yes | Varies | Varies | Yes |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes |

**Notes:**

- **Structured Output** is enabled by default on every adapter. Set `structured_outputs: false` to fall back to enhanced-prompting extraction.
- **Vision / Image URLs:** Only OpenAI supports passing a URL directly. For Anthropic and Gemini, load images from file or Base64:

  ```ruby
  DSPy::Image.from_url("https://example.com/img.jpg")    # OpenAI only
  DSPy::Image.from_file("path/to/image.jpg")             # all providers
  DSPy::Image.from_base64(data, mime_type: "image/jpeg") # all providers
  ```

- **RubyLLM** delegates to the underlying provider, so feature support matches the provider column in the table.

### Choosing an Adapter Strategy

| Scenario | Recommended Adapter |
|-------------------------------------------|--------------------------------|
| Single provider (OpenAI, Claude, or Gemini) | Dedicated gem (`dspy-openai`, `dspy-anthropic`, `dspy-gemini`) |
| Multi-provider with per-agent model routing | `dspy-ruby_llm` |
| AWS Bedrock or Google VertexAI | `dspy-ruby_llm` |
| Local development with Ollama | `dspy-openai` (Ollama sub-adapter) or `dspy-ruby_llm` |
| OpenRouter for cost optimization | `dspy-openai` (OpenRouter sub-adapter) |

### Current Recommended Models

| Provider | Model ID | Use Case |
|-----------|---------------------------------------|-----------------------|
| OpenAI | `openai/gpt-4o-mini` | Fast, cost-effective |
| Anthropic | `anthropic/claude-sonnet-4-20250514` | Balanced reasoning |
| Gemini | `gemini/gemini-2.5-flash` | Fast, cost-effective |
| Ollama | `ollama/llama3.2` | Local, zero API cost |

# DSPy.rb Toolsets

## Tools::Base

`DSPy::Tools::Base` is the base class for single-purpose tools. Each subclass exposes one operation to an LLM agent through a `call` method.

### Defining a Tool

Set the tool's identity with the `tool_name` and `tool_description` class-level DSL methods. Define the `call` instance method with a Sorbet `sig` declaration so DSPy.rb can generate the JSON schema the LLM uses to invoke the tool.

```ruby
class WeatherLookup < DSPy::Tools::Base
  extend T::Sig

  tool_name "weather_lookup"
  tool_description "Look up current weather for a given city"

  sig { params(city: String, units: T.nilable(String)).returns(String) }
  def call(city:, units: nil)
    # Fetch weather data and return a string summary
    "72F and sunny in #{city}"
  end
end
```

Key points:

- Inherit from `DSPy::Tools::Base`, not `DSPy::Tool`.
- Use `tool_name` (class method) to set the name the LLM sees. Without it, the class name is lowercased as a fallback.
- Use `tool_description` (class method) to set the human-readable description surfaced in the tool schema.
- The `call` method should use **keyword arguments**. Positional arguments are supported, but keyword arguments produce better schemas.
- Always attach a Sorbet `sig` to `call`. Without a signature, the generated schema has empty properties and the LLM cannot determine parameter types.

### Schema Generation

`call_schema_object` introspects the Sorbet signature on `call` and returns a hash representing the JSON Schema `parameters` object:

```ruby
WeatherLookup.call_schema_object
# => {
#   type: "object",
#   properties: {
#     city: { type: "string", description: "Parameter city" },
#     units: { type: "string", description: "Parameter units (optional)" }
#   },
#   required: ["city"]
# }
```

`call_schema` wraps this in the full LLM tool-calling format:

```ruby
WeatherLookup.call_schema
# => {
#   type: "function",
#   function: {
#     name: "call",
#     description: "Call the WeatherLookup tool",
#     parameters: { ... }
#   }
# }
```

### Using Tools with ReAct

Pass tool instances in an array to `DSPy::ReAct`:

```ruby
agent = DSPy::ReAct.new(
  MySignature,
  tools: [WeatherLookup.new, AnotherTool.new]
)

result = agent.call(question: "What is the weather in Berlin?")
puts result.answer
```

Access output fields with dot notation (`result.answer`), not hash access (`result[:answer]`).

---

## Tools::Toolset

`DSPy::Tools::Toolset` groups multiple related methods into a single class. Each exposed method becomes an independent tool from the LLM's perspective.

### Defining a Toolset

```ruby
class DatabaseToolset < DSPy::Tools::Toolset
  extend T::Sig

  toolset_name "db"

  tool :query, description: "Run a read-only SQL query"
  tool :insert, description: "Insert a record into a table"
  tool :delete, description: "Delete a record by ID"

  sig { params(sql: String).returns(String) }
  def query(sql:)
    # Execute read query
  end

  sig { params(table: String, data: T::Hash[String, String]).returns(String) }
  def insert(table:, data:)
    # Insert record
  end

  sig { params(table: String, id: Integer).returns(String) }
  def delete(table:, id:)
    # Delete record
  end
end
```

### DSL Methods

**`toolset_name(name)`** -- Set the prefix for all generated tool names. If omitted, the class name minus the `Toolset` suffix is lowercased (e.g., `DatabaseToolset` becomes `database`).

```ruby
toolset_name "db"
# tool :query produces a tool named "db_query"
```
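
That default derivation can be expressed directly; this is a sketch of the stated rule, not the library's code:

```ruby
# Derive the default toolset prefix from a class name, per the rule above.
def default_toolset_name(class_name)
  class_name.sub(/Toolset\z/, '').downcase
end

default_toolset_name("DatabaseToolset")       # => "database"
default_toolset_name("TextProcessingToolset") # => "textprocessing"
```

Note that the built-in `TextProcessingToolset` overrides this default by declaring `toolset_name "text"` explicitly.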

**`tool(method_name, tool_name:, description:)`** -- Expose a method as a tool.

- `method_name` (Symbol, required) -- the instance method to expose.
- `tool_name:` (String, optional) -- override the default `<toolset_name>_<method_name>` naming.
- `description:` (String, optional) -- description shown to the LLM. Defaults to a humanized version of the method name.

```ruby
tool :word_count, tool_name: "text_wc", description: "Count lines, words, and characters"
# Produces a tool named "text_wc" instead of "text_word_count"
```

### Converting to a Tool Array

Call `to_tools` on the class (not an instance) to get an array of `ToolProxy` objects compatible with `DSPy::Tools::Base`:

```ruby
agent = DSPy::ReAct.new(
  AnalyzeText,
  tools: DatabaseToolset.to_tools
)
```

Each `ToolProxy` wraps one method, delegates `call` to the underlying toolset instance, and generates its own JSON schema from the method's Sorbet signature.
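
Conceptually, each proxy looks something like the following. This is a hypothetical sketch (`SketchToolProxy`, `Calc` are illustrative names), not the real `ToolProxy` class:

```ruby
# Hypothetical sketch of a tool proxy: one method, one shared instance.
class SketchToolProxy
  attr_reader :name, :description

  def initialize(instance, method_name, name:, description:)
    @instance = instance
    @method_name = method_name
    @name = name
    @description = description
  end

  # Delegates to the wrapped instance with keyword arguments.
  def call(**kwargs)
    @instance.public_send(@method_name, **kwargs)
  end
end

class Calc
  def add(a:, b:)
    a + b
  end
end

shared = Calc.new
add_tool = SketchToolProxy.new(shared, :add, name: "calc_add", description: "Add two numbers")
add_tool.call(a: 2, b: 3) # => 5
```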

### Shared State

All tool proxies from a single `to_tools` call share one toolset instance. Store shared state (connections, caches, configuration) in the toolset's `initialize`:

```ruby
class ApiToolset < DSPy::Tools::Toolset
  extend T::Sig

  toolset_name "api"

  tool :get, description: "Make a GET request"
  tool :post, description: "Make a POST request"

  sig { params(base_url: String).void }
  def initialize(base_url:)
    @base_url = base_url
    @client = HTTP.persistent(base_url)
  end

  sig { params(path: String).returns(String) }
  def get(path:)
    @client.get("#{@base_url}#{path}").body.to_s
  end

  sig { params(path: String, body: String).returns(String) }
  def post(path:, body:)
    @client.post("#{@base_url}#{path}", body: body).body.to_s
  end
end
```

---

## Type Safety

Sorbet signatures on tool methods drive both JSON schema generation and automatic type coercion of LLM responses.

### Basic Types

```ruby
sig { params(
  text: String,
  count: Integer,
  score: Float,
  enabled: T::Boolean,
  threshold: Numeric
).returns(String) }
def analyze(text:, count:, score:, enabled:, threshold:)
  # ...
end
```

| Sorbet Type | JSON Schema |
|------------------|----------------------------------------------------|
| `String` | `{"type": "string"}` |
| `Integer` | `{"type": "integer"}` |
| `Float` | `{"type": "number"}` |
| `Numeric` | `{"type": "number"}` |
| `T::Boolean` | `{"type": "boolean"}` |
| `T::Enum` | `{"type": "string", "enum": [...]}` |
| `T::Struct` | `{"type": "object", "properties": {...}}` |
| `T::Array[Type]` | `{"type": "array", "items": {...}}` |
| `T::Hash[K, V]` | `{"type": "object", "additionalProperties": {...}}` |
| `T.nilable(Type)`| `{"type": [original, "null"]}` |
| `T.any(T1, T2)` | `{"oneOf": [{...}, {...}]}` |
| `T.class_of(X)` | `{"type": "string"}` |

### T::Enum Parameters

Define a `T::Enum` and reference it in a tool signature. DSPy.rb generates a JSON Schema `enum` constraint and automatically deserializes the LLM's string response into the correct enum instance.

```ruby
class Priority < T::Enum
  enums do
    Low = new('low')
    Medium = new('medium')
    High = new('high')
    Critical = new('critical')
  end
end

class Status < T::Enum
  enums do
    Pending = new('pending')
    InProgress = new('in-progress')
    Completed = new('completed')
  end
end

sig { params(priority: Priority, status: Status).returns(String) }
def update_task(priority:, status:)
  "Updated to #{priority.serialize} / #{status.serialize}"
end
```

The generated schema constrains the parameter to valid values:

```json
{
  "priority": {
    "type": "string",
    "enum": ["low", "medium", "high", "critical"]
  }
}
```

**Case-insensitive matching**: When the LLM returns `"HIGH"` or `"High"` instead of `"high"`, DSPy.rb first tries an exact `try_deserialize`, then falls back to a case-insensitive lookup. This prevents failures caused by LLM casing variations.
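
The two-step lookup can be illustrated without sorbet-runtime; this is an approximation of the described behavior, not DSPy.rb's code (`deserialize_priority` is a hypothetical helper):

```ruby
# Approximate the exact-then-case-insensitive enum value lookup.
PRIORITY_VALUES = ['low', 'medium', 'high', 'critical'].freeze

def deserialize_priority(raw)
  return raw if PRIORITY_VALUES.include?(raw)            # exact match first
  match = PRIORITY_VALUES.find { |v| v.casecmp?(raw) }   # case-insensitive fallback
  match or raise ArgumentError, "unknown priority: #{raw.inspect}"
end

deserialize_priority('high') # => "high"
deserialize_priority('HIGH') # => "high"
deserialize_priority('High') # => "high"
```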

### T::Struct Parameters

Use `T::Struct` for complex nested objects. DSPy.rb generates nested JSON Schema properties and recursively coerces the LLM's hash response into struct instances.

```ruby
class TaskMetadata < T::Struct
  prop :id, String
  prop :priority, Priority
  prop :tags, T::Array[String]
  prop :estimated_hours, T.nilable(Float), default: nil
end

class TaskRequest < T::Struct
  prop :title, String
  prop :description, String
  prop :status, Status
  prop :metadata, TaskMetadata
  prop :assignees, T::Array[String]
end

sig { params(task: TaskRequest).returns(String) }
def create_task(task:)
  "Created: #{task.title} (#{task.status.serialize})"
end
```

The LLM sees the full nested object schema and DSPy.rb reconstructs the struct tree from the JSON response, including enum fields inside nested structs.

### Nilable Parameters

Mark optional parameters with `T.nilable(...)` and provide a default value of `nil` in the method signature. These parameters are excluded from the JSON Schema `required` array.

```ruby
sig { params(
  query: String,
  max_results: T.nilable(Integer),
  filter: T.nilable(String)
).returns(String) }
def search(query:, max_results: nil, filter: nil)
  # query is required; max_results and filter are optional
end
```
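
The effect on the `required` array can be shown with a small sketch. The field list below is illustrative, not output generated by DSPy.rb:

```ruby
# Sketch: nilable parameters are dropped from the JSON Schema required array.
fields = {
  query:       { type: 'string',  nilable: false },
  max_results: { type: 'integer', nilable: true },
  filter:      { type: 'string',  nilable: true }
}

required = fields.reject { |_name, spec| spec[:nilable] }.keys.map(&:to_s)
required # => ["query"]
```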

### Collections

Typed arrays and hashes generate precise item/value schemas:

```ruby
sig { params(
  tags: T::Array[String],
  priorities: T::Array[Priority],
  config: T::Hash[String, T.any(String, Integer, Float)]
).returns(String) }
def configure(tags:, priorities:, config:)
  # Array elements and hash values are validated and coerced
end
```

### Union Types

`T.any(...)` generates a `oneOf` JSON Schema. When one of the union members is a `T::Struct`, DSPy.rb uses the `_type` discriminator field to select the correct struct class during coercion.

```ruby
sig { params(value: T.any(String, Integer, Float)).returns(String) }
def handle_flexible(value:)
  # Accepts multiple types
end
```

---

## Built-in Toolsets

### TextProcessingToolset

`DSPy::Tools::TextProcessingToolset` provides Unix-style text analysis and manipulation operations. Toolset name prefix: `text`.

| Tool Name | Method | Description |
|-----------------------|------------------|--------------------------------------------|
| `text_grep` | `grep` | Search for patterns with optional case-insensitive and count-only modes |
| `text_wc` | `word_count` | Count lines, words, and characters |
| `text_rg` | `ripgrep` | Fast pattern search with context lines |
| `text_extract_lines` | `extract_lines` | Extract a range of lines by number |
| `text_filter_lines` | `filter_lines` | Keep or reject lines matching a regex |
| `text_unique_lines` | `unique_lines` | Deduplicate lines, optionally preserving order |
| `text_sort_lines` | `sort_lines` | Sort lines alphabetically or numerically |
| `text_summarize_text` | `summarize_text` | Produce a statistical summary (counts, averages, frequent words) |

Usage:

```ruby
agent = DSPy::ReAct.new(
  AnalyzeText,
  tools: DSPy::Tools::TextProcessingToolset.to_tools
)

result = agent.call(text: log_contents, question: "How many error lines are there?")
puts result.answer
```

### GitHubCLIToolset

`DSPy::Tools::GitHubCLIToolset` wraps the `gh` CLI for read-oriented GitHub operations. Toolset name prefix: `github`.

| Tool Name | Method | Description |
|-------------------------|------------------|----------------------------------------------------|
| `github_list_issues` | `list_issues` | List issues filtered by state, labels, assignee |
| `github_list_prs` | `list_prs` | List pull requests filtered by state, author, base |
| `github_get_issue` | `get_issue` | Retrieve details of a single issue |
| `github_get_pr` | `get_pr` | Retrieve details of a single pull request |
| `github_api_request` | `api_request` | Make an arbitrary GET request to the GitHub API |
| `github_traffic_views` | `traffic_views` | Fetch repository traffic view counts |
| `github_traffic_clones` | `traffic_clones` | Fetch repository traffic clone counts |

This toolset uses `T::Enum` parameters (`IssueState`, `PRState`, `ReviewState`) for state filters, demonstrating enum-based tool signatures in practice.

```ruby
agent = DSPy::ReAct.new(
  RepoAnalysis,
  tools: DSPy::Tools::GitHubCLIToolset.to_tools
)
```

---

## Testing

### Unit Testing Individual Tools

Test `DSPy::Tools::Base` subclasses by instantiating and calling `call` directly:

```ruby
RSpec.describe WeatherLookup do
  subject(:tool) { described_class.new }

  it "returns weather for a city" do
    result = tool.call(city: "Berlin")
    expect(result).to include("Berlin")
  end

  it "exposes the correct tool name" do
    expect(tool.name).to eq("weather_lookup")
  end

  it "generates a valid schema" do
    schema = described_class.call_schema_object
    expect(schema[:required]).to include("city")
    expect(schema[:properties]).to have_key(:city)
  end
end
```

### Unit Testing Toolsets

Test toolset methods directly on an instance. Verify tool generation with `to_tools`:

```ruby
RSpec.describe DatabaseToolset do
  subject(:toolset) { described_class.new }

  it "executes a query" do
    result = toolset.query(sql: "SELECT 1")
    expect(result).to be_a(String)
  end

  it "generates tools with correct names" do
    tools = described_class.to_tools
    names = tools.map(&:name)
    expect(names).to contain_exactly("db_query", "db_insert", "db_delete")
  end

  it "generates tool descriptions" do
    tools = described_class.to_tools
    query_tool = tools.find { |t| t.name == "db_query" }
    expect(query_tool.description).to eq("Run a read-only SQL query")
  end
end
```

### Mocking Predictions Inside Tools

When a tool calls a DSPy predictor internally, stub the predictor to isolate tool logic from LLM calls:

```ruby
class SmartSearchTool < DSPy::Tools::Base
  extend T::Sig

  tool_name "smart_search"
  tool_description "Search with query expansion"

  sig { void }
  def initialize
    @expander = DSPy::Predict.new(QueryExpansionSignature)
  end

  sig { params(query: String).returns(String) }
  def call(query:)
    expanded = @expander.call(query: query)
    perform_search(expanded.expanded_query)
  end

  private

  def perform_search(query)
    # actual search logic
  end
end

RSpec.describe SmartSearchTool do
  subject(:tool) { described_class.new }

  before do
    expansion_result = double("result", expanded_query: "expanded test query")
    allow_any_instance_of(DSPy::Predict).to receive(:call).and_return(expansion_result)
  end

  it "expands the query before searching" do
    allow(tool).to receive(:perform_search).with("expanded test query").and_return("found 3 results")
    result = tool.call(query: "test")
    expect(result).to eq("found 3 results")
  end
end
```

### Testing Enum Coercion

Verify that string values from LLM responses deserialize into the correct enum instances:

```ruby
RSpec.describe "enum coercion" do
  it "handles case-insensitive enum values" do
    toolset = GitHubCLIToolset.new
    # The LLM may return "OPEN" instead of "open"
    result = toolset.list_issues(state: IssueState::Open)
    expect(result).to be_a(String)
  end
end
```

---

## Constraints

- All exposed tool methods should use **keyword arguments**. Positional-only parameters generate schemas, but keyword arguments produce more reliable LLM interactions.
- Each exposed method becomes a **separate, independent tool**. Method chaining or multi-step sequences within a single tool call are not supported.
- Shared state across tool proxies is scoped to a single `to_tools` call. Separate `to_tools` invocations create separate toolset instances.
- Methods without a Sorbet `sig` produce an empty parameter schema. The LLM will not know what arguments to pass.