Merge upstream v2.34.0 with FastAPI pivot (v2.35.0)

Incorporate 42 upstream commits while preserving the Ruby/Rails → Python/FastAPI
pivot. Each of the 24 conflicting files was individually triaged.

Added: tiangolo-fastapi-reviewer, python-package-readme-writer, lint (Python),
pr-comments-to-todos, fastapi-style skill, python-package-writer skill.

Removed: 3 design agents, ankane-readme-writer, dhh-rails-reviewer,
kieran-rails-reviewer, andrew-kane-gem-writer, dhh-rails-style, dspy-ruby.

Merged: best-practices-researcher, kieran-python-reviewer, resolve_todo_parallel,
file-todos, workflows/review (pressure test), workflows/plan (reviewer names).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: John Lamb
Date: 2026-02-16 17:34:54 -06:00
Parent: 1a3e8e2b58
Commit: d306c49179
45 changed files with 1533 additions and 8548 deletions

@@ -1,184 +0,0 @@
---
name: andrew-kane-gem-writer
description: This skill should be used when writing Ruby gems following Andrew Kane's proven patterns and philosophy. It applies when creating new Ruby gems, refactoring existing gems, designing gem APIs, or when clean, minimal, production-ready Ruby library code is needed. Triggers on requests like "create a gem", "write a Ruby library", "design a gem API", or mentions of Andrew Kane's style.
---
# Andrew Kane Gem Writer
Write Ruby gems following Andrew Kane's battle-tested patterns from 100+ gems with 374M+ downloads (Searchkick, PgHero, Chartkick, Strong Migrations, Lockbox, Ahoy, Blazer, Groupdate, Neighbor, Blind Index).
## Core Philosophy
**Simplicity over cleverness.** Zero or minimal dependencies. Explicit code over metaprogramming. Rails integration without Rails coupling. Every pattern serves production use cases.
## Entry Point Structure
Every gem follows this exact pattern in `lib/gemname.rb`:
```ruby
# 1. Dependencies (stdlib preferred)
require "forwardable"

# 2. Internal modules
require_relative "gemname/model"
require_relative "gemname/version"

# 3. Conditional Rails (CRITICAL - never require Rails directly)
require_relative "gemname/railtie" if defined?(Rails)

# 4. Module with config and errors
module GemName
  class Error < StandardError; end
  class InvalidConfigError < Error; end

  class << self
    attr_accessor :timeout, :logger
    attr_writer :client
  end

  self.timeout = 10 # Defaults set immediately
end
```
## Class Macro DSL Pattern
The signature Kane pattern—single method call configures everything:
```ruby
# Usage
class Product < ApplicationRecord
  searchkick word_start: [:name]
end

# Implementation
module GemName
  module Model
    def gemname(**options)
      unknown = options.keys - KNOWN_KEYWORDS
      raise ArgumentError, "unknown keywords: #{unknown.join(", ")}" if unknown.any?

      mod = Module.new
      mod.module_eval do
        define_method :some_method do
          # implementation
        end unless method_defined?(:some_method)
      end
      include mod

      class_eval do
        cattr_reader :gemname_options, instance_reader: false
        class_variable_set :@@gemname_options, options.dup
      end
    end
  end
end
```
## Rails Integration
**Always use `ActiveSupport.on_load`—never require Rails gems directly:**
```ruby
# WRONG
require "active_record"
ActiveRecord::Base.include(MyGem::Model)

# CORRECT
ActiveSupport.on_load(:active_record) do
  extend GemName::Model
end

# Use prepend for behavior modification
ActiveSupport.on_load(:active_record) do
  ActiveRecord::Migration.prepend(GemName::Migration)
end
```
## Configuration Pattern
Use `class << self` with `attr_accessor`, not Configuration objects:
```ruby
module GemName
  class << self
    attr_accessor :timeout, :logger
    attr_writer :master_key
  end

  def self.master_key
    @master_key ||= ENV["GEMNAME_MASTER_KEY"]
  end

  self.timeout = 10
  self.logger = nil
end
```
## Error Handling
Simple hierarchy with informative messages:
```ruby
module GemName
  class Error < StandardError; end
  class ConfigError < Error; end
  class ValidationError < Error; end
end

# Validate early with ArgumentError
def initialize(key:)
  raise ArgumentError, "Key must be 32 bytes" unless key&.bytesize == 32
end
```
## Testing (Minitest Only)
```ruby
# test/test_helper.rb
require "bundler/setup"
Bundler.require(:default)
require "minitest/autorun"
require "minitest/pride"

# test/model_test.rb
class ModelTest < Minitest::Test
  def test_basic_functionality
    assert_equal expected, actual
  end
end
```
## Gemspec Pattern
Zero runtime dependencies when possible:
```ruby
Gem::Specification.new do |spec|
  spec.name = "gemname"
  spec.version = GemName::VERSION
  spec.required_ruby_version = ">= 3.1"
  spec.files = Dir["*.{md,txt}", "{lib}/**/*"]
  spec.require_path = "lib"
  # NO add_dependency lines - dev deps go in Gemfile
end
```
## Anti-Patterns to Avoid
- `method_missing` (use `define_method` instead)
- Configuration objects (use class accessors)
- `@@class_variables` (use `class << self`)
- Requiring Rails gems directly
- Many runtime dependencies
- Committing Gemfile.lock in gems
- RSpec (use Minitest)
- Heavy DSLs (prefer explicit Ruby)
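To make the first anti-pattern concrete, here is a minimal hypothetical sketch (not from any Kane gem) showing `define_method` in place of `method_missing`: each predicate is defined up front, so it appears in `#methods`, `respond_to?` works without extra code, and typos fail fast.

```ruby
class Card
  STATUSES = [:open, :closed, :archived]

  attr_reader :status

  def initialize(status)
    @status = status
  end

  # Define one predicate per status at class-definition time,
  # instead of intercepting unknown calls with method_missing.
  STATUSES.each do |s|
    define_method(:"#{s}?") { status == s }
  end
end
```

Calling `Card.new(:open).open?` returns `true`, and an unknown method like `#golden?` raises `NoMethodError` immediately rather than silently falling through a `method_missing` hook.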
## Reference Files
For deeper patterns, see:
- **[references/module-organization.md](references/module-organization.md)** - Directory layouts, method decomposition
- **[references/rails-integration.md](references/rails-integration.md)** - Railtie, Engine, on_load patterns
- **[references/database-adapters.md](references/database-adapters.md)** - Multi-database support patterns
- **[references/testing-patterns.md](references/testing-patterns.md)** - Multi-version testing, CI setup
- **[references/resources.md](references/resources.md)** - Links to Kane's repos and articles

@@ -1,231 +0,0 @@
# Database Adapter Patterns
## Abstract Base Class Pattern
```ruby
# lib/strong_migrations/adapters/abstract_adapter.rb
module StrongMigrations
  module Adapters
    class AbstractAdapter
      def initialize(checker)
        @checker = checker
      end

      def min_version
        nil
      end

      def set_statement_timeout(timeout)
        # no-op by default
      end

      def check_lock_timeout
        # no-op by default
      end

      private

      def connection
        @checker.send(:connection)
      end

      def quote(value)
        connection.quote(value)
      end
    end
  end
end
```
## PostgreSQL Adapter
```ruby
# lib/strong_migrations/adapters/postgresql_adapter.rb
module StrongMigrations
  module Adapters
    class PostgreSQLAdapter < AbstractAdapter
      def min_version
        "12"
      end

      def set_statement_timeout(timeout)
        select_all("SET statement_timeout = #{timeout.to_i * 1000}")
      end

      def set_lock_timeout(timeout)
        select_all("SET lock_timeout = #{timeout.to_i * 1000}")
      end

      def check_lock_timeout
        lock_timeout = connection.select_value("SHOW lock_timeout")
        lock_timeout_sec = timeout_to_sec(lock_timeout)
        # validation logic
      end

      private

      def select_all(sql)
        connection.select_all(sql)
      end

      def timeout_to_sec(timeout)
        units = {"us" => 1e-6, "ms" => 1e-3, "s" => 1, "min" => 60}
        timeout.to_f * (units[timeout.gsub(/\d+/, "")] || 1e-3)
      end
    end
  end
end
```
## MySQL Adapter
```ruby
# lib/strong_migrations/adapters/mysql_adapter.rb
module StrongMigrations
  module Adapters
    class MySQLAdapter < AbstractAdapter
      def min_version
        "8.0"
      end

      def set_statement_timeout(timeout)
        select_all("SET max_execution_time = #{timeout.to_i * 1000}")
      end

      def check_lock_timeout
        lock_timeout = connection.select_value("SELECT @@lock_wait_timeout")
        # validation logic
      end
    end
  end
end
```
## MariaDB Adapter (MySQL variant)
```ruby
# lib/strong_migrations/adapters/mariadb_adapter.rb
module StrongMigrations
  module Adapters
    class MariaDBAdapter < MySQLAdapter
      def min_version
        "10.5"
      end

      # Override MySQL-specific behavior
      def set_statement_timeout(timeout)
        select_all("SET max_statement_time = #{timeout.to_i}")
      end
    end
  end
end
```
## Adapter Detection Pattern
Use regex matching on adapter name:
```ruby
def adapter
  @adapter ||= case connection.adapter_name
  when /postg/i
    Adapters::PostgreSQLAdapter.new(self)
  when /mysql|trilogy/i
    if connection.try(:mariadb?)
      Adapters::MariaDBAdapter.new(self)
    else
      Adapters::MySQLAdapter.new(self)
    end
  when /sqlite/i
    Adapters::SQLiteAdapter.new(self)
  else
    Adapters::AbstractAdapter.new(self)
  end
end
```
## Multi-Database Support (PgHero pattern)
```ruby
module PgHero
  class << self
    attr_accessor :databases
  end
  self.databases = {}

  def self.primary_database
    databases.values.first
  end

  def self.capture_query_stats(database: nil)
    db = database ? databases[database] : primary_database
    db.capture_query_stats
  end

  class Database
    attr_reader :id, :config

    def initialize(id, config)
      @id = id
      @config = config
    end

    def connection_model
      @connection_model ||= begin
        Class.new(ActiveRecord::Base) do
          self.abstract_class = true
        end.tap do |model|
          model.establish_connection(config)
        end
      end
    end

    def connection
      connection_model.connection
    end
  end
end
```
## Connection Switching
```ruby
def with_connection(database_name)
  db = databases[database_name.to_s]
  raise Error, "Unknown database: #{database_name}" unless db

  yield db.connection
end

# Usage
PgHero.with_connection(:replica) do |conn|
  conn.execute("SELECT * FROM users")
end
```
## SQL Dialect Handling
```ruby
def quote_column(column)
  case adapter_name
  when /postg/i
    %("#{column}")
  when /mysql/i
    "`#{column}`"
  else
    column
  end
end

def boolean_value(value)
  case adapter_name
  when /postg/i
    value ? "true" : "false"
  when /mysql/i
    value ? "1" : "0"
  else
    value.to_s
  end
end
```

@@ -1,121 +0,0 @@
# Module Organization Patterns
## Simple Gem Layout
```
lib/
├── gemname.rb        # Entry point, config, errors
└── gemname/
    ├── helper.rb     # Core functionality
    ├── engine.rb     # Rails engine (if needed)
    └── version.rb    # VERSION constant only
```
## Complex Gem Layout (PgHero pattern)
```
lib/
├── pghero.rb
└── pghero/
    ├── database.rb   # Main class
    ├── engine.rb     # Rails engine
    └── methods/      # Functional decomposition
        ├── basic.rb
        ├── connections.rb
        ├── indexes.rb
        ├── queries.rb
        └── replication.rb
```
## Method Decomposition Pattern
Break large classes into includable modules by feature:
```ruby
# lib/pghero/database.rb
module PgHero
  class Database
    include Methods::Basic
    include Methods::Connections
    include Methods::Indexes
    include Methods::Queries
  end
end

# lib/pghero/methods/indexes.rb
module PgHero
  module Methods
    module Indexes
      def index_hit_rate
        # implementation
      end

      def unused_indexes
        # implementation
      end
    end
  end
end
```
## Version File Pattern
Keep version.rb minimal:
```ruby
# lib/gemname/version.rb
module GemName
  VERSION = "2.0.0"
end
```
## Require Order in Entry Point
```ruby
# lib/searchkick.rb
# 1. Standard library
require "forwardable"
require "json"
# 2. External dependencies (minimal)
require "active_support"
# 3. Internal files via require_relative
require_relative "searchkick/index"
require_relative "searchkick/model"
require_relative "searchkick/query"
require_relative "searchkick/version"
# 4. Conditional Rails loading (LAST)
require_relative "searchkick/railtie" if defined?(Rails)
```
## Autoload vs Require
Kane uses explicit `require_relative`, not autoload:
```ruby
# CORRECT
require_relative "gemname/model"
require_relative "gemname/query"
# AVOID
autoload :Model, "gemname/model"
autoload :Query, "gemname/query"
```
## Comments Style
Minimal section headers only:
```ruby
# dependencies
require "active_support"
# adapters
require_relative "adapters/postgresql_adapter"
# modules
require_relative "migration"
```

@@ -1,183 +0,0 @@
# Rails Integration Patterns
## The Golden Rule
**Never require Rails gems directly.** This causes loading order issues.
```ruby
# WRONG - causes premature loading
require "active_record"
ActiveRecord::Base.include(MyGem::Model)

# CORRECT - lazy loading
ActiveSupport.on_load(:active_record) do
  extend MyGem::Model
end
```
## ActiveSupport.on_load Hooks
Common hooks and their uses:
```ruby
# Models
ActiveSupport.on_load(:active_record) do
  extend GemName::Model       # Add class methods (searchkick, has_encrypted)
  include GemName::Callbacks  # Add instance methods
end

# Controllers
ActiveSupport.on_load(:action_controller) do
  include Ahoy::Controller
end

# Jobs
ActiveSupport.on_load(:active_job) do
  include GemName::JobExtensions
end

# Mailers
ActiveSupport.on_load(:action_mailer) do
  include GemName::MailerExtensions
end
```
## Prepend for Behavior Modification
When overriding existing Rails methods:
```ruby
ActiveSupport.on_load(:active_record) do
  ActiveRecord::Migration.prepend(StrongMigrations::Migration)
  ActiveRecord::Migrator.prepend(StrongMigrations::Migrator)
end
```
## Railtie Pattern
Minimal Railtie for non-mountable gems:
```ruby
# lib/gemname/railtie.rb
module GemName
  class Railtie < Rails::Railtie
    initializer "gemname.configure" do
      ActiveSupport.on_load(:active_record) do
        extend GemName::Model
      end
    end

    # Optional: Add to controller runtime logging
    initializer "gemname.log_runtime" do
      require_relative "controller_runtime"
      ActiveSupport.on_load(:action_controller) do
        include GemName::ControllerRuntime
      end
    end

    # Optional: Rake tasks
    rake_tasks do
      load "tasks/gemname.rake"
    end
  end
end
```
## Engine Pattern (Mountable Gems)
For gems with web interfaces (PgHero, Blazer, Ahoy):
```ruby
# lib/pghero/engine.rb
module PgHero
  class Engine < ::Rails::Engine
    isolate_namespace PgHero

    initializer "pghero.assets", group: :all do |app|
      if app.config.respond_to?(:assets) && defined?(Sprockets)
        app.config.assets.precompile << "pghero/application.js"
        app.config.assets.precompile << "pghero/application.css"
      end
    end

    initializer "pghero.config" do
      PgHero.config = Rails.application.config_for(:pghero) rescue {}
    end
  end
end
```
## Routes for Engines
```ruby
# config/routes.rb (in engine)
PgHero::Engine.routes.draw do
  root to: "home#index"
  resources :databases, only: [:show]
end
```
Mount in app:
```ruby
# config/routes.rb (in app)
mount PgHero::Engine, at: "pghero"
```
## YAML Configuration with ERB
For complex gems needing config files:
```ruby
def self.settings
  @settings ||= begin
    path = Rails.root.join("config", "blazer.yml")
    if path.exist?
      YAML.safe_load(ERB.new(File.read(path)).result, aliases: true)
    else
      {}
    end
  end
end
```
## Generator Pattern
```ruby
# lib/generators/gemname/install_generator.rb
module GemName
  module Generators
    class InstallGenerator < Rails::Generators::Base
      source_root File.expand_path("templates", __dir__)

      def copy_initializer
        template "initializer.rb", "config/initializers/gemname.rb"
      end

      def copy_migration
        migration_template "migration.rb", "db/migrate/create_gemname_tables.rb"
      end
    end
  end
end
```
## Conditional Feature Detection
```ruby
# Check for specific Rails versions
if ActiveRecord.version >= Gem::Version.new("7.0")
  # Rails 7+ specific code
end

# Check for optional dependencies
def self.client
  @client ||= if defined?(OpenSearch::Client)
    OpenSearch::Client.new
  elsif defined?(Elasticsearch::Client)
    Elasticsearch::Client.new
  else
    raise Error, "Install elasticsearch or opensearch-ruby"
  end
end
```

@@ -1,119 +0,0 @@
# Andrew Kane Resources
## Primary Documentation
- **Gem Patterns Article**: https://ankane.org/gem-patterns
- Kane's own documentation of patterns used across his gems
- Covers configuration, Rails integration, error handling
## Top Ruby Gems by Stars
### Search & Data
| Gem | Stars | Description | Source |
|-----|-------|-------------|--------|
| **Searchkick** | 6.6k+ | Intelligent search for Rails | https://github.com/ankane/searchkick |
| **Chartkick** | 6.4k+ | Beautiful charts in Ruby | https://github.com/ankane/chartkick |
| **Groupdate** | 3.8k+ | Group by day, week, month | https://github.com/ankane/groupdate |
| **Blazer** | 4.6k+ | SQL dashboard for Rails | https://github.com/ankane/blazer |
### Database & Migrations
| Gem | Stars | Description | Source |
|-----|-------|-------------|--------|
| **PgHero** | 8.2k+ | PostgreSQL insights | https://github.com/ankane/pghero |
| **Strong Migrations** | 4.1k+ | Safe migration checks | https://github.com/ankane/strong_migrations |
| **Dexter** | 1.8k+ | Auto index advisor | https://github.com/ankane/dexter |
| **PgSync** | 1.5k+ | Sync Postgres data | https://github.com/ankane/pgsync |
### Security & Encryption
| Gem | Stars | Description | Source |
|-----|-------|-------------|--------|
| **Lockbox** | 1.5k+ | Application-level encryption | https://github.com/ankane/lockbox |
| **Blind Index** | 1.0k+ | Encrypted search | https://github.com/ankane/blind_index |
| **Secure Headers** | — | Contributed patterns | Referenced in gems |
### Analytics & ML
| Gem | Stars | Description | Source |
|-----|-------|-------------|--------|
| **Ahoy** | 4.2k+ | Analytics for Rails | https://github.com/ankane/ahoy |
| **Neighbor** | 1.1k+ | Vector search for Rails | https://github.com/ankane/neighbor |
| **Rover** | 700+ | DataFrames for Ruby | https://github.com/ankane/rover |
| **Tomoto** | 200+ | Topic modeling | https://github.com/ankane/tomoto-ruby |
### Utilities
| Gem | Stars | Description | Source |
|-----|-------|-------------|--------|
| **Pretender** | 2.0k+ | Login as another user | https://github.com/ankane/pretender |
| **Authtrail** | 900+ | Login activity tracking | https://github.com/ankane/authtrail |
| **Notable** | 200+ | Track notable requests | https://github.com/ankane/notable |
| **Logstop** | 200+ | Filter sensitive logs | https://github.com/ankane/logstop |
## Key Source Files to Study
### Entry Point Patterns
- https://github.com/ankane/searchkick/blob/master/lib/searchkick.rb
- https://github.com/ankane/pghero/blob/master/lib/pghero.rb
- https://github.com/ankane/strong_migrations/blob/master/lib/strong_migrations.rb
- https://github.com/ankane/lockbox/blob/master/lib/lockbox.rb
### Class Macro Implementations
- https://github.com/ankane/searchkick/blob/master/lib/searchkick/model.rb
- https://github.com/ankane/lockbox/blob/master/lib/lockbox/model.rb
- https://github.com/ankane/neighbor/blob/master/lib/neighbor/model.rb
- https://github.com/ankane/blind_index/blob/master/lib/blind_index/model.rb
### Rails Integration (Railtie/Engine)
- https://github.com/ankane/pghero/blob/master/lib/pghero/engine.rb
- https://github.com/ankane/searchkick/blob/master/lib/searchkick/railtie.rb
- https://github.com/ankane/ahoy/blob/master/lib/ahoy/engine.rb
- https://github.com/ankane/blazer/blob/master/lib/blazer/engine.rb
### Database Adapters
- https://github.com/ankane/strong_migrations/tree/master/lib/strong_migrations/adapters
- https://github.com/ankane/groupdate/tree/master/lib/groupdate/adapters
- https://github.com/ankane/neighbor/tree/master/lib/neighbor
### Error Messages (Template Pattern)
- https://github.com/ankane/strong_migrations/blob/master/lib/strong_migrations/error_messages.rb
### Gemspec Examples
- https://github.com/ankane/searchkick/blob/master/searchkick.gemspec
- https://github.com/ankane/neighbor/blob/master/neighbor.gemspec
- https://github.com/ankane/ahoy/blob/master/ahoy_matey.gemspec
### Test Setups
- https://github.com/ankane/searchkick/tree/master/test
- https://github.com/ankane/lockbox/tree/master/test
- https://github.com/ankane/strong_migrations/tree/master/test
## GitHub Profile
- **Profile**: https://github.com/ankane
- **All Ruby Repos**: https://github.com/ankane?tab=repositories&q=&type=&language=ruby&sort=stargazers
- **RubyGems Profile**: https://rubygems.org/profiles/ankane
## Blog Posts & Articles
- **ankane.org**: https://ankane.org/
- **Gem Patterns**: https://ankane.org/gem-patterns (essential reading)
- **Postgres Performance**: https://ankane.org/introducing-pghero
- **Search Tips**: https://ankane.org/search-rails
## Design Philosophy Summary
Principles that stay consistent across Kane's 100+ gems:
1. **Zero dependencies when possible** - Each dep is a maintenance burden
2. **ActiveSupport.on_load always** - Never require Rails gems directly
3. **Class macro DSLs** - Single method configures everything
4. **Explicit over magic** - No method_missing, define methods directly
5. **Minitest only** - Simple, sufficient, no RSpec
6. **Multi-version testing** - Support broad Rails/Ruby versions
7. **Helpful errors** - Template-based messages with fix suggestions
8. **Abstract adapters** - Clean multi-database support
9. **Engine isolation** - isolate_namespace for mountable gems
10. **Minimal documentation** - Code is self-documenting, README is examples
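Principle 7 can be sketched in a few lines. This is a hypothetical illustration of the template-based error message idea, not code from a Kane gem: messages live in one hash, are formatted with the failing values, and always end with a suggested fix.

```ruby
module GemName
  class Error < StandardError; end

  # One place for all user-facing messages; %{...} slots are filled at raise time.
  ERROR_MESSAGES = {
    short_key: "Key must be %{expected} bytes, got %{actual}. " \
               "Generate one with: SecureRandom.hex(%{expected})"
  }

  # Collect the values as keywords, then interpolate them into the template.
  def self.raise_error(key, **values)
    raise Error, format(ERROR_MESSAGES.fetch(key), values)
  end
end
```

For example, `GemName.raise_error(:short_key, expected: 32, actual: 16)` raises an error whose message both states the problem and names the command that fixes it.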

@@ -1,261 +0,0 @@
# Testing Patterns
## Minitest Setup
Kane exclusively uses Minitest—never RSpec.
```ruby
# test/test_helper.rb
require "bundler/setup"
Bundler.require(:default)
require "minitest/autorun"
require "minitest/pride"

# Load the gem
require "gemname"

# Test database setup (if needed)
ActiveRecord::Base.establish_connection(
  adapter: "postgresql",
  database: "gemname_test"
)

# Base test class
class Minitest::Test
  def setup
    # Reset state before each test
  end
end
```
## Test File Structure
```ruby
# test/model_test.rb
require_relative "test_helper"

class ModelTest < Minitest::Test
  def setup
    User.delete_all
  end

  def test_basic_functionality
    user = User.create!(email: "test@example.org")
    assert_equal "test@example.org", user.email
  end

  def test_with_invalid_input
    error = assert_raises(ArgumentError) do
      User.create!(email: nil)
    end
    assert_match(/email/, error.message)
  end

  def test_class_method
    result = User.search("test")
    assert_kind_of Array, result
  end
end
```
## Multi-Version Testing
Test against multiple Rails/Ruby versions using gemfiles:
```
test/
├── test_helper.rb
└── gemfiles/
    ├── activerecord70.gemfile
    ├── activerecord71.gemfile
    └── activerecord72.gemfile
```
```ruby
# test/gemfiles/activerecord70.gemfile
source "https://rubygems.org"
gemspec path: "../../"
gem "activerecord", "~> 7.0.0"
gem "sqlite3"
```
```ruby
# test/gemfiles/activerecord72.gemfile
source "https://rubygems.org"
gemspec path: "../../"
gem "activerecord", "~> 7.2.0"
gem "sqlite3"
```
Run with specific gemfile:
```bash
BUNDLE_GEMFILE=test/gemfiles/activerecord70.gemfile bundle install
BUNDLE_GEMFILE=test/gemfiles/activerecord70.gemfile bundle exec rake test
```
## Rakefile
```ruby
# Rakefile
require "bundler/gem_tasks"
require "rake/testtask"
Rake::TestTask.new(:test) do |t|
  t.libs << "test"
  t.pattern = "test/**/*_test.rb"
end
task default: :test
```
## GitHub Actions CI
```yaml
# .github/workflows/build.yml
name: build
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - ruby: "3.2"
            gemfile: activerecord70
          - ruby: "3.3"
            gemfile: activerecord71
          - ruby: "3.3"
            gemfile: activerecord72
    env:
      BUNDLE_GEMFILE: test/gemfiles/${{ matrix.gemfile }}.gemfile
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: ${{ matrix.ruby }}
          bundler-cache: true
      - run: bundle exec rake test
```
## Database-Specific Testing
```yaml
# .github/workflows/build.yml (with services)
services:
  postgres:
    image: postgres:15
    env:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - 5432:5432
    options: >-
      --health-cmd pg_isready
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5
env:
  DATABASE_URL: postgres://postgres:postgres@localhost/gemname_test
```
## Test Database Setup
```ruby
# test/test_helper.rb
require "active_record"

# Connect to database
ActiveRecord::Base.establish_connection(
  ENV["DATABASE_URL"] || {
    adapter: "postgresql",
    database: "gemname_test"
  }
)

# Create tables
ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string :email
    t.text :encrypted_data
    t.timestamps
  end
end

# Define models
class User < ActiveRecord::Base
  gemname_feature :email
end
```
## Assertion Patterns
```ruby
# Basic assertions
assert result
assert_equal expected, actual
assert_nil value
assert_empty array
# Exception testing
assert_raises(ArgumentError) { bad_code }
error = assert_raises(GemName::Error) do
risky_operation
end
assert_match(/expected message/, error.message)
# Refutations
refute condition
refute_equal unexpected, actual
refute_nil value
```
## Test Helpers
```ruby
# test/test_helper.rb
class Minitest::Test
  def with_options(options)
    original = GemName.options.dup
    GemName.options.merge!(options)
    yield
  ensure
    GemName.options = original
  end

  def assert_queries(expected_count)
    queries = []
    callback = ->(*, payload) { queries << payload[:sql] }
    ActiveSupport::Notifications.subscribe("sql.active_record", callback)
    yield
    assert_equal expected_count, queries.size, "Expected #{expected_count} queries, got #{queries.size}"
  ensure
    ActiveSupport::Notifications.unsubscribe(callback)
  end
end
end
```
## Skipping Tests
```ruby
def test_postgresql_specific
  skip "PostgreSQL only" unless postgresql?
  # test code
end

def postgresql?
  ActiveRecord::Base.connection.adapter_name =~ /postg/i
end
```

@@ -1,185 +0,0 @@
---
name: dhh-rails-style
description: This skill should be used when writing Ruby and Rails code in DHH's distinctive 37signals style. It applies when writing Ruby code, Rails applications, creating models, controllers, or any Ruby file. Triggers on Ruby/Rails code generation, refactoring requests, code review, or when the user mentions DHH, 37signals, Basecamp, HEY, or Campfire style. Embodies REST purity, fat models, thin controllers, Current attributes, Hotwire patterns, and the "clarity over cleverness" philosophy.
---
<objective>
Apply 37signals/DHH Rails conventions to Ruby and Rails code. This skill provides comprehensive domain expertise extracted from analyzing production 37signals codebases (Fizzy/Campfire) and DHH's code review patterns.
</objective>
<essential_principles>
## Core Philosophy
"The best code is the code you don't write. The second best is the code that's obviously correct."
**Vanilla Rails is plenty:**
- Rich domain models over service objects
- CRUD controllers over custom actions
- Concerns for horizontal code sharing
- Records as state instead of boolean columns
- Database-backed everything (no Redis)
- Build solutions before reaching for gems
**What they deliberately avoid:**
- devise (custom ~150-line auth instead)
- pundit/cancancan (simple role checks in models)
- sidekiq (Solid Queue uses database)
- redis (database for everything)
- view_component (partials work fine)
- GraphQL (REST with Turbo sufficient)
- factory_bot (fixtures are simpler)
- rspec (Minitest ships with Rails)
- Tailwind (native CSS with layers)
**Development Philosophy:**
- Ship, Validate, Refine - push prototype-quality code to production, then learn from it
- Fix root causes, not symptoms
- Write-time operations over read-time computations
- Database constraints over ActiveRecord validations
</essential_principles>
<intake>
What are you working on?
1. **Controllers** - REST mapping, concerns, Turbo responses, API patterns
2. **Models** - Concerns, state records, callbacks, scopes, POROs
3. **Views & Frontend** - Turbo, Stimulus, CSS, partials
4. **Architecture** - Routing, multi-tenancy, authentication, jobs, caching
5. **Testing** - Minitest, fixtures, integration tests
6. **Gems & Dependencies** - What to use vs avoid
7. **Code Review** - Review code against DHH style
8. **General Guidance** - Philosophy and conventions
**Specify a number or describe your task.**
</intake>
<routing>
| Response | Reference to Read |
|----------|-------------------|
| 1, controller | [controllers.md](./references/controllers.md) |
| 2, model | [models.md](./references/models.md) |
| 3, view, frontend, turbo, stimulus, css | [frontend.md](./references/frontend.md) |
| 4, architecture, routing, auth, job, cache | [architecture.md](./references/architecture.md) |
| 5, test, testing, minitest, fixture | [testing.md](./references/testing.md) |
| 6, gem, dependency, library | [gems.md](./references/gems.md) |
| 7, review | Read all references, then review code |
| 8, general task | Read relevant references based on context |
**After reading relevant references, apply patterns to the user's code.**
</routing>
<quick_reference>
## Naming Conventions
**Verbs:** `card.close`, `card.gild`, `board.publish` (not `set_style` methods)
**Predicates:** `card.closed?`, `card.golden?` (derived from presence of related record)
**Concerns:** Adjectives describing capability (`Closeable`, `Publishable`, `Watchable`)
**Controllers:** Nouns matching resources (`Cards::ClosuresController`)
**Scopes:**
- `chronologically`, `reverse_chronologically`, `alphabetically`, `latest`
- `preloaded` (standard eager loading name)
- `indexed_by`, `sorted_by` (parameterized)
- `active`, `unassigned` (business terms, not SQL-ish)
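A hypothetical model sketch of those scope names (the scope bodies are assumptions for illustration, not taken from a 37signals codebase):

```ruby
class Card < ApplicationRecord
  scope :chronologically,         -> { order(:created_at) }
  scope :reverse_chronologically, -> { order(created_at: :desc) }
  scope :latest,                  -> { reverse_chronologically.limit(1) }
  scope :preloaded,               -> { includes(:creator, :comments) }
  scope :sorted_by,               ->(column) { order(column) }
  scope :active,                  -> { where.missing(:closure) }
end
```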
## REST Mapping
Instead of custom actions, create new resources:
```
POST /cards/:id/close → POST /cards/:id/closure
DELETE /cards/:id/close → DELETE /cards/:id/closure
POST /cards/:id/archive → POST /cards/:id/archival
```
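In `config/routes.rb` terms, each mapped endpoint is just a nested singular resource. A sketch, with resource names taken from the mapping above and the `only:` restrictions assumed:

```ruby
resources :cards do
  resource :closure, only: [:create, :destroy]  # close / reopen a card
  resource :archival, only: [:create]           # archive a card
end
```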
## Ruby Syntax Preferences
```ruby
# Symbol arrays with spaces inside brackets
before_action :set_message, only: %i[ show edit update destroy ]

# Private method indentation
private
  def set_message
    @message = Message.find(params[:id])
  end

# Expression-less case for conditionals
case
when params[:before].present?
  messages.page_before(params[:before])
else
  messages.last_page
end

# Bang methods for fail-fast
@message = Message.create!(params)

# Ternaries for simple conditionals
@room.direct? ? @room.users : @message.mentionees
```
## Key Patterns
**State as Records:**
```ruby
Card.joins(:closure) # closed cards
Card.where.missing(:closure) # open cards
```
**Current Attributes:**
```ruby
belongs_to :creator, default: -> { Current.user }
```
**Authorization on Models:**
```ruby
class User < ApplicationRecord
  def can_administer?(message)
    message.creator == self || admin?
  end
end
```
</quick_reference>
<reference_index>
## Domain Knowledge
All detailed patterns in `references/`:
| File | Topics |
|------|--------|
| [controllers.md](./references/controllers.md) | REST mapping, concerns, Turbo responses, API patterns, HTTP caching |
| [models.md](./references/models.md) | Concerns, state records, callbacks, scopes, POROs, authorization, broadcasting |
| [frontend.md](./references/frontend.md) | Turbo Streams, Stimulus controllers, CSS layers, OKLCH colors, partials |
| [architecture.md](./references/architecture.md) | Routing, authentication, jobs, Current attributes, caching, database patterns |
| [testing.md](./references/testing.md) | Minitest, fixtures, unit/integration/system tests, testing patterns |
| [gems.md](./references/gems.md) | What they use vs avoid, decision framework, Gemfile examples |
</reference_index>
<success_criteria>
Code follows DHH style when:
- Controllers map to CRUD verbs on resources
- Models use concerns for horizontal behavior
- State is tracked via records, not booleans
- No unnecessary service objects or abstractions
- Database-backed solutions preferred over external services
- Tests use Minitest with fixtures
- Turbo/Stimulus for interactivity (no heavy JS frameworks)
- Native CSS with modern features (layers, OKLCH, nesting)
- Authorization logic lives on User model
- Jobs are shallow wrappers calling model methods
</success_criteria>
<credits>
Based on [The Unofficial 37signals/DHH Rails Style Guide](https://github.com/marckohlbrugge/unofficial-37signals-coding-style-guide) by [Marc Köhlbrugge](https://x.com/marckohlbrugge), generated through deep analysis of 265 pull requests from the Fizzy codebase.
**Important Disclaimers:**
- LLM-generated guide - may contain inaccuracies
- Code examples from Fizzy are licensed under the O'Saasy License
- Not affiliated with or endorsed by 37signals
</credits>

@@ -1,653 +0,0 @@
# Architecture - DHH Rails Style
<routing>
## Routing
Everything maps to CRUD. Nested resources for related actions:
```ruby
Rails.application.routes.draw do
  resources :boards do
    resources :cards do
      resource :closure
      resource :goldness
      resource :not_now
      resources :assignments
      resources :comments
    end
  end
end
```
**Verb-to-noun conversion:**
| Action | Resource |
|--------|----------|
| close a card | `card.closure` |
| watch a board | `board.watching` |
| mark as golden | `card.goldness` |
| archive a card | `card.archival` |
**Shallow nesting** - avoid deep URLs:
```ruby
resources :boards do
  resources :cards, shallow: true # /boards/:id/cards, but /cards/:id
end
```
**Singular resources** for one-per-parent:
```ruby
resource :closure # not resources
resource :goldness
```
**Resolve for URL generation:**
```ruby
# config/routes.rb
resolve("Comment") { |comment| [comment.card, anchor: dom_id(comment)] }
# Now url_for(@comment) works correctly
```
</routing>
<multi_tenancy>
## Multi-Tenancy (Path-Based)
**Middleware extracts tenant** from URL prefix:
```ruby
# lib/tenant_extractor.rb
class TenantExtractor
def initialize(app)
@app = app
end
def call(env)
path = env["PATH_INFO"]
if match = path.match(%r{^/(\d+)(/.*)?$})
env["SCRIPT_NAME"] = "/#{match[1]}"
env["PATH_INFO"] = match[2] || "/"
end
@app.call(env)
end
end
```
**Cookie scoping** per tenant:
```ruby
# Cookies scoped to tenant path
cookies.signed[:session_id] = {
value: session.id,
path: "/#{Current.account.id}"
}
```
**Background job context** - serialize tenant:
```ruby
class ApplicationJob < ActiveJob::Base
around_perform do |job, block|
Current.set(account: job.arguments.first.account) { block.call }
end
end
```
**Recurring jobs** must iterate all tenants:
```ruby
class DailyDigestJob < ApplicationJob
def perform
Account.find_each do |account|
Current.set(account: account) do
send_digest_for(account)
end
end
end
end
```
**Controller security** - always scope through tenant:
```ruby
# Good - scoped through user's accessible records
@card = Current.user.accessible_cards.find(params[:id])
# Avoid - direct lookup
@card = Card.find(params[:id])
```
</multi_tenancy>
<authentication>
## Authentication
Custom passwordless magic link auth (~150 lines total):
```ruby
# app/models/session.rb
class Session < ApplicationRecord
belongs_to :user
before_create { self.token = SecureRandom.urlsafe_base64(32) }
end
# app/models/magic_link.rb
class MagicLink < ApplicationRecord
belongs_to :user
before_create do
self.code = SecureRandom.random_number(100_000..999_999).to_s
self.expires_at = 15.minutes.from_now
end
def expired?
expires_at < Time.current
end
end
```
**Why not Devise:**
- ~150 lines vs massive dependency
- No password storage liability
- Simpler UX for users
- Full control over flow
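The request side is just as small. A minimal sketch of the controller that issues codes; the controller and mailer names here are hypothetical, not from the Fizzy codebase:

```ruby
class MagicLinksController < ApplicationController
  rate_limit to: 10, within: 15.minutes, only: :create

  def create
    if user = User.find_by(email: params[:email])
      link = user.magic_links.create!
      MagicLinkMailer.sign_in_code(user, link.code).deliver_later
    end
    head :no_content # identical response either way - no account enumeration
  end
end
```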
**Bearer token** for APIs:
```ruby
module Authentication
extend ActiveSupport::Concern
included do
before_action :authenticate
end
private
def authenticate
if bearer_token = request.headers["Authorization"]&.split(" ")&.last
Current.session = Session.find_by(token: bearer_token)
else
Current.session = Session.find_by(id: cookies.signed[:session_id])
end
redirect_to login_path unless Current.session
end
end
```
</authentication>
<background_jobs>
## Background Jobs
Jobs are shallow wrappers calling model methods:
```ruby
class NotifyWatchersJob < ApplicationJob
def perform(card)
card.notify_watchers
end
end
```
**Naming convention:**
- `_later` suffix for async: `card.notify_watchers_later`
- `_now` suffix for immediate: `card.notify_watchers_now`
```ruby
module Watchable
def notify_watchers_later
NotifyWatchersJob.perform_later(self)
end
def notify_watchers_now
NotifyWatchersJob.perform_now(self)
end
def notify_watchers
watchers.each do |watcher|
WatcherMailer.notification(watcher, self).deliver_later
end
end
end
```
**Database-backed** with Solid Queue:
- No Redis required
- Same transactional guarantees as your data
- Simpler infrastructure
**Transaction safety:**
```ruby
# config/application.rb
config.active_job.enqueue_after_transaction_commit = true
```
**Error handling** by type:
```ruby
class DeliveryJob < ApplicationJob
# Transient errors - retry with backoff
retry_on Net::OpenTimeout, Net::ReadTimeout,
Resolv::ResolvError,
wait: :polynomially_longer
# Permanent errors - log and discard
discard_on Net::SMTPSyntaxError do |job, error|
Sentry.capture_exception(error, level: :info)
end
end
```
**Batch processing** with continuable:
```ruby
class ProcessCardsJob < ApplicationJob
include ActiveJob::Continuable
def perform
Card.in_batches.each_record do |card|
checkpoint! # Resume from here if interrupted
process(card)
end
end
end
```
</background_jobs>
<database_patterns>
## Database Patterns
**UUIDs as primary keys** (time-sortable UUIDv7):
```ruby
# migration
create_table :cards, id: :uuid do |t|
t.references :board, type: :uuid, foreign_key: true
end
```
Benefits: No ID enumeration, distributed-friendly, client-side generation.
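Client-side generation is a one-liner on Ruby 3.3+, which ships `SecureRandom.uuid_v7`; a sketch (the `ApplicationRecord` hook shown in the comment is an assumption, not Fizzy code):

```ruby
require "securerandom"

# Generate a time-sortable UUIDv7 (Ruby 3.3+); fall back to a
# random UUIDv4 on older Rubies.
def generate_record_id
  SecureRandom.respond_to?(:uuid_v7) ? SecureRandom.uuid_v7 : SecureRandom.uuid
end

# In a Rails app this could hang off the base class:
#
#   class ApplicationRecord < ActiveRecord::Base
#     primary_abstract_class
#     before_create { self.id ||= SecureRandom.uuid_v7 }
#   end

puts generate_record_id
```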
**State as records** (not booleans):
```ruby
# Instead of closed: boolean
class Card::Closure < ApplicationRecord
belongs_to :card
belongs_to :creator, class_name: "User"
end
# Queries become joins
Card.joins(:closure) # closed
Card.where.missing(:closure) # open
```
**Hard deletes** - no soft delete:
```ruby
# Just destroy
card.destroy!
# Use events for history
card.record_event(:deleted, by: Current.user)
```
Simplifies queries, uses event logs for auditing.
**Counter caches** for performance:
```ruby
class Comment < ApplicationRecord
belongs_to :card, counter_cache: true
end
# card.comments_count available without query
```
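The counter needs a backing column. A migration sketch following the names above (the migration version tag is illustrative):

```ruby
class AddCommentsCountToCards < ActiveRecord::Migration[8.0]
  def up
    add_column :cards, :comments_count, :integer, default: 0, null: false
    # Backfill so existing cards don't all report zero
    Card.find_each { |card| Card.reset_counters(card.id, :comments) }
  end

  def down
    remove_column :cards, :comments_count
  end
end
```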
**Account scoping** on every table:
```ruby
class Card < ApplicationRecord
belongs_to :account
default_scope { where(account: Current.account) }
end
```
</database_patterns>
<current_attributes>
## Current Attributes
Use `Current` for request-scoped state:
```ruby
# app/models/current.rb
class Current < ActiveSupport::CurrentAttributes
attribute :session, :user, :account, :request_id
delegate :user, to: :session, allow_nil: true
def account=(account)
super
Time.zone = account&.time_zone || "UTC"
end
end
```
Set in controller:
```ruby
class ApplicationController < ActionController::Base
before_action :set_current_request
private
def set_current_request
Current.session = authenticated_session
Current.account = Account.find(params[:account_id])
Current.request_id = request.request_id
end
end
```
Use throughout app:
```ruby
class Card < ApplicationRecord
belongs_to :creator, default: -> { Current.user }
end
```
</current_attributes>
<caching>
## Caching
**HTTP caching** with ETags:
```ruby
fresh_when etag: [@card, Current.user.timezone]
```
**Fragment caching:**
```erb
<% cache card do %>
<%= render card %>
<% end %>
```
**Russian doll caching:**
```erb
<% cache @board do %>
<% @board.cards.each do |card| %>
<% cache card do %>
<%= render card %>
<% end %>
<% end %>
<% end %>
```
**Cache invalidation** via `touch: true`:
```ruby
class Card < ApplicationRecord
belongs_to :board, touch: true
end
```
**Solid Cache** - database-backed:
- No Redis required
- Consistent with application data
- Simpler infrastructure
</caching>
<configuration>
## Configuration
**ENV.fetch with defaults:**
```ruby
# config/application.rb
config.active_job.queue_adapter = ENV.fetch("QUEUE_ADAPTER", "solid_queue").to_sym
config.cache_store = ENV.fetch("CACHE_STORE", "solid_cache_store").to_sym
```
**Multiple databases:**
```yaml
# config/database.yml
production:
primary:
<<: *default
cable:
<<: *default
migrations_paths: db/cable_migrate
queue:
<<: *default
migrations_paths: db/queue_migrate
cache:
<<: *default
migrations_paths: db/cache_migrate
```
**Switch between SQLite and MySQL via ENV** in `database.yml`:
```yaml
default: &default
  adapter: <%= ENV.fetch("DATABASE_ADAPTER", "sqlite3") %>
```
**CSP extensible via ENV:**
```ruby
config.content_security_policy do |policy|
policy.default_src :self
policy.script_src :self, *ENV.fetch("CSP_SCRIPT_SRC", "").split(",")
end
```
</configuration>
<testing>
## Testing
**Minitest**, not RSpec:
```ruby
class CardTest < ActiveSupport::TestCase
test "closing a card creates a closure" do
card = cards(:one)
card.close
assert card.closed?
assert_not_nil card.closure
end
end
```
**Fixtures** instead of factories:
```yaml
# test/fixtures/cards.yml
one:
title: First Card
board: main
creator: alice
two:
title: Second Card
board: main
creator: bob
```
**Integration tests** for controllers:
```ruby
class CardsControllerTest < ActionDispatch::IntegrationTest
test "closing a card" do
card = cards(:one)
sign_in users(:alice)
post card_closure_path(card)
assert_response :success
assert card.reload.closed?
end
end
```
**Tests ship with features** - in the same commit: not strictly test-first, but never left for later.
**Regression tests for security fixes** - always.
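For example, a cross-tenant access fix would ship with a test pinning the behavior (fixture names hypothetical):

```ruby
class CardsControllerTest < ActionDispatch::IntegrationTest
  test "cannot view a card belonging to another account" do
    sign_in users(:alice)
    # Card from a different tenant must not resolve through this scope
    get card_path(cards(:other_account_card))
    assert_response :not_found
  end
end
```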
</testing>
<events>
## Event Tracking
Events are the single source of truth:
```ruby
class Event < ApplicationRecord
belongs_to :creator, class_name: "User"
belongs_to :eventable, polymorphic: true
serialize :particulars, coder: JSON
end
```
**Eventable concern:**
```ruby
module Eventable
extend ActiveSupport::Concern
included do
has_many :events, as: :eventable, dependent: :destroy
end
def record_event(action, particulars = {})
events.create!(
creator: Current.user,
action: action,
particulars: particulars
)
end
end
```
**Webhooks driven by events** - events are the canonical source.
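One way to drive webhooks from the event stream; a sketch with hypothetical `WebhookEndpoint`/`WebhookDeliveryJob` names, assuming each event can reach its account through `eventable`:

```ruby
class Event < ApplicationRecord
  after_create_commit :deliver_webhooks_later

  private
    def deliver_webhooks_later
      # Fan out one job per subscribed endpoint; each delivery
      # retries independently without replaying the others
      eventable.account.webhook_endpoints.find_each do |endpoint|
        WebhookDeliveryJob.perform_later(endpoint, self)
      end
    end
end
```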
</events>
<email_patterns>
## Email Patterns
**Multi-tenant URL helpers:**
```ruby
class ApplicationMailer < ActionMailer::Base
def default_url_options
options = super
if Current.account
options[:script_name] = "/#{Current.account.id}"
end
options
end
end
```
**Timezone-aware delivery:**
```ruby
class NotificationMailer < ApplicationMailer
def daily_digest(user)
Time.use_zone(user.timezone) do
@user = user
@digest = user.digest_for_today
mail(to: user.email, subject: "Daily Digest")
end
end
end
```
**Batch delivery** - `deliver_later` enqueues one job per call; to enqueue a whole batch in a single round trip, build the delivery jobs and hand them to `ActiveJob.perform_all_later`:
```ruby
jobs = users.map do |user|
  ActionMailer::MailDeliveryJob.new("NotificationMailer", "digest", "deliver_now", args: [user])
end
ActiveJob.perform_all_later(jobs)
```
**One-click unsubscribe (RFC 8058):**
```ruby
class ApplicationMailer < ActionMailer::Base
after_action :set_unsubscribe_headers
private
def set_unsubscribe_headers
headers["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
headers["List-Unsubscribe"] = "<#{unsubscribe_url}>"
end
end
```
</email_patterns>
<security_patterns>
## Security Patterns
**XSS prevention** - escape in helpers:
```ruby
def formatted_content(text)
# Escape first, then mark safe
simple_format(h(text)).html_safe
end
```
**SSRF protection:**
```ruby
# Resolve DNS once, pin the IP
def fetch_safely(url)
uri = URI.parse(url)
ip = Resolv.getaddress(uri.host)
# Block private networks
raise "Private IP" if private_ip?(ip)
# Use pinned IP for request
Net::HTTP.start(uri.host, uri.port, ipaddr: ip) { |http| ... }
end
def private_ip?(ip)
ip.start_with?("127.", "10.", "192.168.") ||
ip.match?(/^172\.(1[6-9]|2[0-9]|3[0-1])\./)
end
```
**Content Security Policy:**
```ruby
# config/initializers/content_security_policy.rb
Rails.application.configure do
config.content_security_policy do |policy|
policy.default_src :self
policy.script_src :self
policy.style_src :self, :unsafe_inline
policy.base_uri :none
policy.form_action :self
policy.frame_ancestors :self
end
end
```
**ActionText sanitization:**
```ruby
# config/initializers/action_text.rb
Rails.application.config.after_initialize do
ActionText::ContentHelper.allowed_tags = %w[
strong em a ul ol li p br h1 h2 h3 h4 blockquote
]
end
```
</security_patterns>
<active_storage>
## Active Storage Patterns
**Variant preprocessing:**
```ruby
class User < ApplicationRecord
has_one_attached :avatar do |attachable|
attachable.variant :thumb, resize_to_limit: [100, 100], preprocessed: true
attachable.variant :medium, resize_to_limit: [300, 300], preprocessed: true
end
end
```
**Direct upload expiry** - extend for slow connections:
```ruby
# config/initializers/active_storage.rb
Rails.application.config.active_storage.service_urls_expire_in = 48.hours
```
**Avatar optimization** - redirect to blob:
```ruby
def show
expires_in 1.year, public: true
redirect_to @user.avatar.variant(:thumb).processed.url, allow_other_host: true
end
```
**Mirror service** for migrations:
```yaml
# config/storage.yml
production:
service: Mirror
primary: amazon
mirrors: [google]
```
</active_storage>

View File

@@ -1,303 +0,0 @@
# Controllers - DHH Rails Style
<rest_mapping>
## Everything Maps to CRUD
Custom actions become new resources. Instead of verbs on existing resources, create noun resources:
```ruby
# Instead of this:
POST /cards/:id/close
DELETE /cards/:id/close
POST /cards/:id/archive
# Do this:
POST /cards/:id/closure # create closure
DELETE /cards/:id/closure # destroy closure
POST /cards/:id/archival # create archival
```
**Real examples from 37signals:**
```ruby
resources :cards do
resource :closure # closing/reopening
resource :goldness # marking important
resource :not_now # postponing
resources :assignments # managing assignees
end
```
Each resource gets its own controller with standard CRUD actions.
</rest_mapping>
<controller_concerns>
## Concerns for Shared Behavior
Controllers use concerns extensively. Common patterns:
**CardScoped** - loads @card, @board, provides render_card_replacement
```ruby
module CardScoped
extend ActiveSupport::Concern
included do
before_action :set_card
end
private
def set_card
@card = Card.find(params[:card_id])
@board = @card.board
end
def render_card_replacement
render turbo_stream: turbo_stream.replace(@card)
end
end
```
**BoardScoped** - loads @board
**CurrentRequest** - populates Current with request data
**CurrentTimezone** - wraps requests in user's timezone
**FilterScoped** - handles complex filtering
**TurboFlash** - flash messages via Turbo Stream
**ViewTransitions** - disables on page refresh
**BlockSearchEngineIndexing** - sets X-Robots-Tag header
**RequestForgeryProtection** - Sec-Fetch-Site CSRF (modern browsers)
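Most of these concerns are only a handful of lines. Two sketches, inferred from the names above rather than taken from the codebase:

```ruby
module BlockSearchEngineIndexing
  extend ActiveSupport::Concern

  included do
    # Tell crawlers not to index any response from this controller
    before_action { response.headers["X-Robots-Tag"] = "noindex" }
  end
end

module TurboFlash
  extend ActiveSupport::Concern

  private
    # Set a flash message and repaint the shared flash partial in-stream,
    # without a full redirect
    def render_turbo_flash(type, message)
      flash.now[type] = message
      render turbo_stream: turbo_stream.update("flash", partial: "shared/flash")
    end
end
```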
</controller_concerns>
<authorization_patterns>
## Authorization Patterns
Controllers check permissions via before_action, models define what permissions mean:
```ruby
# Controller concern
module Authorization
extend ActiveSupport::Concern
private
def ensure_can_administer
head :forbidden unless Current.user.admin?
end
def ensure_is_staff_member
head :forbidden unless Current.user.staff?
end
end
# Usage
class BoardsController < ApplicationController
before_action :ensure_can_administer, only: [:destroy]
end
```
**Model-level authorization:**
```ruby
class Board < ApplicationRecord
def editable_by?(user)
user.admin? || user == creator
end
def publishable_by?(user)
editable_by?(user) && !published?
end
end
```
Keep authorization simple, readable, colocated with domain.
</authorization_patterns>
<security_concerns>
## Security Concerns
**Sec-Fetch-Site CSRF Protection:**
Modern browsers send Sec-Fetch-Site header. Use it for defense in depth:
```ruby
module RequestForgeryProtection
extend ActiveSupport::Concern
included do
before_action :verify_request_origin
end
private
def verify_request_origin
return if request.get? || request.head?
return if %w[same-origin same-site].include?(
request.headers["Sec-Fetch-Site"]&.downcase
)
# Fall back to token verification for older browsers
verify_authenticity_token
end
end
```
**Rate Limiting (Rails 8+):**
```ruby
class MagicLinksController < ApplicationController
rate_limit to: 10, within: 15.minutes, only: :create
end
```
Apply to: auth endpoints, email sending, external API calls, resource creation.
</security_concerns>
<request_context>
## Request Context Concerns
**CurrentRequest** - populates Current with HTTP metadata:
```ruby
module CurrentRequest
extend ActiveSupport::Concern
included do
before_action :set_current_request
end
private
def set_current_request
Current.request_id = request.request_id
Current.user_agent = request.user_agent
Current.ip_address = request.remote_ip
Current.referrer = request.referrer
end
end
```
**CurrentTimezone** - wraps requests in user's timezone:
```ruby
module CurrentTimezone
extend ActiveSupport::Concern
included do
around_action :set_timezone
helper_method :timezone_from_cookie
end
private
def set_timezone
Time.use_zone(timezone_from_cookie) { yield }
end
def timezone_from_cookie
cookies[:timezone] || "UTC"
end
end
```
**SetPlatform** - detects mobile/desktop:
```ruby
module SetPlatform
extend ActiveSupport::Concern
included do
helper_method :platform
end
def platform
@platform ||= request.user_agent&.match?(/Mobile|Android/) ? :mobile : :desktop
end
end
```
</request_context>
<turbo_responses>
## Turbo Stream Responses
Use Turbo Streams for partial updates:
```ruby
class Cards::ClosuresController < ApplicationController
include CardScoped
def create
@card.close
render_card_replacement
end
def destroy
@card.reopen
render_card_replacement
end
end
```
For complex updates, use morphing:
```ruby
render turbo_stream: turbo_stream.morph(@card)
```
</turbo_responses>
<api_patterns>
## API Design
Same controllers, different format. Convention for responses:
```ruby
def create
@card = Card.create!(card_params)
respond_to do |format|
format.html { redirect_to @card }
format.json { head :created, location: @card }
end
end
def update
@card.update!(card_params)
respond_to do |format|
format.html { redirect_to @card }
format.json { head :no_content }
end
end
def destroy
@card.destroy
respond_to do |format|
format.html { redirect_to cards_path }
format.json { head :no_content }
end
end
```
**Status codes:**
- Create: 201 Created + Location header
- Update: 204 No Content
- Delete: 204 No Content
- Bearer token authentication
</api_patterns>
<http_caching>
## HTTP Caching
Extensive use of ETags and conditional GETs:
```ruby
class CardsController < ApplicationController
def show
@card = Card.find(params[:id])
fresh_when etag: [@card, Current.user.timezone]
end
def index
@cards = @board.cards.preloaded
fresh_when etag: [@cards, @board.updated_at]
end
end
```
Key insight: times are rendered server-side in the user's timezone, so the timezone must be part of the ETag; otherwise a response cached for one user could serve wrong times to users in other timezones.
**ApplicationController global etag:**
```ruby
class ApplicationController < ActionController::Base
etag { "v1" } # Bump to invalidate all caches
end
```
Use `touch: true` on associations for cache invalidation.
</http_caching>

View File

@@ -1,510 +0,0 @@
# Frontend - DHH Rails Style
<turbo_patterns>
## Turbo Patterns
**Turbo Streams** for partial updates:
```erb
<%# app/views/cards/closures/create.turbo_stream.erb %>
<%= turbo_stream.replace @card %>
```
**Morphing** for complex updates:
```ruby
render turbo_stream: turbo_stream.morph(@card)
```
**Global morphing** - enable in layout:
```ruby
turbo_refreshes_with method: :morph, scroll: :preserve
```
**Fragment caching** with `cached: true`:
```erb
<%= render partial: "card", collection: @cards, cached: true %>
```
**No ViewComponents** - standard partials work fine.
</turbo_patterns>
<turbo_morphing>
## Turbo Morphing Best Practices
**Listen for morph events** to restore client state:
```javascript
document.addEventListener("turbo:morph-element", (event) => {
// Restore any client-side state after morph
})
```
**Permanent elements** - skip morphing with data attribute:
```erb
<div data-turbo-permanent id="notification-count">
<%= @count %>
</div>
```
**Frame morphing** - add refresh attribute:
```erb
<%= turbo_frame_tag :assignment, src: path, refresh: :morph %>
```
**Common issues and solutions:**
| Problem | Solution |
|---------|----------|
| Timers not updating | Clear/restart in morph event listener |
| Forms resetting | Wrap form sections in turbo frames |
| Pagination breaking | Use turbo frames with `refresh: :morph` |
| Flickering on replace | Switch to morph instead of replace |
| localStorage loss | Listen to `turbo:morph-element`, restore state |
</turbo_morphing>
<turbo_frames>
## Turbo Frames
**Lazy loading** with spinner:
```erb
<%= turbo_frame_tag "menu",
src: menu_path,
loading: :lazy do %>
<div class="spinner">Loading...</div>
<% end %>
```
**Inline editing** with edit/view toggle:
```erb
<%= turbo_frame_tag dom_id(card, :edit) do %>
<%= link_to "Edit", edit_card_path(card),
data: { turbo_frame: dom_id(card, :edit) } %>
<% end %>
```
**Target parent frame** without hardcoding:
```erb
<%= form_with model: @card, data: { turbo_frame: "_parent" } do |f| %>
```
**Real-time subscriptions:**
```erb
<%= turbo_stream_from @card %>
<%= turbo_stream_from @card, :activity %>
```
</turbo_frames>
<stimulus_controllers>
## Stimulus Controllers
52 controllers in Fizzy, split 62% reusable, 38% domain-specific.
**Characteristics:**
- Single responsibility per controller
- Configuration via values/classes
- Events for communication
- Private methods with #
- Most under 50 lines
**Examples:**
```javascript
// copy-to-clipboard (25 lines)
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
static values = { content: String }
copy() {
navigator.clipboard.writeText(this.contentValue)
this.#showFeedback()
}
#showFeedback() {
this.element.classList.add("copied")
setTimeout(() => this.element.classList.remove("copied"), 1500)
}
}
```
```javascript
// auto-click (7 lines)
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
connect() {
this.element.click()
}
}
```
```javascript
// toggle-class (31 lines)
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
static classes = ["toggle"]
static values = { open: { type: Boolean, default: false } }
toggle() {
this.openValue = !this.openValue
}
openValueChanged() {
this.element.classList.toggle(this.toggleClass, this.openValue)
}
}
```
```javascript
// auto-submit (28 lines) - debounced form submission
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
static values = { delay: { type: Number, default: 300 } }
connect() {
this.timeout = null
}
submit() {
clearTimeout(this.timeout)
this.timeout = setTimeout(() => {
this.element.requestSubmit()
}, this.delayValue)
}
disconnect() {
clearTimeout(this.timeout)
}
}
```
```javascript
// dialog (45 lines) - native HTML dialog management
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
open() {
this.element.showModal()
}
close() {
this.element.close()
this.dispatch("closed")
}
clickOutside(event) {
if (event.target === this.element) this.close()
}
}
```
```javascript
// local-time (40 lines) - relative time display
import { Controller } from "@hotwired/stimulus"
export default class extends Controller {
static values = { datetime: String }
connect() {
this.#updateTime()
}
#updateTime() {
const date = new Date(this.datetimeValue)
const now = new Date()
const diffMinutes = Math.floor((now - date) / 60000)
if (diffMinutes < 60) {
this.element.textContent = `${diffMinutes}m ago`
} else if (diffMinutes < 1440) {
this.element.textContent = `${Math.floor(diffMinutes / 60)}h ago`
} else {
this.element.textContent = `${Math.floor(diffMinutes / 1440)}d ago`
}
}
}
```
</stimulus_controllers>
<stimulus_best_practices>
## Stimulus Best Practices
**Values API** over getAttribute:
```javascript
// Good
static values = { delay: { type: Number, default: 300 } }
// Avoid
this.element.getAttribute("data-delay")
```
**Cleanup in disconnect:**
```javascript
disconnect() {
clearTimeout(this.timeout)
this.observer?.disconnect()
document.removeEventListener("keydown", this.boundHandler)
}
```
**Action filters** - `:self` prevents bubbling:
```erb
<div data-action="click->menu#toggle:self">
```
**Helper extraction** - shared utilities in separate modules:
```javascript
// app/javascript/helpers/timing.js
export function debounce(fn, delay) {
let timeout
return (...args) => {
clearTimeout(timeout)
timeout = setTimeout(() => fn(...args), delay)
}
}
```
**Event dispatching** for loose coupling:
```javascript
this.dispatch("selected", { detail: { id: this.idValue } })
```
</stimulus_best_practices>
<view_helpers>
## View Helpers (Stimulus-Integrated)
**Dialog helper:**
```ruby
def dialog_tag(id, &block)
tag.dialog(
id: id,
data: {
controller: "dialog",
action: "click->dialog#clickOutside keydown.esc->dialog#close"
},
&block
)
end
```
**Auto-submit form helper:**
```ruby
def auto_submit_form_with(model:, delay: 300, **options, &block)
form_with(
model: model,
data: {
controller: "auto-submit",
auto_submit_delay_value: delay,
action: "input->auto-submit#submit"
},
**options,
&block
)
end
```
**Copy button helper:**
```ruby
def copy_button(content:, label: "Copy")
tag.button(
label,
data: {
controller: "copy",
copy_content_value: content,
action: "click->copy#copy"
}
)
end
```
</view_helpers>
<css_architecture>
## CSS Architecture
Vanilla CSS with modern features, no preprocessors.
**CSS @layer** for cascade control:
```css
@layer reset, base, components, modules, utilities;
@layer reset {
*, *::before, *::after { box-sizing: border-box; }
}
@layer base {
body { font-family: var(--font-sans); }
}
@layer components {
.btn { /* button styles */ }
}
@layer modules {
.card { /* card module styles */ }
}
@layer utilities {
.hidden { display: none; }
}
```
**OKLCH color system** for perceptual uniformity:
```css
:root {
--color-primary: oklch(60% 0.15 250);
--color-success: oklch(65% 0.2 145);
--color-warning: oklch(75% 0.15 85);
--color-danger: oklch(55% 0.2 25);
}
```
**Dark mode** via CSS variables:
```css
:root {
--bg: oklch(98% 0 0);
--text: oklch(20% 0 0);
}
@media (prefers-color-scheme: dark) {
:root {
--bg: oklch(15% 0 0);
--text: oklch(90% 0 0);
}
}
```
**Native CSS nesting:**
```css
.card {
padding: var(--space-4);
& .title {
font-weight: bold;
}
&:hover {
background: var(--bg-hover);
}
}
```
**~60 minimal utilities** vs Tailwind's hundreds.
**Modern features used:**
- `@starting-style` for enter animations
- `color-mix()` for color manipulation
- `:has()` for parent selection
- Logical properties (`margin-inline`, `padding-block`)
- Container queries
</css_architecture>
<view_patterns>
## View Patterns
**Standard partials** - no ViewComponents:
```erb
<%# app/views/cards/_card.html.erb %>
<article id="<%= dom_id(card) %>" class="card">
<%= render "cards/header", card: card %>
<%= render "cards/body", card: card %>
<%= render "cards/footer", card: card %>
</article>
```
**Fragment caching:**
```erb
<% cache card do %>
<%= render "cards/card", card: card %>
<% end %>
```
**Collection caching:**
```erb
<%= render partial: "card", collection: @cards, cached: true %>
```
**Simple component naming** - no strict BEM:
```css
.card { }
.card .title { }
.card .actions { }
.card.golden { }
.card.closed { }
```
</view_patterns>
<caching_with_personalization>
## User-Specific Content in Caches
Move personalization to client-side JavaScript to preserve caching:
```erb
<%# Cacheable fragment - nothing user-specific inside the cache block %>
<% cache card do %>
  <article class="card"
           data-creator-id="<%= card.creator_id %>"
           data-controller="ownership">
    <button data-ownership-target="ownerOnly" class="hidden">Delete</button>
  </article>
<% end %>
```
```javascript
// Reveal user-specific elements after a cache hit.
// The current user id is read from a <meta> tag rendered outside the
// cached fragment, so the cached HTML stays identical for every user.
export default class extends Controller {
  static targets = ["ownerOnly"]

  connect() {
    const creatorId = parseInt(this.element.dataset.creatorId)
    const currentUserId = parseInt(
      document.querySelector('meta[name="current-user-id"]')?.content
    )
    if (creatorId === currentUserId) {
      this.ownerOnlyTargets.forEach(el => el.classList.remove("hidden"))
    }
  }
}
```
The `current-user-id` meta tag lives in the layout, outside any cached fragment, so it never pins one user's id into shared cache entries.
**Extract dynamic content** to separate frames:
```erb
<% cache [card, board] do %>
<article class="card">
<%= turbo_frame_tag card, :assignment,
src: card_assignment_path(card),
refresh: :morph %>
</article>
<% end %>
```
Assignment dropdown updates independently without invalidating parent cache.
</caching_with_personalization>
<broadcasting>
## Broadcasting with Turbo Streams
**Model callbacks** for real-time updates:
```ruby
class Card < ApplicationRecord
include Broadcastable
after_create_commit :broadcast_created
after_update_commit :broadcast_updated
after_destroy_commit :broadcast_removed
private
def broadcast_created
broadcast_append_to [Current.account, board], :cards
end
def broadcast_updated
broadcast_replace_to [Current.account, board], :cards
end
def broadcast_removed
broadcast_remove_to [Current.account, board], :cards
end
end
```
**Scope by tenant** using `[Current.account, resource]` pattern.
</broadcasting>

View File

@@ -1,266 +0,0 @@
# Gems - DHH Rails Style
<what_they_use>
## What 37signals Uses
**Core Rails stack:**
- turbo-rails, stimulus-rails, importmap-rails
- propshaft (asset pipeline)
**Database-backed services (Solid suite):**
- solid_queue - background jobs
- solid_cache - caching
- solid_cable - WebSockets/Action Cable
**Authentication & Security:**
- bcrypt (for any password hashing needed)
**Their own gems:**
- geared_pagination (cursor-based pagination)
- lexxy (rich text editor)
- mittens (mailer utilities)
**Utilities:**
- rqrcode (QR code generation)
- redcarpet + rouge (Markdown rendering)
- web-push (push notifications)
**Deployment & Operations:**
- kamal (Docker deployment)
- thruster (HTTP/2 proxy)
- mission_control-jobs (job monitoring)
- autotuner (GC tuning)
</what_they_use>
<what_they_avoid>
## What They Deliberately Avoid
**Authentication:**
```
devise → Custom ~150-line auth
```
Why: Full control, no password liability with magic links, simpler.
**Authorization:**
```
pundit/cancancan → Simple role checks in models
```
Why: Most apps don't need policy objects. A method on the model suffices:
```ruby
class Board < ApplicationRecord
def editable_by?(user)
user.admin? || user == creator
end
end
```
**Background Jobs:**
```
sidekiq → Solid Queue
```
Why: Database-backed means no Redis, same transactional guarantees.
**Caching:**
```
redis → Solid Cache
```
Why: Database is already there, simpler infrastructure.
**Search:**
```
elasticsearch → Custom sharded search
```
Why: Built exactly what they need, no external service dependency.
**View Layer:**
```
view_component → Standard partials
```
Why: Partials work fine. ViewComponents add complexity without clear benefit for their use case.
**API:**
```
GraphQL → REST with Turbo
```
Why: REST is sufficient when you control both ends. GraphQL complexity not justified.
**Factories:**
```
factory_bot → Fixtures
```
Why: Fixtures are simpler, faster, and encourage thinking about data relationships upfront.
**Service Objects:**
```
Interactor, Trailblazer → Fat models
```
Why: Business logic stays in models. Methods like `card.close` instead of `CardCloser.call(card)`.
**Form Objects:**
```
Reform, dry-validation → params.expect + model validations
```
Why: Rails 8's `params.expect` is clean enough. Contextual validations on the model.
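A sketch of `params.expect` next to the model validations it leans on (attribute names illustrative):

```ruby
class CardsController < ApplicationController
  def create
    # expect raises on missing or malformed structure and permits in one call
    @card = Card.create!(params.expect(card: [:title, :description]))
    redirect_to @card
  end
end

class Card < ApplicationRecord
  validates :title, presence: true
end
```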
**Decorators:**
```
Draper → View helpers + partials
```
Why: Helpers and partials are simpler. No decorator indirection.
**CSS:**
```
Tailwind, Sass → Native CSS
```
Why: Modern CSS has nesting, variables, layers. No build step needed.
**Frontend:**
```
React, Vue, SPAs → Turbo + Stimulus
```
Why: Server-rendered HTML with sprinkles of JS. SPA complexity not justified.
**Testing:**
```
RSpec → Minitest
```
Why: Simpler, faster boot, less DSL magic, ships with Rails.
</what_they_avoid>
<testing_philosophy>
## Testing Philosophy
**Minitest** - simpler, faster:
```ruby
class CardTest < ActiveSupport::TestCase
test "closing creates closure" do
card = cards(:one)
assert_difference -> { Card::Closure.count } do
card.close
end
assert card.closed?
end
end
```
**Fixtures** - loaded once, deterministic:
```yaml
# test/fixtures/cards.yml
open_card:
title: Open Card
board: main
creator: alice
closed_card:
title: Closed Card
board: main
creator: bob
```
**Dynamic timestamps** with ERB:
```yaml
recent:
title: Recent
created_at: <%= 1.hour.ago %>
old:
title: Old
created_at: <%= 1.month.ago %>
```
**Time travel** for time-dependent tests:
```ruby
test "expires after 15 minutes" do
magic_link = MagicLink.create!(user: users(:alice))
travel 16.minutes
assert magic_link.expired?
end
```
**VCR** for external APIs:
```ruby
VCR.use_cassette("stripe/charge") do
charge = Stripe::Charge.create(amount: 1000)
assert charge.paid
end
```
**Tests ship with features** - same commit, not before or after.
</testing_philosophy>
<decision_framework>
## Decision Framework
Before adding a gem, ask:
1. **Can vanilla Rails do this?**
- ActiveRecord can do most things Sequel can
- ActionMailer handles email fine
- ActiveJob works for most job needs
2. **Is the complexity worth it?**
- 150 lines of custom code vs. 10,000-line gem
- You'll understand your code better
- Fewer upgrade headaches
3. **Does it add infrastructure?**
- Redis? Consider database-backed alternatives
- External service? Consider building in-house
- Simpler infrastructure = fewer failure modes
4. **Is it from someone you trust?**
- 37signals gems: battle-tested at scale
- Well-maintained, focused gems: usually fine
- Kitchen-sink gems: probably overkill
**The philosophy:**
> "Build solutions before reaching for gems."
Not anti-gem, but pro-understanding. Use gems when they genuinely solve a problem you have, not a problem you might have.
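Criterion 2 can be made concrete with a sketch. The paginator below is illustrative plain Ruby (the class and method names are invented, and a real implementation would query the database with `where("id > ?", cursor)` instead of filtering an array):

```ruby
# Hypothetical stand-in for "150 lines of custom code": a keyset paginator
# over an ordered in-memory collection, written in plain Ruby. It remembers
# the last id served and resumes after it, the way cursor-based gems do.
class KeysetPaginator
  def initialize(records, per_page: 2)
    @records = records.sort_by { |r| r[:id] }
    @per_page = per_page
  end

  # Returns one page plus the cursor to pass back for the next page.
  def page_after(cursor = nil)
    scope = cursor ? @records.select { |r| r[:id] > cursor } : @records
    page = scope.first(@per_page)
    { items: page, next_cursor: page.last && page.last[:id] }
  end
end

records = [{ id: 1 }, { id: 2 }, { id: 3 }]
pager = KeysetPaginator.new(records)
first_page = pager.page_after
second_page = pager.page_after(first_page[:next_cursor])
```

Code you can read in one sitting is easier to debug than a gem's pagination internals; that is the trade-off the framework asks you to weigh.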
</decision_framework>
<gem_patterns>
## Gem Usage Patterns
**Pagination:**
```ruby
# geared_pagination - cursor-based
class CardsController < ApplicationController
  def index
    @cards = set_page_and_extract_portion_from @board.cards
  end
end
```
**Markdown:**
```ruby
# redcarpet + rouge
class MarkdownRenderer
def self.render(text)
Redcarpet::Markdown.new(
Redcarpet::Render::HTML.new(filter_html: true),
autolink: true,
fenced_code_blocks: true
).render(text)
end
end
```
**Background jobs:**
```ruby
# solid_queue - no Redis
class ApplicationJob < ActiveJob::Base
queue_as :default
# Just works, backed by database
end
```
**Caching:**
```ruby
# solid_cache - no Redis
# config/environments/production.rb
config.cache_store = :solid_cache_store
```
</gem_patterns>


@@ -1,359 +0,0 @@
# Models - DHH Rails Style
<model_concerns>
## Concerns for Horizontal Behavior
Models heavily use concerns. A typical Card model includes 14+ concerns:
```ruby
class Card < ApplicationRecord
include Assignable
include Attachments
include Broadcastable
include Closeable
include Colored
include Eventable
include Golden
include Mentions
include Multistep
include Pinnable
include Postponable
include Readable
include Searchable
include Taggable
include Watchable
end
```
Each concern is self-contained with associations, scopes, and methods.
**Naming:** Adjectives describing capability (`Closeable`, `Publishable`, `Watchable`)
</model_concerns>
<state_records>
## State as Records, Not Booleans
Instead of boolean columns, create separate records:
```ruby
# Instead of:
closed: boolean
is_golden: boolean
postponed: boolean
# Create records:
class Card::Closure < ApplicationRecord
belongs_to :card
belongs_to :creator, class_name: "User"
end
class Card::Goldness < ApplicationRecord
belongs_to :card
belongs_to :creator, class_name: "User"
end
class Card::NotNow < ApplicationRecord
belongs_to :card
belongs_to :creator, class_name: "User"
end
```
**Benefits:**
- Automatic timestamps (when it happened)
- Track who made changes
- Easy filtering via joins and `where.missing`
- Enables rich UI showing when/who
**In the model:**
```ruby
module Closeable
extend ActiveSupport::Concern
included do
has_one :closure, dependent: :destroy
end
def closed?
closure.present?
end
def close(creator: Current.user)
create_closure!(creator: creator)
end
def reopen
closure&.destroy
end
end
```
**Querying:**
```ruby
Card.joins(:closure) # closed cards
Card.where.missing(:closure) # open cards
```
</state_records>
<callbacks>
## Callbacks - Used Sparingly
Only 38 callback occurrences across 30 files in Fizzy. Guidelines:
**Use for:**
- `after_commit` for async work
- `before_save` for derived data
- `after_create_commit` for side effects
**Avoid:**
- Complex callback chains
- Business logic in callbacks
- Synchronous external calls
```ruby
class Card < ApplicationRecord
after_create_commit :notify_watchers_later
before_save :update_search_index, if: :title_changed?
private
def notify_watchers_later
NotifyWatchersJob.perform_later(self)
end
end
```
</callbacks>
<scopes>
## Scope Naming
Standard scope names:
```ruby
class Card < ApplicationRecord
scope :chronologically, -> { order(created_at: :asc) }
scope :reverse_chronologically, -> { order(created_at: :desc) }
scope :alphabetically, -> { order(title: :asc) }
scope :latest, -> { reverse_chronologically.limit(10) }
# Standard eager loading
scope :preloaded, -> { includes(:creator, :assignees, :tags) }
# Parameterized
scope :indexed_by, ->(column) { order(column => :asc) }
scope :sorted_by, ->(column, direction = :asc) { order(column => direction) }
end
```
</scopes>
<poros>
## Plain Old Ruby Objects
POROs namespaced under parent models:
```ruby
# app/models/event/description.rb
class Event::Description
def initialize(event)
@event = event
end
def to_s
# Presentation logic for event description
end
end
# app/models/card/eventable/system_commenter.rb
class Card::Eventable::SystemCommenter
def initialize(card)
@card = card
end
def comment(message)
# Business logic
end
end
# app/models/user/filtering.rb
class User::Filtering
# View context bundling
end
```
**NOT used for service objects.** Business logic stays in models.
</poros>
<verbs_predicates>
## Method Naming
**Verbs** - Actions that change state:
```ruby
card.close
card.reopen
card.gild # make golden
card.ungild
board.publish
board.archive
```
**Predicates** - Queries derived from state:
```ruby
card.closed? # closure.present?
card.golden? # goldness.present?
board.published?
```
**Avoid** generic setters:
```ruby
# Bad
card.set_closed(true)
card.update_golden_status(false)
# Good
card.close
card.ungild
```
</verbs_predicates>
<validation_philosophy>
## Validation Philosophy
Minimal validations on models. Use contextual validations on form/operation objects:
```ruby
# Model - minimal
class User < ApplicationRecord
validates :email, presence: true, format: { with: URI::MailTo::EMAIL_REGEXP }
end
# Form object - contextual
class Signup
include ActiveModel::Model
attr_accessor :email, :name, :terms_accepted
validates :email, :name, presence: true
validates :terms_accepted, acceptance: true
def save
return false unless valid?
User.create!(email: email, name: name)
end
end
```
**Prefer database constraints** over model validations for data integrity:
```ruby
# migration
add_index :users, :email, unique: true
add_foreign_key :cards, :boards
```
</validation_philosophy>
<error_handling>
## Let It Crash Philosophy
Use bang methods that raise exceptions on failure:
```ruby
# Preferred - raises on failure
@card = Card.create!(card_params)
@card.update!(title: new_title)
@comment.destroy!
# Avoid - silent failures
@card = Card.create(card_params) # returns false on failure
if @card.save
# ...
end
```
Let errors propagate naturally. Rails maps `ActiveRecord::RecordInvalid` to a 422 Unprocessable Entity response by default.
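When a domain error needs its own status code, the same mechanism is configurable; a sketch (the error class name here is hypothetical):

```ruby
# config/application.rb
# ActiveRecord::RecordInvalid already maps to :unprocessable_entity (422).
config.action_dispatch.rescue_responses["Card::LockedError"] = :conflict # 409
```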
</error_handling>
<default_values>
## Default Values with Lambdas
Use lambda defaults for associations with Current:
```ruby
class Card < ApplicationRecord
belongs_to :creator, class_name: "User", default: -> { Current.user }
belongs_to :account, default: -> { Current.account }
end
class Comment < ApplicationRecord
belongs_to :commenter, class_name: "User", default: -> { Current.user }
end
```
Lambdas ensure dynamic resolution at creation time.
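Why the lambda matters can be shown in plain Ruby. The sketch below fakes `Current` with a mutable module attribute (purely illustrative): an eager value is captured once at class-load time, while the lambda is re-evaluated for each new record:

```ruby
# Illustrative stand-in for Rails' CurrentAttributes: one mutable slot.
module Current
  class << self
    attr_accessor :user
  end
end

class Card
  EAGER_DEFAULT = Current.user          # evaluated once, at class-load time (nil here)
  CREATOR_DEFAULT = -> { Current.user } # evaluated on every call

  attr_reader :creator

  def initialize
    @creator = CREATOR_DEFAULT.call # resolved at creation time
  end
end

Current.user = "alice"
first_card = Card.new  # creator resolved now => "alice"
Current.user = "bob"
second_card = Card.new # creator resolved now => "bob"
```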
</default_values>
<rails_71_patterns>
## Rails 7.1+ Model Patterns
**Normalizes** - clean data before validation:
```ruby
class User < ApplicationRecord
normalizes :email, with: ->(email) { email.strip.downcase }
normalizes :phone, with: ->(phone) { phone.gsub(/\D/, "") }
end
```
**Delegated Types** - replace polymorphic associations:
```ruby
class Message < ApplicationRecord
delegated_type :messageable, types: %w[Comment Reply Announcement]
end
# Now you get:
message.comment? # true if Comment
message.comment # returns the Comment
Message.comments # scope for Comment messages
```
**Store Accessor** - structured JSON storage:
```ruby
class User < ApplicationRecord
store :settings, accessors: [:theme, :notifications_enabled], coder: JSON
end
user.theme = "dark"
user.notifications_enabled = true
```
</rails_71_patterns>
<concern_guidelines>
## Concern Guidelines
- **50-150 lines** per concern (most are ~100)
- **Cohesive** - related functionality only
- **Named for capabilities** - `Closeable`, `Watchable`, not `CardHelpers`
- **Self-contained** - associations, scopes, methods together
- **Not for mere organization** - create when genuine reuse needed
**Touch chains** for cache invalidation:
```ruby
class Comment < ApplicationRecord
belongs_to :card, touch: true
end
class Card < ApplicationRecord
belongs_to :board, touch: true
end
```
When comment updates, card's `updated_at` changes, which cascades to board.
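The cascade itself is simple enough to sketch in plain Ruby (illustrative only, not ActiveRecord):

```ruby
# Plain-Ruby sketch of a touch chain: touching a record stamps it and
# propagates the same timestamp up through its parents.
class TouchableRecord
  attr_reader :updated_at
  attr_accessor :parent

  def touch(now = Time.now)
    @updated_at = now
    parent&.touch(now) # cascade up the chain
  end
end

board   = TouchableRecord.new
card    = TouchableRecord.new.tap { |c| c.parent = board }
comment = TouchableRecord.new.tap { |c| c.parent = card }

comment.touch # stamps comment, card, and board with the same time
```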
**Transaction wrapping** for related updates:
```ruby
class Card < ApplicationRecord
def close(creator: Current.user)
transaction do
create_closure!(creator: creator)
record_event(:closed)
notify_watchers_later
end
end
end
```
</concern_guidelines>


@@ -1,338 +0,0 @@
# Testing - DHH Rails Style
## Core Philosophy
"Minitest with fixtures - simple, fast, deterministic." The approach prioritizes pragmatism over convention.
## Why Minitest Over RSpec
- **Simpler**: Less DSL magic, plain Ruby assertions
- **Ships with Rails**: No additional dependencies
- **Faster boot times**: Less overhead
- **Plain Ruby**: No specialized syntax to learn
## Fixtures as Test Data
Rather than factories, fixtures provide preloaded data:
- Loaded once, reused across tests
- No runtime object creation overhead
- Explicit relationship visibility
- Deterministic IDs for easier debugging
### Fixture Structure
```yaml
# test/fixtures/users.yml
david:
identity: david
account: basecamp
role: admin
jason:
identity: jason
account: basecamp
role: member
# test/fixtures/rooms.yml
watercooler:
name: Water Cooler
creator: david
direct: false
# test/fixtures/messages.yml
greeting:
body: Hello everyone!
room: watercooler
creator: david
```
### Using Fixtures in Tests
```ruby
test "sending a message" do
user = users(:david)
room = rooms(:watercooler)
# Test with fixture data
end
```
### Dynamic Fixture Values
ERB enables time-sensitive data:
```yaml
recent_card:
title: Recent Card
created_at: <%= 1.hour.ago %>
old_card:
title: Old Card
created_at: <%= 1.month.ago %>
```
## Test Organization
### Unit Tests
Verify business logic using setup blocks and standard assertions:
```ruby
class CardTest < ActiveSupport::TestCase
setup do
@card = cards(:one)
@user = users(:david)
end
test "closing a card creates a closure" do
assert_difference -> { Card::Closure.count } do
@card.close(creator: @user)
end
assert @card.closed?
assert_equal @user, @card.closure.creator
end
test "reopening a card destroys the closure" do
@card.close(creator: @user)
assert_difference -> { Card::Closure.count }, -1 do
@card.reopen
end
refute @card.closed?
end
end
```
### Integration Tests
Test full request/response cycles:
```ruby
class CardsControllerTest < ActionDispatch::IntegrationTest
setup do
@user = users(:david)
sign_in @user
end
test "closing a card" do
card = cards(:one)
post card_closure_path(card)
assert_response :success
assert card.reload.closed?
end
test "unauthorized user cannot close card" do
sign_in users(:guest)
card = cards(:one)
post card_closure_path(card)
assert_response :forbidden
refute card.reload.closed?
end
end
```
### System Tests
Browser-based tests using Capybara:
```ruby
class MessagesTest < ApplicationSystemTestCase
test "sending a message" do
sign_in users(:david)
visit room_path(rooms(:watercooler))
fill_in "Message", with: "Hello, world!"
click_button "Send"
assert_text "Hello, world!"
end
test "editing own message" do
sign_in users(:david)
visit room_path(rooms(:watercooler))
within "#message_#{messages(:greeting).id}" do
click_on "Edit"
end
fill_in "Message", with: "Updated message"
click_button "Save"
assert_text "Updated message"
end
test "drag and drop card to new column" do
sign_in users(:david)
visit board_path(boards(:main))
card = find("#card_#{cards(:one).id}")
target = find("#column_#{columns(:done).id}")
card.drag_to target
assert_selector "#column_#{columns(:done).id} #card_#{cards(:one).id}"
end
end
```
## Advanced Patterns
### Time Testing
Use `travel_to` for deterministic time-dependent assertions:
```ruby
test "card expires after 30 days" do
card = cards(:one)
travel_to 31.days.from_now do
assert card.expired?
end
end
```
### External API Testing with VCR
Record and replay HTTP interactions:
```ruby
test "fetches user data from API" do
VCR.use_cassette("user_api") do
user_data = ExternalApi.fetch_user(123)
assert_equal "John", user_data[:name]
end
end
```
### Background Job Testing
Assert job enqueueing and email delivery:
```ruby
test "closing card enqueues notification job" do
card = cards(:one)
assert_enqueued_with(job: NotifyWatchersJob, args: [card]) do
card.close
end
end
test "welcome email is sent on signup" do
assert_emails 1 do
Identity.create!(email: "new@example.com")
end
end
```
### Testing Turbo Streams
```ruby
test "message creation broadcasts to room" do
room = rooms(:watercooler)
assert_turbo_stream_broadcasts [room, :messages] do
room.messages.create!(body: "Test", creator: users(:david))
end
end
```
## Testing Principles
### 1. Test Observable Behavior
Focus on what the code does, not how it does it:
```ruby
# ❌ Testing implementation
test "calls notify method on each watcher" do
card.expects(:notify).times(3)
card.close
end
# ✅ Testing behavior
test "watchers receive notifications when card closes" do
assert_difference -> { Notification.count }, 3 do
card.close
end
end
```
### 2. Don't Mock Everything
```ruby
# ❌ Over-mocked test
test "sending message" do
room = mock("room")
user = mock("user")
message = mock("message")
room.expects(:messages).returns(stub(create!: message))
message.expects(:broadcast_create)
MessagesController.new.create
end
# ✅ Test the real thing
test "sending message" do
sign_in users(:david)
post room_messages_url(rooms(:watercooler)),
params: { message: { body: "Hello" } }
assert_response :success
assert Message.exists?(body: "Hello")
end
```
### 3. Tests Ship with Features
Tests land in the same commit as the feature: written alongside it, neither strictly before (TDD) nor deferred until after.
### 4. Security Fixes Always Include Regression Tests
Every security fix must include a test that would have caught the vulnerability.
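For instance, a hypothetical stored-XSS fix might ship with a test like this (plain Minitest over a standalone helper, using stdlib `CGI.escapeHTML`; the names are invented for illustration):

```ruby
require "cgi"
require "minitest/autorun"

# Hypothetical fix: user-supplied titles used to be interpolated into HTML
# unescaped. The helper now escapes them; the regression test pins that down.
def render_title(title)
  "<h1>#{CGI.escapeHTML(title)}</h1>"
end

class RenderTitleTest < Minitest::Test
  # Would have failed against the pre-fix implementation.
  def test_script_tags_are_escaped
    html = render_title("<script>alert(1)</script>")
    refute_includes html, "<script>"
    assert_includes html, "&lt;script&gt;"
  end
end
```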
### 5. Integration Tests Validate Complete Workflows
Don't just test individual pieces - test that they work together.
## File Organization
```
test/
├── controllers/ # Integration tests for controllers
├── fixtures/ # YAML fixtures for all models
├── helpers/ # Helper method tests
├── integration/ # API integration tests
├── jobs/ # Background job tests
├── mailers/ # Mailer tests
├── models/ # Unit tests for models
├── system/ # Browser-based system tests
└── test_helper.rb # Test configuration
```
## Test Helper Setup
```ruby
# test/test_helper.rb
ENV["RAILS_ENV"] ||= "test"
require_relative "../config/environment"
require "rails/test_help"
class ActiveSupport::TestCase
fixtures :all
parallelize(workers: :number_of_processors)
end
class ActionDispatch::IntegrationTest
include SignInHelper
end
class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
driven_by :selenium, using: :headless_chrome
end
```
## Sign In Helper
```ruby
# test/support/sign_in_helper.rb
module SignInHelper
def sign_in(user)
session = user.identity.sessions.create!
cookies.signed[:session_id] = session.id
end
end
```


@@ -1,737 +0,0 @@
---
name: dspy-ruby
description: Build type-safe LLM applications with DSPy.rb — Ruby's programmatic prompt framework with signatures, modules, agents, and optimization. Use when implementing predictable AI features, creating LLM signatures and modules, configuring language model providers, building agent systems with tools, optimizing prompts, or testing LLM-powered functionality in Ruby applications.
---
# DSPy.rb
> Build LLM apps like you build software. Type-safe, modular, testable.
DSPy.rb brings software engineering best practices to LLM development. Instead of tweaking prompts, define what you want with Ruby types and let DSPy handle the rest.
## Overview
DSPy.rb is a Ruby framework for building language model applications with programmatic prompts. It provides:
- **Type-safe signatures** — Define inputs/outputs with Sorbet types
- **Modular components** — Compose and reuse LLM logic
- **Automatic optimization** — Use data to improve prompts, not guesswork
- **Production-ready** — Built-in observability, testing, and error handling
## Core Concepts
### 1. Signatures
Define interfaces between your app and LLMs using Ruby types:
```ruby
class EmailClassifier < DSPy::Signature
description "Classify customer support emails by category and priority"
class Priority < T::Enum
enums do
Low = new('low')
Medium = new('medium')
High = new('high')
Urgent = new('urgent')
end
end
input do
const :email_content, String
const :sender, String
end
output do
const :category, String
const :priority, Priority # Type-safe enum with defined values
const :confidence, Float
end
end
```
### 2. Modules
Build complex workflows from simple building blocks:
- **Predict** — Basic LLM calls with signatures
- **ChainOfThought** — Step-by-step reasoning
- **ReAct** — Tool-using agents
- **CodeAct** — Dynamic code generation agents (install the `dspy-code_act` gem)
### 3. Tools & Toolsets
Create type-safe tools for agents with comprehensive Sorbet support:
```ruby
# Enum-based tool with automatic type conversion
class CalculatorTool < DSPy::Tools::Base
tool_name 'calculator'
tool_description 'Performs arithmetic operations with type-safe enum inputs'
class Operation < T::Enum
enums do
Add = new('add')
Subtract = new('subtract')
Multiply = new('multiply')
Divide = new('divide')
end
end
sig { params(operation: Operation, num1: Float, num2: Float).returns(T.any(Float, String)) }
def call(operation:, num1:, num2:)
case operation
when Operation::Add then num1 + num2
when Operation::Subtract then num1 - num2
when Operation::Multiply then num1 * num2
when Operation::Divide
return "Error: Division by zero" if num2 == 0
num1 / num2
end
end
end
# Multi-tool toolset with rich types
class DataToolset < DSPy::Tools::Toolset
toolset_name "data_processing"
class Format < T::Enum
enums do
JSON = new('json')
CSV = new('csv')
XML = new('xml')
end
end
tool :convert, description: "Convert data between formats"
tool :validate, description: "Validate data structure"
sig { params(data: String, from: Format, to: Format).returns(String) }
def convert(data:, from:, to:)
"Converted from #{from.serialize} to #{to.serialize}"
end
sig { params(data: String, format: Format).returns(T::Hash[String, T.any(String, Integer, T::Boolean)]) }
def validate(data:, format:)
{ valid: true, format: format.serialize, row_count: 42, message: "Data validation passed" }
end
end
```
### 4. Type System & Discriminators
DSPy.rb uses sophisticated type discrimination for complex data structures:
- **Automatic `_type` field injection** — DSPy adds discriminator fields to structs for type safety
- **Union type support** — `T.any()` types automatically disambiguated by `_type`
- **Reserved field name** — Avoid defining your own `_type` fields in structs
- **Recursive filtering** — `_type` fields filtered during deserialization at all nesting levels
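The mechanics can be sketched without the gem. In plain Ruby, a `_type` key on a hash picks the concrete struct for a union member, and the reserved field is filtered out before construction (the struct names are invented):

```ruby
# Plain-Ruby sketch of discriminator-based union deserialization,
# mirroring what DSPy.rb's injected `_type` field enables.
EmailAction = Struct.new(:recipient, keyword_init: true)
SlackAction = Struct.new(:channel, keyword_init: true)

# Each discriminator value maps to one concrete type of the union.
UNION_TYPES = { "EmailAction" => EmailAction, "SlackAction" => SlackAction }.freeze

def deserialize_action(hash)
  klass = UNION_TYPES.fetch(hash["_type"])        # disambiguate via _type
  attrs = hash.reject { |key, _| key == "_type" } # strip the reserved field
  klass.new(**attrs.transform_keys(&:to_sym))
end

action = deserialize_action("_type" => "SlackAction", "channel" => "#alerts")
```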
### 5. Optimization
Improve accuracy with real data:
- **MIPROv2** — Advanced multi-prompt optimization with bootstrap sampling and Bayesian optimization
- **GEPA** — Genetic-Pareto Reflective Prompt Evolution with feedback maps, experiment tracking, and telemetry
- **Evaluation** — Comprehensive framework with built-in and custom metrics, error handling, and batch processing
## Quick Start
```ruby
# Install
gem 'dspy'
# Configure
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
end
# Define a task
class SentimentAnalysis < DSPy::Signature
description "Analyze sentiment of text"
input do
const :text, String
end
output do
const :sentiment, String # positive, negative, neutral
const :score, Float # 0.0 to 1.0
end
end
# Use it
analyzer = DSPy::Predict.new(SentimentAnalysis)
result = analyzer.call(text: "This product is amazing!")
puts result.sentiment # => "positive"
puts result.score # => 0.92
```
## Provider Adapter Gems
Two strategies for connecting to LLM providers:
### Per-provider adapters (direct SDK access)
```ruby
# Gemfile
gem 'dspy'
gem 'dspy-openai' # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic' # Claude
gem 'dspy-gemini' # Gemini
```
Each adapter gem pulls in the official SDK (`openai`, `anthropic`, `gemini-ai`).
### Unified adapter via RubyLLM (recommended for multi-provider)
```ruby
# Gemfile
gem 'dspy'
gem 'dspy-ruby_llm' # Routes to any provider via ruby_llm
gem 'ruby_llm'
```
RubyLLM handles provider routing based on the model name. Use the `ruby_llm/` prefix:
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash', structured_outputs: true)
# c.lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514', structured_outputs: true)
# c.lm = DSPy::LM.new('ruby_llm/gpt-4o-mini', structured_outputs: true)
end
```
## Events System
DSPy.rb ships with a structured event bus for observing runtime behavior.
### Module-Scoped Subscriptions (preferred for agents)
```ruby
class MyAgent < DSPy::Module
subscribe 'lm.tokens', :track_tokens, scope: :descendants
def track_tokens(_event, attrs)
@total_tokens += attrs.fetch(:total_tokens, 0)
end
end
```
### Global Subscriptions (for observability/integrations)
```ruby
subscription_id = DSPy.events.subscribe('score.create') do |event, attrs|
Langfuse.export_score(attrs)
end
# Wildcards supported
DSPy.events.subscribe('llm.*') { |name, attrs| puts "[#{name}] tokens=#{attrs[:total_tokens]}" }
```
Event names use dot-separated namespaces (`llm.generate`, `react.iteration_complete`). Every event includes module metadata (`module_path`, `module_leaf`, `module_scope.ancestry_token`) for filtering.
## Lifecycle Callbacks
Rails-style lifecycle hooks ship with every `DSPy::Module`:
- **`before`** — Runs ahead of `forward` for setup (metrics, context loading)
- **`around`** — Wraps `forward`, calls `yield`, and lets you pair setup/teardown logic
- **`after`** — Fires after `forward` returns for cleanup or persistence
```ruby
class InstrumentedModule < DSPy::Module
before :setup_metrics
around :manage_context
after :log_metrics
def forward(question:)
@predictor.call(question: question)
end
private
def setup_metrics
@start_time = Time.now
end
def manage_context
load_context
result = yield
save_context
result
end
def log_metrics
duration = Time.now - @start_time
Rails.logger.info "Prediction completed in #{duration}s"
end
end
```
Execution order: before → around (before yield) → forward → around (after yield) → after. Callbacks are inherited from parent classes and execute in registration order.
## Fiber-Local LM Context
Override the language model temporarily using fiber-local storage:
```ruby
fast_model = DSPy::LM.new("openai/gpt-4o-mini", api_key: ENV['OPENAI_API_KEY'])
DSPy.with_lm(fast_model) do
result = classifier.call(text: "test") # Uses fast_model inside this block
end
# Back to global LM outside the block
```
**LM resolution hierarchy**: Instance-level LM → Fiber-local LM (`DSPy.with_lm`) → Global LM (`DSPy.configure`).
Use `configure_predictor` for fine-grained control over agent internals:
```ruby
agent = DSPy::ReAct.new(MySignature, tools: tools)
agent.configure { |c| c.lm = default_model }
agent.configure_predictor('thought_generator') { |c| c.lm = powerful_model }
```
## Evaluation Framework
Systematically test LLM application performance with `DSPy::Evals`:
```ruby
metric = DSPy::Metrics.exact_match(field: :answer, case_sensitive: false)
evaluator = DSPy::Evals.new(predictor, metric: metric)
result = evaluator.evaluate(test_examples, display_table: true)
puts "Pass Rate: #{(result.pass_rate * 100).round(1)}%"
```
Built-in metrics: `exact_match`, `contains`, `numeric_difference`, `composite_and`. Custom metrics return `true`/`false` or a `DSPy::Prediction` with `score:` and `feedback:` fields.
Use `DSPy::Example` for typed test data and `export_scores: true` to push results to Langfuse.
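Shape-wise, a custom boolean metric is just a callable over the gold example and the prediction. The sketch below uses plain `Struct` stand-ins rather than real `DSPy::Example`/prediction objects:

```ruby
# Plain-Ruby sketch of a custom metric: any callable taking (example,
# prediction) and returning true/false. Structs stand in for DSPy objects.
GoldExample = Struct.new(:question, :answer, keyword_init: true)
ModelOutput = Struct.new(:answer, keyword_init: true)

# Case-insensitive answer match, tolerant of surrounding whitespace.
answer_match = lambda do |example, prediction|
  example.answer.strip.casecmp?(prediction.answer.strip)
end

gold = GoldExample.new(question: "Capital of France?", answer: "Paris")
hit  = answer_match.call(gold, ModelOutput.new(answer: " paris "))
miss = answer_match.call(gold, ModelOutput.new(answer: "Lyon"))
```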
## GEPA Optimization
GEPA (Genetic-Pareto Reflective Prompt Evolution) uses reflection-driven instruction rewrites:
```ruby
gem 'dspy-gepa'
teleprompter = DSPy::Teleprompt::GEPA.new(
metric: metric,
reflection_lm: DSPy::ReflectionLM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']),
feedback_map: feedback_map,
config: { max_metric_calls: 600, minibatch_size: 6 }
)
result = teleprompter.compile(program, trainset: train, valset: val)
optimized_program = result.optimized_program
```
The metric must return `DSPy::Prediction.new(score:, feedback:)` so the reflection model can reason about failures. Use `feedback_map` to target individual predictors in composite modules.
## Typed Context Pattern
Replace opaque string context blobs with `T::Struct` inputs. Each field gets its own `description:` annotation in the JSON schema the LLM sees:
```ruby
class NavigationContext < T::Struct
const :workflow_hint, T.nilable(String),
description: "Current workflow phase guidance for the agent"
const :action_log, T::Array[String], default: [],
description: "Compact one-line-per-action history of research steps taken"
const :iterations_remaining, Integer,
description: "Budget remaining. Each tool call costs 1 iteration."
end
class ToolSelectionSignature < DSPy::Signature
input do
const :query, String
const :context, NavigationContext # Structured, not an opaque string
end
output do
const :tool_name, String
const :tool_args, String, description: "JSON-encoded arguments"
end
end
```
Benefits: type safety at compile time, per-field descriptions in the LLM schema, easy to test as value objects, extensible by adding `const` declarations.
## Schema Formats (BAML / TOON)
Control how DSPy describes signature structure to the LLM:
- **JSON Schema** (default) — Standard format, works with `structured_outputs: true`
- **BAML** (`schema_format: :baml`) — 84% token reduction for Enhanced Prompting mode. Requires `sorbet-baml` gem.
- **TOON** (`schema_format: :toon, data_format: :toon`) — Table-oriented format for both schemas and data. Enhanced Prompting mode only.
BAML and TOON apply only when `structured_outputs: false`. With `structured_outputs: true`, the provider receives JSON Schema directly.
## Storage System
Persist and reload optimized programs with `DSPy::Storage::ProgramStorage`:
```ruby
storage = DSPy::Storage::ProgramStorage.new(storage_path: "./dspy_storage")
storage.save_program(result.optimized_program, result, metadata: { optimizer: 'MIPROv2' })
```
Supports checkpoint management, optimization history tracking, and import/export between environments.
## Rails Integration
### Directory Structure
Organize DSPy components using Rails conventions:
```
app/
entities/ # T::Struct types shared across signatures
signatures/ # DSPy::Signature definitions
tools/ # DSPy::Tools::Base implementations
concerns/ # Shared tool behaviors (error handling, etc.)
modules/ # DSPy::Module orchestrators
services/ # Plain Ruby services that compose DSPy modules
config/
initializers/
dspy.rb # DSPy + provider configuration
feature_flags.rb # Model selection per role
spec/
signatures/ # Schema validation tests
tools/ # Tool unit tests
modules/ # Integration tests with VCR
vcr_cassettes/ # Recorded HTTP interactions
```
### Initializer
```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
next if Rails.env.test? && ENV["DSPY_ENABLE_IN_TEST"].blank?
RubyLLM.configure do |config|
config.gemini_api_key = ENV["GEMINI_API_KEY"] if ENV["GEMINI_API_KEY"].present?
config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"] if ENV["ANTHROPIC_API_KEY"].present?
config.openai_api_key = ENV["OPENAI_API_KEY"] if ENV["OPENAI_API_KEY"].present?
end
model = ENV.fetch("DSPY_MODEL", "ruby_llm/gemini-2.5-flash")
DSPy.configure do |config|
config.lm = DSPy::LM.new(model, structured_outputs: true)
config.logger = Rails.logger
end
# Langfuse observability (optional)
if ENV["LANGFUSE_PUBLIC_KEY"].present? && ENV["LANGFUSE_SECRET_KEY"].present?
DSPy::Observability.configure!
end
end
```
### Feature-Flagged Model Selection
Use different models for different roles (fast/cheap for classification, powerful for synthesis):
```ruby
# config/initializers/feature_flags.rb
module FeatureFlags
SELECTOR_MODEL = ENV.fetch("DSPY_SELECTOR_MODEL", "ruby_llm/gemini-2.5-flash-lite")
SYNTHESIZER_MODEL = ENV.fetch("DSPY_SYNTHESIZER_MODEL", "ruby_llm/gemini-2.5-flash")
end
```
Then override per-tool or per-predictor:
```ruby
class ClassifyTool < DSPy::Tools::Base
def call(query:)
predictor = DSPy::Predict.new(ClassifyQuery)
predictor.configure { |c| c.lm = DSPy::LM.new(FeatureFlags::SELECTOR_MODEL, structured_outputs: true) }
predictor.call(query: query)
end
end
```
## Schema-Driven Signatures
**Prefer typed schemas over string descriptions.** Let the type system communicate structure to the LLM rather than prose in the signature description.
### Entities as Shared Types
Define reusable `T::Struct` and `T::Enum` types in `app/entities/` and reference them across signatures:
```ruby
# app/entities/search_strategy.rb
class SearchStrategy < T::Enum
enums do
SingleSearch = new("single_search")
DateDecomposition = new("date_decomposition")
end
end
# app/entities/scored_item.rb
class ScoredItem < T::Struct
const :id, String
const :score, Float, description: "Relevance score 0.0-1.0"
const :verdict, String, description: "relevant, maybe, or irrelevant"
const :reason, String, default: ""
end
```
### Schema vs Description: When to Use Each
**Use schemas (T::Struct/T::Enum)** for:
- Multi-field outputs with specific types
- Enums with defined values the LLM must pick from
- Nested structures, arrays of typed objects
- Outputs consumed by code (not displayed to users)
**Use string descriptions** for:
- Simple single-field outputs where the type is `String`
- Natural language generation (summaries, answers)
- Fields where constraint guidance helps (e.g., `description: "YYYY-MM-DD format"`)
**Rule of thumb**: If you'd write a `case` statement on the output, it should be a `T::Enum`. If you'd call `.each` on it, it should be `T::Array[SomeStruct]`.
## Tool Patterns
### Tools That Wrap Predictions
A common pattern: tools encapsulate a DSPy prediction, adding error handling, model selection, and serialization:
```ruby
class RerankTool < DSPy::Tools::Base
tool_name "rerank"
tool_description "Score and rank search results by relevance"
MAX_ITEMS = 200
MIN_ITEMS_FOR_LLM = 5
sig { params(query: String, items: T::Array[T::Hash[Symbol, T.untyped]]).returns(T::Hash[Symbol, T.untyped]) }
def call(query:, items: [])
return { scored_items: items, reranked: false } if items.size < MIN_ITEMS_FOR_LLM
capped_items = items.first(MAX_ITEMS)
predictor = DSPy::Predict.new(RerankSignature)
predictor.configure { |c| c.lm = DSPy::LM.new(FeatureFlags::SYNTHESIZER_MODEL, structured_outputs: true) }
result = predictor.call(query: query, items: capped_items)
{ scored_items: result.scored_items, reranked: true }
rescue => e
Rails.logger.warn "[RerankTool] LLM rerank failed: #{e.message}"
{ error: "Rerank failed: #{e.message}", scored_items: items, reranked: false }
end
end
```
**Key patterns:**
- Short-circuit LLM calls when unnecessary (small data, trivial cases)
- Cap input size to prevent token overflow
- Per-tool model selection via `configure`
- Graceful error handling with fallback data
### Error Handling Concern
```ruby
module ErrorHandling
extend ActiveSupport::Concern
private
def safe_predict(signature_class, **inputs)
predictor = DSPy::Predict.new(signature_class)
yield predictor if block_given?
predictor.call(**inputs)
rescue Faraday::Error, Net::HTTPError => e
Rails.logger.error "[#{self.class.name}] API error: #{e.message}"
nil
rescue JSON::ParserError => e
Rails.logger.error "[#{self.class.name}] Invalid LLM output: #{e.message}"
nil
end
end
```
## Observability
### Tracing with DSPy::Context
Wrap operations in spans for Langfuse/OpenTelemetry visibility:
```ruby
result = DSPy::Context.with_span(
operation: "tool_selector.select",
"dspy.module" => "ToolSelector",
"tool_selector.tools" => tool_names.join(",")
) do
@predictor.call(query: query, context: context, available_tools: schemas)
end
```
### Setup for Langfuse
```ruby
# Gemfile
gem 'dspy-o11y'
gem 'dspy-o11y-langfuse'
# .env
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
DSPY_TELEMETRY_BATCH_SIZE=5
```
Every `DSPy::Predict`, `DSPy::ReAct`, and tool call is automatically traced when observability is configured.
### Score Reporting
Report evaluation scores to Langfuse:
```ruby
DSPy.score(name: "relevance", value: 0.85, trace_id: current_trace_id)
```
## Testing
### VCR Setup for Rails
```ruby
VCR.configure do |config|
config.cassette_library_dir = "spec/vcr_cassettes"
config.hook_into :webmock
config.configure_rspec_metadata!
config.filter_sensitive_data('<GEMINI_API_KEY>') { ENV['GEMINI_API_KEY'] }
config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
end
```
### Signature Schema Tests
Test that signatures produce valid schemas without calling any LLM:
```ruby
RSpec.describe ClassifyResearchQuery do
it "has required input fields" do
schema = described_class.input_json_schema
expect(schema[:required]).to include("query")
end
it "has typed output fields" do
schema = described_class.output_json_schema
expect(schema[:properties]).to have_key(:search_strategy)
end
end
```
### Tool Tests with Mocked Predictions
```ruby
RSpec.describe RerankTool do
let(:tool) { described_class.new }
it "skips LLM for small result sets" do
expect(DSPy::Predict).not_to receive(:new)
result = tool.call(query: "test", items: [{ id: "1" }])
expect(result[:reranked]).to be false
end
it "calls LLM for large result sets", :vcr do
items = 10.times.map { |i| { id: i.to_s, title: "Item #{i}" } }
result = tool.call(query: "relevant items", items: items)
expect(result[:reranked]).to be true
end
end
```
## Resources
- [core-concepts.md](./references/core-concepts.md) — Signatures, modules, predictors, type system deep-dive
- [toolsets.md](./references/toolsets.md) — Tools::Base, Tools::Toolset DSL, type safety, testing
- [providers.md](./references/providers.md) — Provider adapters, RubyLLM, fiber-local LM context, compatibility matrix
- [optimization.md](./references/optimization.md) — MIPROv2, GEPA, evaluation framework, storage system
- [observability.md](./references/observability.md) — Event system, dspy-o11y gems, Langfuse, score reporting
- [signature-template.rb](./assets/signature-template.rb) — Signature scaffold with T::Enum, Date/Time, defaults, union types
- [module-template.rb](./assets/module-template.rb) — Module scaffold with .call(), lifecycle callbacks, fiber-local LM
- [config-template.rb](./assets/config-template.rb) — Rails initializer with RubyLLM, observability, feature flags
## Key URLs
- Homepage: https://oss.vicente.services/dspy.rb/
- GitHub: https://github.com/vicentereig/dspy.rb
- Documentation: https://oss.vicente.services/dspy.rb/getting-started/
## Guidelines for Claude
When helping users with DSPy.rb:
1. **Schema over prose** — Define output structure with `T::Struct` and `T::Enum` types, not string descriptions
2. **Entities in `app/entities/`** — Extract shared types so signatures stay thin
3. **Per-tool model selection** — Use `predictor.configure { |c| c.lm = ... }` to pick the right model per task
4. **Short-circuit LLM calls** — Skip the LLM for trivial cases (small data, cached results)
5. **Cap input sizes** — Prevent token overflow by limiting array sizes before sending to LLM
6. **Test schemas without LLM** — Validate `input_json_schema` and `output_json_schema` in unit tests
7. **VCR for integration tests** — Record real HTTP interactions, never mock LLM responses by hand
8. **Trace with spans** — Wrap tool calls in `DSPy::Context.with_span` for observability
9. **Graceful degradation** — Always rescue LLM errors and return fallback data
### Signature Best Practices
**Keep description concise** — The signature `description` should state the goal, not the field details:
```ruby
# Good — concise goal
class ParseOutline < DSPy::Signature
description 'Extract block-level structure from HTML as a flat list of skeleton sections.'
input do
const :html, String, description: 'Raw HTML to parse'
end
output do
const :sections, T::Array[Section], description: 'Block elements: headings, paragraphs, code blocks, lists'
end
end
```
**Use defaults over nilable arrays** — For OpenAI structured outputs compatibility:
```ruby
# Good — works with OpenAI structured outputs
class ASTNode < T::Struct
const :children, T::Array[ASTNode], default: []
end
```
### Recursive Types with `$defs`
DSPy.rb supports recursive types in structured outputs using JSON Schema `$defs`:
```ruby
class TreeNode < T::Struct
const :value, String
const :children, T::Array[TreeNode], default: [] # Self-reference
end
```
The schema generator automatically creates `#/$defs/TreeNode` references for recursive types, compatible with OpenAI and Gemini structured outputs.
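A hand-written approximation of the shape such a schema takes, using the `TreeNode` fields above (the exact output of DSPy's generator may differ in detail):

```ruby
# Hand-rolled approximation of the recursive schema shape;
# DSPy's actual generator output may differ in detail.
schema = {
  "$defs" => {
    "TreeNode" => {
      "type" => "object",
      "properties" => {
        "value"    => { "type" => "string" },
        "children" => {
          "type"  => "array",
          "items" => { "$ref" => "#/$defs/TreeNode" } # self-reference via $ref
        }
      }
    }
  },
  "$ref" => "#/$defs/TreeNode"
}

schema.dig("$defs", "TreeNode", "properties", "children", "items")
# => {"$ref"=>"#/$defs/TreeNode"}
```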
### Field Descriptions for T::Struct
DSPy.rb extends T::Struct to support field-level `description:` kwargs that flow to JSON Schema:
```ruby
class ASTNode < T::Struct
const :node_type, NodeType, description: 'The type of node (heading, paragraph, etc.)'
const :text, String, default: "", description: 'Text content of the node'
const :level, Integer, default: 0 # No description — field is self-explanatory
const :children, T::Array[ASTNode], default: []
end
```
**When to use field descriptions**: complex field semantics, enum-like strings, constrained values, nested structs with ambiguous names. **When to skip**: self-explanatory fields like `name`, `id`, `url`, or boolean flags.
## Version
Current: 0.34.3



@@ -1,187 +0,0 @@
# frozen_string_literal: true
# =============================================================================
# DSPy.rb Configuration Template — v0.34.3 API
#
# Rails initializer patterns for DSPy.rb with RubyLLM, observability,
# and feature-flagged model selection.
#
# Key patterns:
# - Use after_initialize for Rails setup
# - Use dspy-ruby_llm for multi-provider routing
# - Use structured_outputs: true for reliable parsing
# - Use dspy-o11y + dspy-o11y-langfuse for observability
# - Use ENV-based feature flags for model selection
# =============================================================================
# =============================================================================
# Gemfile Dependencies
# =============================================================================
#
# # Core
# gem 'dspy'
#
# # Provider adapter (choose one strategy):
#
# # Strategy A: Unified adapter via RubyLLM (recommended)
# gem 'dspy-ruby_llm'
# gem 'ruby_llm'
#
# # Strategy B: Per-provider adapters (direct SDK access)
# gem 'dspy-openai' # OpenAI, OpenRouter, Ollama
# gem 'dspy-anthropic' # Claude
# gem 'dspy-gemini' # Gemini
#
# # Observability (optional)
# gem 'dspy-o11y'
# gem 'dspy-o11y-langfuse'
#
# # Optimization (optional)
# gem 'dspy-miprov2' # MIPROv2 optimizer
# gem 'dspy-gepa' # GEPA optimizer
#
# # Schema formats (optional)
# gem 'sorbet-baml' # BAML schema format (84% token reduction)
# =============================================================================
# Rails Initializer — config/initializers/dspy.rb
# =============================================================================
Rails.application.config.after_initialize do
# Skip in test unless explicitly enabled
next if Rails.env.test? && ENV["DSPY_ENABLE_IN_TEST"].blank?
# Configure RubyLLM provider credentials
RubyLLM.configure do |config|
config.gemini_api_key = ENV["GEMINI_API_KEY"] if ENV["GEMINI_API_KEY"].present?
config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"] if ENV["ANTHROPIC_API_KEY"].present?
config.openai_api_key = ENV["OPENAI_API_KEY"] if ENV["OPENAI_API_KEY"].present?
end
# Configure DSPy with unified RubyLLM adapter
model = ENV.fetch("DSPY_MODEL", "ruby_llm/gemini-2.5-flash")
DSPy.configure do |config|
config.lm = DSPy::LM.new(model, structured_outputs: true)
config.logger = Rails.logger
end
# Enable Langfuse observability (optional)
if ENV["LANGFUSE_PUBLIC_KEY"].present? && ENV["LANGFUSE_SECRET_KEY"].present?
DSPy::Observability.configure!
end
end
# =============================================================================
# Feature Flags — config/initializers/feature_flags.rb
# =============================================================================
# Use different models for different roles:
# - Fast/cheap for classification, routing, simple tasks
# - Powerful for synthesis, reasoning, complex analysis
module FeatureFlags
SELECTOR_MODEL = ENV.fetch("DSPY_SELECTOR_MODEL", "ruby_llm/gemini-2.5-flash-lite")
SYNTHESIZER_MODEL = ENV.fetch("DSPY_SYNTHESIZER_MODEL", "ruby_llm/gemini-2.5-flash")
REASONING_MODEL = ENV.fetch("DSPY_REASONING_MODEL", "ruby_llm/claude-sonnet-4-20250514")
end
# Usage in tools/modules:
#
# class ClassifyTool < DSPy::Tools::Base
# def call(query:)
# predictor = DSPy::Predict.new(ClassifySignature)
# predictor.configure { |c| c.lm = DSPy::LM.new(FeatureFlags::SELECTOR_MODEL, structured_outputs: true) }
# predictor.call(query: query)
# end
# end
# =============================================================================
# Environment Variables — .env
# =============================================================================
#
# # Provider API keys (set the ones you need)
# GEMINI_API_KEY=...
# ANTHROPIC_API_KEY=...
# OPENAI_API_KEY=...
#
# # DSPy model configuration
# DSPY_MODEL=ruby_llm/gemini-2.5-flash
# DSPY_SELECTOR_MODEL=ruby_llm/gemini-2.5-flash-lite
# DSPY_SYNTHESIZER_MODEL=ruby_llm/gemini-2.5-flash
# DSPY_REASONING_MODEL=ruby_llm/claude-sonnet-4-20250514
#
# # Langfuse observability (optional)
# LANGFUSE_PUBLIC_KEY=pk-...
# LANGFUSE_SECRET_KEY=sk-...
# DSPY_TELEMETRY_BATCH_SIZE=5
#
# # Test environment
# DSPY_ENABLE_IN_TEST=1 # Set to enable DSPy in test env
# =============================================================================
# Per-Provider Configuration (without RubyLLM)
# =============================================================================
# OpenAI (dspy-openai gem)
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# end
# Anthropic (dspy-anthropic gem)
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
# end
# Gemini (dspy-gemini gem)
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('gemini/gemini-2.5-flash', api_key: ENV['GEMINI_API_KEY'])
# end
# Ollama (dspy-openai gem, local models)
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('ollama/llama3.2', base_url: 'http://localhost:11434')
# end
# OpenRouter (dspy-openai gem, 200+ models)
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('openrouter/anthropic/claude-3.5-sonnet',
# api_key: ENV['OPENROUTER_API_KEY'],
# base_url: 'https://openrouter.ai/api/v1')
# end
# =============================================================================
# VCR Test Configuration — spec/support/dspy.rb
# =============================================================================
# VCR.configure do |config|
# config.cassette_library_dir = "spec/vcr_cassettes"
# config.hook_into :webmock
# config.configure_rspec_metadata!
# config.filter_sensitive_data('<GEMINI_API_KEY>') { ENV['GEMINI_API_KEY'] }
# config.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
# config.filter_sensitive_data('<ANTHROPIC_API_KEY>') { ENV['ANTHROPIC_API_KEY'] }
# end
# =============================================================================
# Schema Format Configuration (optional)
# =============================================================================
# BAML schema format — 84% token reduction for Enhanced Prompting mode
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
# api_key: ENV['OPENAI_API_KEY'],
# schema_format: :baml # Requires sorbet-baml gem
# )
# end
# TOON schema + data format — table-oriented format
# DSPy.configure do |c|
# c.lm = DSPy::LM.new('openai/gpt-4o-mini',
# api_key: ENV['OPENAI_API_KEY'],
# schema_format: :toon, # How DSPy describes the signature
# data_format: :toon # How inputs/outputs are rendered in prompts
# )
# end
#
# Note: BAML and TOON apply only when structured_outputs: false.
# With structured_outputs: true, the provider receives JSON Schema directly.


@@ -1,300 +0,0 @@
# frozen_string_literal: true
# =============================================================================
# DSPy.rb Module Template — v0.34.3 API
#
# Modules orchestrate predictors, tools, and business logic.
#
# Key patterns:
# - Use .call() to invoke (not .forward())
# - Access results with result.field (not result[:field])
# - Use DSPy::Tools::Base for tools (not DSPy::Tool)
# - Use lifecycle callbacks (before/around/after) for cross-cutting concerns
# - Use DSPy.with_lm for temporary model overrides
# - Use configure_predictor for fine-grained agent control
# =============================================================================
# --- Basic Module ---
class BasicClassifier < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(ClassificationSignature)
end
def forward(text:)
@predictor.call(text: text)
end
end
# Usage:
# classifier = BasicClassifier.new
# result = classifier.call(text: "This is a test")
# result.category # => "technical"
# result.confidence # => 0.95
# --- Module with Chain of Thought ---
class ReasoningClassifier < DSPy::Module
def initialize
super
@predictor = DSPy::ChainOfThought.new(ClassificationSignature)
end
def forward(text:)
result = @predictor.call(text: text)
# ChainOfThought adds result.reasoning automatically
result
end
end
# --- Module with Lifecycle Callbacks ---
class InstrumentedModule < DSPy::Module
before :setup_metrics
around :manage_context
after :log_completion
def initialize
super
@predictor = DSPy::Predict.new(AnalysisSignature)
@start_time = nil
end
def forward(query:)
@predictor.call(query: query)
end
private
# Runs before forward
def setup_metrics
@start_time = Time.now
Rails.logger.info "Starting prediction"
end
# Wraps forward — must call yield
def manage_context
load_user_context
result = yield
save_updated_context(result)
result
end
# Runs after forward completes
def log_completion
duration = Time.now - @start_time
Rails.logger.info "Prediction completed in #{duration}s"
end
def load_user_context = nil
def save_updated_context(_result) = nil
end
# Execution order: before → around (before yield) → forward → around (after yield) → after
# Callbacks are inherited from parent classes and execute in registration order.
# --- Module with Tools ---
class SearchTool < DSPy::Tools::Base
tool_name "search"
tool_description "Search for information by query"
sig { params(query: String, max_results: Integer).returns(T::Array[T::Hash[Symbol, String]]) }
def call(query:, max_results: 5)
# Implementation here
[{ title: "Result 1", url: "https://example.com" }]
end
end
class FinishTool < DSPy::Tools::Base
tool_name "finish"
tool_description "Submit the final answer"
sig { params(answer: String).returns(String) }
def call(answer:)
answer
end
end
class ResearchAgent < DSPy::Module
def initialize
super
tools = [SearchTool.new, FinishTool.new]
@agent = DSPy::ReAct.new(
ResearchSignature,
tools: tools,
max_iterations: 5
)
end
def forward(question:)
@agent.call(question: question)
end
end
# --- Module with Per-Task Model Selection ---
class SmartRouter < DSPy::Module
def initialize
super
@classifier = DSPy::Predict.new(RouteSignature)
@analyzer = DSPy::ChainOfThought.new(AnalysisSignature)
end
def forward(text:)
# Use fast model for classification
DSPy.with_lm(fast_model) do
route = @classifier.call(text: text)
if route.requires_deep_analysis
# Switch to powerful model for analysis
DSPy.with_lm(powerful_model) do
@analyzer.call(text: text)
end
else
route
end
end
end
private
def fast_model
@fast_model ||= DSPy::LM.new(
ENV.fetch("DSPY_SELECTOR_MODEL", "ruby_llm/gemini-2.5-flash-lite"),
structured_outputs: true
)
end
def powerful_model
@powerful_model ||= DSPy::LM.new(
ENV.fetch("DSPY_SYNTHESIZER_MODEL", "ruby_llm/gemini-2.5-flash"),
structured_outputs: true
)
end
end
# --- Module with configure_predictor ---
class ConfiguredAgent < DSPy::Module
def initialize
super
tools = [SearchTool.new, FinishTool.new]
@agent = DSPy::ReAct.new(ResearchSignature, tools: tools)
# Set default model for all internal predictors
@agent.configure { |c| c.lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash', structured_outputs: true) }
# Override specific predictor with a more capable model
@agent.configure_predictor('thought_generator') do |c|
c.lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514', structured_outputs: true)
end
end
def forward(question:)
@agent.call(question: question)
end
end
# Available internal predictors by agent type:
# DSPy::ReAct → thought_generator, observation_processor
# DSPy::CodeAct → code_generator, observation_processor
# DSPy::DeepSearch → seed_predictor, search_predictor, reader_predictor, reason_predictor
# --- Module with Event Subscriptions ---
class TokenTrackingModule < DSPy::Module
subscribe 'lm.tokens', :track_tokens, scope: :descendants
def initialize
super
@predictor = DSPy::Predict.new(AnalysisSignature)
@total_tokens = 0
end
def forward(query:)
@predictor.call(query: query)
end
def track_tokens(_event, attrs)
@total_tokens += attrs.fetch(:total_tokens, 0)
end
def token_usage
@total_tokens
end
end
# Module-scoped subscriptions are automatically scoped to the module instance and its descendants.
# Use scope: :self_only to restrict delivery to the module itself (ignoring children).
# --- Tool That Wraps a Prediction ---
class RerankTool < DSPy::Tools::Base
tool_name "rerank"
tool_description "Score and rank search results by relevance"
MAX_ITEMS = 200
MIN_ITEMS_FOR_LLM = 5
sig { params(query: String, items: T::Array[T::Hash[Symbol, T.untyped]]).returns(T::Hash[Symbol, T.untyped]) }
def call(query:, items: [])
# Short-circuit: skip LLM for small sets
return { scored_items: items, reranked: false } if items.size < MIN_ITEMS_FOR_LLM
# Cap to prevent token overflow
capped_items = items.first(MAX_ITEMS)
predictor = DSPy::Predict.new(RerankSignature)
predictor.configure { |c| c.lm = DSPy::LM.new("ruby_llm/gemini-2.5-flash", structured_outputs: true) }
result = predictor.call(query: query, items: capped_items)
{ scored_items: result.scored_items, reranked: true }
rescue => e
Rails.logger.warn "[RerankTool] LLM rerank failed: #{e.message}"
{ error: "Rerank failed: #{e.message}", scored_items: items, reranked: false }
end
end
# Key patterns for tools wrapping predictions:
# - Short-circuit LLM calls when unnecessary (small data, trivial cases)
# - Cap input size to prevent token overflow
# - Per-tool model selection via configure
# - Graceful error handling with fallback data
# --- Multi-Step Pipeline ---
class AnalysisPipeline < DSPy::Module
def initialize
super
@classifier = DSPy::Predict.new(ClassifySignature)
@analyzer = DSPy::ChainOfThought.new(AnalyzeSignature)
@summarizer = DSPy::Predict.new(SummarizeSignature)
end
def forward(text:)
classification = @classifier.call(text: text)
analysis = @analyzer.call(text: text, category: classification.category)
@summarizer.call(analysis: analysis.reasoning, category: classification.category)
end
end
# --- Observability with Spans ---
class TracedModule < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(AnalysisSignature)
end
def forward(query:)
DSPy::Context.with_span(
operation: "traced_module.analyze",
"dspy.module" => self.class.name,
"query.length" => query.length.to_s
) do
@predictor.call(query: query)
end
end
end


@@ -1,221 +0,0 @@
# frozen_string_literal: true
# =============================================================================
# DSPy.rb Signature Template — v0.34.3 API
#
# Signatures define the interface between your application and LLMs.
# They specify inputs, outputs, and task descriptions using Sorbet types.
#
# Key patterns:
# - Use T::Enum classes for controlled outputs (not inline T.enum([...]))
# - Use description: kwarg on fields to guide the LLM
# - Use default values for optional fields
# - Use Date/DateTime/Time for temporal data (auto-converted)
# - Access results with result.field (not result[:field])
# - Invoke with predictor.call() (not predictor.forward())
# =============================================================================
# --- Basic Signature ---
class SentimentAnalysis < DSPy::Signature
description "Analyze sentiment of text"
class Sentiment < T::Enum
enums do
Positive = new('positive')
Negative = new('negative')
Neutral = new('neutral')
end
end
input do
const :text, String
end
output do
const :sentiment, Sentiment
const :score, Float, description: "Confidence score from 0.0 to 1.0"
end
end
# Usage:
# predictor = DSPy::Predict.new(SentimentAnalysis)
# result = predictor.call(text: "This product is amazing!")
# result.sentiment # => Sentiment::Positive
# result.score # => 0.92
# --- Signature with Date/Time Types ---
class EventScheduler < DSPy::Signature
description "Schedule events based on requirements"
input do
const :event_name, String
const :start_date, Date # ISO 8601: YYYY-MM-DD
const :end_date, T.nilable(Date) # Optional date
const :preferred_time, DateTime # ISO 8601 with timezone
const :deadline, Time # Stored as UTC
end
output do
const :scheduled_date, Date # LLM returns ISO string, auto-converted
const :event_datetime, DateTime # Preserves timezone
const :created_at, Time # Converted to UTC
end
end
# Date/Time format handling:
# Date → ISO 8601 (YYYY-MM-DD)
# DateTime → ISO 8601 with timezone (YYYY-MM-DDTHH:MM:SS+00:00)
# Time → ISO 8601, automatically converted to UTC
# --- Signature with Default Values ---
class SmartSearch < DSPy::Signature
description "Search with intelligent defaults"
input do
const :query, String
const :max_results, Integer, default: 10
const :language, String, default: "English"
const :include_metadata, T::Boolean, default: false
end
output do
const :results, T::Array[String]
const :total_found, Integer
const :search_time_ms, Float, default: 0.0 # Fallback if LLM omits
const :cached, T::Boolean, default: false
end
end
# Input defaults reduce boilerplate:
# search = DSPy::Predict.new(SmartSearch)
# result = search.call(query: "Ruby programming")
# # max_results=10, language="English", include_metadata=false are applied
# --- Signature with Nested Structs and Field Descriptions ---
class EntityExtraction < DSPy::Signature
description "Extract named entities from text"
class EntityType < T::Enum
enums do
Person = new('person')
Organization = new('organization')
Location = new('location')
DateEntity = new('date')
end
end
class Entity < T::Struct
const :name, String, description: "The entity text as it appears in the source"
const :type, EntityType
const :confidence, Float, description: "Extraction confidence from 0.0 to 1.0"
const :start_offset, Integer, default: 0
end
input do
const :text, String
const :entity_types, T::Array[EntityType], default: [],
description: "Filter to these entity types; empty means all types"
end
output do
const :entities, T::Array[Entity]
const :total_found, Integer
end
end
# --- Signature with Union Types ---
class FlexibleClassification < DSPy::Signature
description "Classify input with flexible result type"
class Category < T::Enum
enums do
Technical = new('technical')
Business = new('business')
Personal = new('personal')
end
end
input do
const :text, String
end
output do
const :category, Category
const :result, T.any(Float, String),
description: "Numeric score or text explanation depending on classification"
const :confidence, Float
end
end
# --- Signature with Recursive Types ---
class DocumentParser < DSPy::Signature
description "Parse document into tree structure"
class NodeType < T::Enum
enums do
Heading = new('heading')
Paragraph = new('paragraph')
List = new('list')
CodeBlock = new('code_block')
end
end
class TreeNode < T::Struct
const :node_type, NodeType, description: "The type of document element"
const :text, String, default: "", description: "Text content of the node"
const :level, Integer, default: 0
const :children, T::Array[TreeNode], default: [] # Self-reference → $defs in JSON Schema
end
input do
const :html, String, description: "Raw HTML to parse"
end
output do
const :root, TreeNode
const :word_count, Integer
end
end
# The schema generator creates #/$defs/TreeNode references for recursive types,
# compatible with OpenAI and Gemini structured outputs.
# Use `default: []` instead of `T.nilable(T::Array[...])` for OpenAI compatibility.
# --- Vision Signature ---
class ImageAnalysis < DSPy::Signature
description "Analyze an image and answer questions about its content"
input do
const :image, DSPy::Image, description: "The image to analyze"
const :question, String, description: "Question about the image content"
end
output do
const :answer, String
const :confidence, Float, description: "Confidence in the answer (0.0-1.0)"
end
end
# Vision usage:
# predictor = DSPy::Predict.new(ImageAnalysis)
# result = predictor.call(
# image: DSPy::Image.from_file("path/to/image.jpg"),
# question: "What objects are visible?"
# )
# result.answer # => "The image shows..."
# --- Accessing Schemas Programmatically ---
#
# SentimentAnalysis.input_json_schema # => { type: "object", properties: { ... } }
# SentimentAnalysis.output_json_schema # => { type: "object", properties: { ... } }
#
# # Field descriptions propagate to JSON Schema
# Entity.field_descriptions[:name] # => "The entity text as it appears in the source"
# Entity.field_descriptions[:confidence] # => "Extraction confidence from 0.0 to 1.0"


@@ -1,674 +0,0 @@
# DSPy.rb Core Concepts
## Signatures
Signatures define the interface between application code and language models. They specify inputs, outputs, and a task description using Sorbet types for compile-time and runtime type safety.
### Structure
```ruby
class ClassifyEmail < DSPy::Signature
description "Classify customer support emails by urgency and category"
input do
const :subject, String
const :body, String
end
output do
const :category, String
const :urgency, String
end
end
```
### Supported Types
| Type | JSON Schema | Notes |
|------|-------------|-------|
| `String` | `string` | Required string |
| `Integer` | `integer` | Whole numbers |
| `Float` | `number` | Decimal numbers |
| `T::Boolean` | `boolean` | true/false |
| `T::Array[X]` | `array` | Typed arrays |
| `T::Hash[K, V]` | `object` | Typed key-value maps |
| `T.nilable(X)` | nullable | Optional fields |
| `Date` | `string` (ISO 8601) | Auto-converted |
| `DateTime` | `string` (ISO 8601) | Preserves timezone |
| `Time` | `string` (ISO 8601) | Converted to UTC |
### Date and Time Types
Date, DateTime, and Time fields serialize to ISO 8601 strings and auto-convert back to Ruby objects on output.
```ruby
class EventScheduler < DSPy::Signature
description "Schedule events based on requirements"
input do
const :start_date, Date # ISO 8601: YYYY-MM-DD
const :preferred_time, DateTime # ISO 8601 with timezone
const :deadline, Time # Converted to UTC
const :end_date, T.nilable(Date) # Optional date
end
output do
const :scheduled_date, Date # String from LLM, auto-converted to Date
const :event_datetime, DateTime # Preserves timezone info
const :created_at, Time # Converted to UTC
end
end
predictor = DSPy::Predict.new(EventScheduler)
result = predictor.call(
start_date: "2024-01-15",
preferred_time: "2024-01-15T10:30:45Z",
deadline: Time.now,
end_date: nil
)
result.scheduled_date.class # => Date
result.event_datetime.class # => DateTime
```
Timezone conventions follow ActiveRecord: Time objects convert to UTC, DateTime objects preserve timezone, Date objects are timezone-agnostic.
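These conventions lean on plain Ruby stdlib behavior, which can be checked directly without any LLM in the loop:

```ruby
require "date"

# Time converts to UTC explicitly
t = Time.new(2024, 1, 15, 10, 30, 0, "-06:00")
t.utc.hour # => 16 (10:30 -06:00 is 16:30 UTC)

# DateTime preserves its offset
dt = DateTime.parse("2024-01-15T10:30:45+02:00")
dt.zone # => "+02:00"

# Date carries no timezone at all
Date.iso8601("2024-01-15").to_s # => "2024-01-15"
```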
### Enums with T::Enum
Define constrained output values using `T::Enum` classes. Do not use inline `T.enum([...])` syntax.
```ruby
class SentimentAnalysis < DSPy::Signature
description "Analyze sentiment of text"
class Sentiment < T::Enum
enums do
Positive = new('positive')
Negative = new('negative')
Neutral = new('neutral')
end
end
input do
const :text, String
end
output do
const :sentiment, Sentiment
const :confidence, Float
end
end
predictor = DSPy::Predict.new(SentimentAnalysis)
result = predictor.call(text: "This product is amazing!")
result.sentiment # => #<Sentiment::Positive>
result.sentiment.serialize # => "positive"
result.confidence # => 0.92
```
Enum matching is case-insensitive. The LLM returning `"POSITIVE"` matches `new('positive')`.
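The normalization can be pictured as a simple downcase before lookup — an illustration of the behavior, not DSPy's actual implementation:

```ruby
# Illustration only — not DSPy's actual implementation.
SERIALIZED = {
  "positive" => :Positive,
  "negative" => :Negative,
  "neutral"  => :Neutral
}

def match_sentiment(raw)
  SERIALIZED[raw.to_s.strip.downcase]
end

match_sentiment("POSITIVE") # => :Positive
match_sentiment("Neutral")  # => :Neutral
```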
### Default Values
Default values work on both inputs and outputs. Input defaults reduce caller boilerplate. Output defaults provide fallbacks when the LLM omits optional fields.
```ruby
class SmartSearch < DSPy::Signature
description "Search with intelligent defaults"
input do
const :query, String
const :max_results, Integer, default: 10
const :language, String, default: "English"
end
output do
const :results, T::Array[String]
const :total_found, Integer
const :cached, T::Boolean, default: false
end
end
search = DSPy::Predict.new(SmartSearch)
result = search.call(query: "Ruby programming")
# max_results defaults to 10, language defaults to "English"
# If LLM omits `cached`, it defaults to false
```
### Field Descriptions
Add `description:` to any field to guide the LLM on expected content. These descriptions appear in the generated JSON schema sent to the model.
```ruby
class ASTNode < T::Struct
const :node_type, String, description: "The type of AST node (heading, paragraph, code_block)"
const :text, String, default: "", description: "Text content of the node"
const :level, Integer, default: 0, description: "Heading level 1-6, only for heading nodes"
const :children, T::Array[ASTNode], default: []
end
ASTNode.field_descriptions[:node_type] # => "The type of AST node ..."
ASTNode.field_descriptions[:children] # => nil (no description set)
```
Field descriptions also work inside signature `input` and `output` blocks:
```ruby
class ExtractEntities < DSPy::Signature
description "Extract named entities from text"
input do
const :text, String, description: "Raw text to analyze"
const :language, String, default: "en", description: "ISO 639-1 language code"
end
output do
const :entities, T::Array[String], description: "List of extracted entity names"
const :count, Integer, description: "Total number of unique entities found"
end
end
```
### Schema Formats
DSPy.rb supports three schema formats for communicating type structure to LLMs.
#### JSON Schema (default)
Verbose but universally supported. Access via `YourSignature.output_json_schema`.
#### BAML Schema
Compact format that reduces schema tokens by 80-85%. Requires the `sorbet-baml` gem.
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
schema_format: :baml
)
end
```
BAML applies only in Enhanced Prompting mode (`structured_outputs: false`). When `structured_outputs: true`, the provider receives JSON Schema directly.
#### TOON Schema + Data Format
Table-oriented text format that shrinks both schema definitions and prompt values.
```ruby
DSPy.configure do |c|
c.lm = DSPy::LM.new('openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
schema_format: :toon,
data_format: :toon
)
end
```
`schema_format: :toon` replaces the schema block in the system prompt. `data_format: :toon` renders input values and output templates inside `toon` fences. Only works with Enhanced Prompting mode. The `sorbet-toon` gem is included automatically as a dependency.
### Recursive Types
Structs that reference themselves produce `$defs` entries in the generated JSON schema, using `$ref` pointers to avoid infinite recursion.
```ruby
class ASTNode < T::Struct
const :node_type, String
const :text, String, default: ""
const :children, T::Array[ASTNode], default: []
end
```
The schema generator detects the self-reference in `T::Array[ASTNode]` and emits:
```json
{
"$defs": {
"ASTNode": { "type": "object", "properties": { ... } }
},
"properties": {
"children": {
"type": "array",
"items": { "$ref": "#/$defs/ASTNode" }
}
}
}
```
Access the schema with accumulated definitions via `YourSignature.output_json_schema_with_defs`.
### Union Types with T.any()
Specify fields that accept multiple types:
```ruby
output do
const :result, T.any(Float, String)
end
```
For struct unions, DSPy.rb automatically adds a `_type` discriminator field to each struct's JSON schema. The LLM returns `_type` in its response, and DSPy converts the hash to the correct struct instance.
```ruby
class CreateTask < T::Struct
const :title, String
const :priority, String
end
class DeleteTask < T::Struct
const :task_id, String
const :reason, T.nilable(String)
end
class TaskRouter < DSPy::Signature
description "Route user request to the appropriate task action"
input do
const :request, String
end
output do
const :action, T.any(CreateTask, DeleteTask)
end
end
result = DSPy::Predict.new(TaskRouter).call(request: "Create a task for Q4 review")
result.action.class # => CreateTask
result.action.title # => "Q4 Review"
```
Pattern matching works on the result:
```ruby
case result.action
when CreateTask then puts "Creating: #{result.action.title}"
when DeleteTask then puts "Deleting: #{result.action.task_id}"
end
```
Union types also work inside arrays for heterogeneous collections:
```ruby
output do
const :events, T::Array[T.any(LoginEvent, PurchaseEvent)]
end
```
Limit unions to 2-4 types for reliable LLM comprehension. Use clear struct names since they become the `_type` discriminator values.
---
## Modules
Modules are composable building blocks that wrap predictors. Define a `forward` method; invoke the module with `.call()`.
### Basic Structure
```ruby
class SentimentAnalyzer < DSPy::Module
def initialize
super
@predictor = DSPy::Predict.new(SentimentSignature)
end
def forward(text:)
@predictor.call(text: text)
end
end
analyzer = SentimentAnalyzer.new
result = analyzer.call(text: "I love this product!")
result.sentiment # => "positive"
result.confidence # => 0.9
```
**API rules:**
- Invoke modules and predictors with `.call()`, not `.forward()`.
- Access result fields with `result.field`, not `result[:field]`.
### Module Composition
Combine multiple modules through explicit method calls in `forward`:
```ruby
class DocumentProcessor < DSPy::Module
def initialize
super
@classifier = DocumentClassifier.new
@summarizer = DocumentSummarizer.new
end
def forward(document:)
classification = @classifier.call(content: document)
summary = @summarizer.call(content: document)
{
document_type: classification.document_type,
summary: summary.summary
}
end
end
```
### Lifecycle Callbacks
Modules support `before`, `after`, and `around` callbacks on `forward`. Declare them as class-level macros referencing private methods.
#### Execution order
1. `before` callbacks (in registration order)
2. `around` callbacks (before `yield`)
3. `forward` method
4. `around` callbacks (after `yield`)
5. `after` callbacks (in registration order)
```ruby
class InstrumentedModule < DSPy::Module
before :setup_metrics
after :log_metrics
around :manage_context
def initialize
super
@predictor = DSPy::Predict.new(MySignature)
@metrics = {}
end
def forward(question:)
@predictor.call(question: question)
end
private
def setup_metrics
@metrics[:start_time] = Time.now
end
def manage_context
load_context
result = yield
save_context
result
end
def log_metrics
@metrics[:duration] = Time.now - @metrics[:start_time]
end
end
```
Multiple callbacks of the same type execute in registration order. Callbacks inherit from parent classes; parent callbacks run first.
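The ordering contract can be illustrated with a plain-Ruby sketch (this is not DSPy's implementation, only a model of the documented behavior):

```ruby
# Plain-Ruby sketch of the documented ordering: parent-class callbacks
# run before child-class callbacks, each in registration order.
class CallbackBase
  def self.before(name)
    (@before ||= []) << name
  end

  def self.before_callbacks
    # Walk ancestors from furthest superclass down to self.
    ancestors.reverse.flat_map { |mod| mod.instance_variable_get(:@before) || [] }
  end

  def call
    self.class.before_callbacks.each { |cb| send(cb) }
    forward
  end

  def log = (@log ||= [])
end

class ParentModule < CallbackBase
  before :parent_setup
  def parent_setup = log << :parent_setup
  def forward = log
end

class ChildModule < ParentModule
  before :child_setup
  def child_setup = log << :child_setup
end

ChildModule.new.call # => [:parent_setup, :child_setup]
```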
#### Around callbacks
Around callbacks must call `yield` to execute the wrapped method and return the result:
```ruby
def with_retry
retries = 0
begin
yield
rescue StandardError => e
retries += 1
retry if retries < 3
raise e
end
end
```
### Instruction Update Contract
Teleprompters (GEPA, MIPROv2) require modules to expose immutable update hooks. Include `DSPy::Mixins::InstructionUpdatable` and implement `with_instruction` and `with_examples`, each returning a new instance:
```ruby
class SentimentPredictor < DSPy::Module
include DSPy::Mixins::InstructionUpdatable
def initialize
super
@predictor = DSPy::Predict.new(SentimentSignature)
end
def with_instruction(instruction)
clone = self.class.new
clone.instance_variable_set(:@predictor, @predictor.with_instruction(instruction))
clone
end
def with_examples(examples)
clone = self.class.new
clone.instance_variable_set(:@predictor, @predictor.with_examples(examples))
clone
end
end
```
If a module omits these hooks, teleprompters raise `DSPy::InstructionUpdateError` instead of silently mutating state.
---
## Predictors
Predictors are execution engines that take a signature and produce structured results from a language model. DSPy.rb provides four predictor types.
### Predict
Direct LLM call with typed input/output. Fastest option, lowest token usage.
```ruby
classifier = DSPy::Predict.new(ClassifyText)
result = classifier.call(text: "Technical document about APIs")
result.sentiment # => #<Sentiment::Positive>
result.topics # => ["APIs", "technical"]
result.confidence # => 0.92
```
### ChainOfThought
Adds a `reasoning` field to the output automatically. The model generates step-by-step reasoning before the final answer. Do not define a `:reasoning` field in the signature output when using ChainOfThought.
```ruby
class SolveMathProblem < DSPy::Signature
description "Solve mathematical word problems step by step"
input do
const :problem, String
end
output do
const :answer, String
# :reasoning is added automatically by ChainOfThought
end
end
solver = DSPy::ChainOfThought.new(SolveMathProblem)
result = solver.call(problem: "Sarah has 15 apples. She gives 7 away and buys 12 more.")
result.reasoning # => "Step by step: 15 - 7 = 8, then 8 + 12 = 20"
result.answer # => "20 apples"
```
Use ChainOfThought for complex analysis, multi-step reasoning, or when explainability matters.
### ReAct
Reasoning + Action agent that uses tools in an iterative loop. Define tools by subclassing `DSPy::Tools::Base`. Group related tools with `DSPy::Tools::Toolset`.
```ruby
class WeatherTool < DSPy::Tools::Base
extend T::Sig
tool_name "weather"
tool_description "Get weather information for a location"
sig { params(location: String).returns(String) }
def call(location:)
{ location: location, temperature: 72, condition: "sunny" }.to_json
end
end
class TravelSignature < DSPy::Signature
description "Help users plan travel"
input do
const :destination, String
end
output do
const :recommendations, String
end
end
agent = DSPy::ReAct.new(
TravelSignature,
tools: [WeatherTool.new],
max_iterations: 5
)
result = agent.call(destination: "Tokyo, Japan")
result.recommendations # => "Visit Senso-ji Temple early morning..."
result.history # => Array of reasoning steps, actions, observations
result.iterations # => 3
result.tools_used # => ["weather"]
```
Use toolsets to expose multiple tool methods from a single class:
```ruby
text_tools = DSPy::Tools::TextProcessingToolset.to_tools
agent = DSPy::ReAct.new(MySignature, tools: text_tools)
```
### CodeAct
Think-Code-Observe agent that synthesizes and executes Ruby code. Ships as a separate gem.
```ruby
# Gemfile
gem 'dspy-code_act', '~> 0.29'
```
```ruby
programmer = DSPy::CodeAct.new(ProgrammingSignature, max_iterations: 10)
result = programmer.call(task: "Calculate the factorial of 20")
```
### Predictor Comparison
| Predictor | Speed | Token Usage | Best For |
|-----------|-------|-------------|----------|
| Predict | Fastest | Low | Classification, extraction |
| ChainOfThought | Moderate | Medium-High | Complex reasoning, analysis |
| ReAct | Slower | High | Multi-step tasks with tools |
| CodeAct | Slowest | Very High | Dynamic programming, calculations |
### Concurrent Predictions
Process multiple independent predictions simultaneously using `Async::Barrier`:
```ruby
require 'async'
require 'async/barrier'
analyzer = DSPy::Predict.new(ContentAnalyzer)
documents = ["Text one", "Text two", "Text three"]
Async do
barrier = Async::Barrier.new
tasks = documents.map do |doc|
barrier.async { analyzer.call(content: doc) }
end
barrier.wait
predictions = tasks.map(&:wait)
predictions.each { |p| puts p.sentiment }
end
```
Add `gem 'async', '~> 2.29'` to the Gemfile. Handle errors within each `barrier.async` block to prevent one failure from cancelling others:
```ruby
barrier.async do
  analyzer.call(content: doc)
rescue StandardError
  nil # swallow the error so sibling tasks keep running
end
```
### Few-Shot Examples and Instruction Tuning
```ruby
classifier = DSPy::Predict.new(SentimentAnalysis)
examples = [
DSPy::FewShotExample.new(
input: { text: "Love it!" },
output: { sentiment: "positive", confidence: 0.95 }
)
]
optimized = classifier.with_examples(examples)
tuned = classifier.with_instruction("Be precise and confident.")
```
---
## Type System
### Automatic Type Conversion
DSPy.rb v0.9.0+ automatically converts LLM JSON responses to typed Ruby objects:
- **Enums**: String values become `T::Enum` instances (case-insensitive)
- **Structs**: Nested hashes become `T::Struct` objects
- **Arrays**: Elements convert recursively
- **Defaults**: Missing fields use declared defaults
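A plain-Ruby sketch of the case-insensitive matching (illustrative only; the real conversion targets `T::Enum` classes, not string arrays):

```ruby
# Illustrative only: mimics the case-insensitive matching DSPy performs
# when coercing an LLM string response into an enum value.
ALLOWED_SENTIMENTS = %w[positive negative neutral].freeze

def coerce_sentiment(raw)
  ALLOWED_SENTIMENTS.find { |value| value.casecmp?(raw.to_s) } ||
    raise(ArgumentError, "unknown sentiment: #{raw.inspect}")
end

coerce_sentiment("POSITIVE") # => "positive"
```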
### Discriminators for Union Types
When a field uses `T.any()` with struct types, DSPy adds a `_type` field to each struct's schema. On deserialization, `_type` selects the correct struct class:
```json
{
"action": {
"_type": "CreateTask",
"title": "Review Q4 Report"
}
}
```
DSPy matches `"CreateTask"` against the union members and instantiates the correct struct. No manual discriminator field is needed.
### Recursive Types
Structs referencing themselves are supported. The schema generator tracks visited types and produces `$ref` pointers under `$defs`:
```ruby
class TreeNode < T::Struct
const :label, String
const :children, T::Array[TreeNode], default: []
end
```
The generated schema uses `"$ref": "#/$defs/TreeNode"` for the children array items, preventing infinite schema expansion.
### Nesting Depth
- 1-2 levels: reliable across all providers.
- 3-4 levels: works but increases schema complexity.
- 5+ levels: may trigger OpenAI depth validation warnings and reduce LLM accuracy. Flatten deeply nested structures or split into multiple signatures.
### Tips
- Prefer `T::Array[X], default: []` over `T.nilable(T::Array[X])` -- the nilable form causes schema issues with OpenAI structured outputs.
- Use clear struct names for union types since they become `_type` discriminator values.
- Limit union types to 2-4 members for reliable model comprehension.
- Check schema compatibility with `DSPy::OpenAI::LM::SchemaConverter.validate_compatibility(schema)`.


@@ -1,366 +0,0 @@
# DSPy.rb Observability
DSPy.rb provides an event-driven observability system built on OpenTelemetry. The system replaces monkey-patching with structured event emission, pluggable listeners, automatic span creation, and non-blocking Langfuse export.
## Event System
### Emitting Events
Emit structured events with `DSPy.event`:
```ruby
DSPy.event('lm.tokens', {
'gen_ai.system' => 'openai',
'gen_ai.request.model' => 'gpt-4',
input_tokens: 150,
output_tokens: 50,
total_tokens: 200
})
```
Event names are **strings** with dot-separated namespaces (e.g., `'llm.generate'`, `'react.iteration_complete'`, `'chain_of_thought.reasoning_complete'`). Do not use symbols for event names.
Attributes must be JSON-serializable. DSPy automatically merges context (trace ID, module stack) and creates OpenTelemetry spans.
### Global Subscriptions
Subscribe to events across the entire application with `DSPy.events.subscribe`:
```ruby
# Exact event name
subscription_id = DSPy.events.subscribe('lm.tokens') do |event_name, attrs|
puts "Tokens used: #{attrs[:total_tokens]}"
end
# Wildcard pattern -- matches llm.generate, llm.stream, etc.
DSPy.events.subscribe('llm.*') do |event_name, attrs|
track_llm_usage(attrs)
end
# Catch-all wildcard
DSPy.events.subscribe('*') do |event_name, attrs|
log_everything(event_name, attrs)
end
```
Use global subscriptions for cross-cutting concerns: observability exporters (Langfuse, Datadog), centralized logging, metrics collection.
### Module-Scoped Subscriptions
Declare listeners inside a `DSPy::Module` subclass. Subscriptions automatically scope to the module instance and its descendants:
```ruby
class ResearchReport < DSPy::Module
subscribe 'lm.tokens', :track_tokens, scope: :descendants
def initialize
super
@outliner = DSPy::Predict.new(OutlineSignature)
@writer = DSPy::Predict.new(SectionWriterSignature)
@token_count = 0
end
def forward(question:)
outline = @outliner.call(question: question)
outline.sections.map do |title|
draft = @writer.call(question: question, section_title: title)
{ title: title, body: draft.paragraph }
end
end
def track_tokens(_event, attrs)
@token_count += attrs.fetch(:total_tokens, 0)
end
end
```
The `scope:` parameter accepts:
- `:descendants` (default) -- receives events from the module **and** every nested module invoked inside it.
- `DSPy::Module::SubcriptionScope::SelfOnly` -- restricts delivery to events emitted by the module instance itself; ignores descendants.
Inspect active subscriptions with `registered_module_subscriptions`. Tear down with `unsubscribe_module_events`.
### Unsubscribe and Cleanup
Remove a global listener by subscription ID:
```ruby
id = DSPy.events.subscribe('llm.*') { |name, attrs| }
DSPy.events.unsubscribe(id)
```
Build tracker classes that manage their own subscription lifecycle:
```ruby
class TokenBudgetTracker
def initialize(budget:)
@budget = budget
@usage = 0
@subscriptions = []
@subscriptions << DSPy.events.subscribe('lm.tokens') do |_event, attrs|
@usage += attrs.fetch(:total_tokens, 0)
warn("Budget hit") if @usage >= @budget
end
end
def unsubscribe
@subscriptions.each { |id| DSPy.events.unsubscribe(id) }
@subscriptions.clear
end
end
```
### Clearing Listeners in Tests
Call `DSPy.events.clear_listeners` in `before`/`after` blocks to prevent cross-contamination between test cases:
```ruby
RSpec.configure do |config|
config.after(:each) { DSPy.events.clear_listeners }
end
```
## dspy-o11y Gems
Three gems compose the observability stack:
| Gem | Purpose |
|---|---|
| `dspy` | Core event bus (`DSPy.event`, `DSPy.events`) -- always available |
| `dspy-o11y` | OpenTelemetry spans, `AsyncSpanProcessor`, `DSPy::Context.with_span` helpers |
| `dspy-o11y-langfuse` | Langfuse adapter -- configures OTLP exporter targeting Langfuse endpoints |
### Installation
```ruby
# Gemfile
gem 'dspy'
gem 'dspy-o11y' # core spans + helpers
gem 'dspy-o11y-langfuse' # Langfuse/OpenTelemetry adapter (optional)
```
If the optional gems are absent, DSPy falls back to logging-only mode with no errors.
## Langfuse Integration
### Environment Variables
```bash
# Required
export LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
export LANGFUSE_SECRET_KEY=sk-lf-your-secret-key
# Optional (defaults to https://cloud.langfuse.com)
export LANGFUSE_HOST=https://us.cloud.langfuse.com
# Tuning (optional)
export DSPY_TELEMETRY_BATCH_SIZE=100 # spans per export batch (default 100)
export DSPY_TELEMETRY_QUEUE_SIZE=1000 # max queued spans (default 1000)
export DSPY_TELEMETRY_EXPORT_INTERVAL=60 # seconds between timed exports (default 60)
export DSPY_TELEMETRY_SHUTDOWN_TIMEOUT=10 # seconds to drain on shutdown (default 10)
```
### Automatic Configuration
Call `DSPy::Observability.configure!` once at boot (it is already called automatically when `require 'dspy'` runs and Langfuse env vars are present):
```ruby
require 'dspy'
# If LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set,
# DSPy::Observability.configure! runs automatically and:
# 1. Configures the OpenTelemetry SDK with an OTLP exporter
# 2. Creates dual output: structured logs AND OpenTelemetry spans
# 3. Exports spans to Langfuse using proper authentication
# 4. Falls back gracefully if gems are missing
```
Verify status with `DSPy::Observability.enabled?`.
### Automatic Tracing
With observability enabled, every `DSPy::Module#forward` call, LM request, and tool invocation creates properly nested spans. Langfuse receives hierarchical traces:
```
Trace: abc-123-def
+-- ChainOfThought.forward [2000ms] (observation type: chain)
+-- llm.generate [1000ms] (observation type: generation)
Model: gpt-4-0613
Tokens: 100 in / 50 out / 150 total
```
DSPy maps module classes to Langfuse observation types automatically via `DSPy::ObservationType.for_module_class`:
| Module | Observation Type |
|---|---|
| `DSPy::LM` (raw chat) | `generation` |
| `DSPy::ChainOfThought` | `chain` |
| `DSPy::ReAct` | `agent` |
| Tool invocations | `tool` |
| Memory/retrieval | `retriever` |
| Embedding engines | `embedding` |
| Evaluation modules | `evaluator` |
| Generic operations | `span` |
## Score Reporting
### DSPy.score API
Report evaluation scores with `DSPy.score`:
```ruby
# Numeric (default)
DSPy.score('accuracy', 0.95)
# With comment
DSPy.score('relevance', 0.87, comment: 'High semantic similarity')
# Boolean
DSPy.score('is_valid', 1, data_type: DSPy::Scores::DataType::Boolean)
# Categorical
DSPy.score('sentiment', 'positive', data_type: DSPy::Scores::DataType::Categorical)
# Explicit trace binding
DSPy.score('accuracy', 0.95, trace_id: 'custom-trace-id')
```
Available data types: `DSPy::Scores::DataType::Numeric`, `::Boolean`, `::Categorical`.
### score.create Events
Every `DSPy.score` call emits a `'score.create'` event. Subscribe to react:
```ruby
DSPy.events.subscribe('score.create') do |event_name, attrs|
puts "#{attrs[:score_name]} = #{attrs[:score_value]}"
# Also available: attrs[:score_id], attrs[:score_data_type],
# attrs[:score_comment], attrs[:trace_id], attrs[:observation_id],
# attrs[:timestamp]
end
```
### Async Langfuse Export with DSPy::Scores::Exporter
Configure the exporter to send scores to Langfuse in the background:
```ruby
exporter = DSPy::Scores::Exporter.configure(
public_key: ENV['LANGFUSE_PUBLIC_KEY'],
secret_key: ENV['LANGFUSE_SECRET_KEY'],
host: 'https://cloud.langfuse.com'
)
# Scores are now exported automatically via a background Thread::Queue
DSPy.score('accuracy', 0.95)
# Shut down gracefully (waits up to 5 seconds by default)
exporter.shutdown
```
The exporter subscribes to `'score.create'` events internally, queues them for async processing, and retries with exponential backoff on failure.
### Automatic Export with DSPy::Evals
Pass `export_scores: true` to `DSPy::Evals` to export per-example scores and an aggregate batch score automatically:
```ruby
evaluator = DSPy::Evals.new(
program,
metric: my_metric,
export_scores: true,
score_name: 'qa_accuracy'
)
result = evaluator.evaluate(test_examples)
```
## DSPy::Context.with_span
Create manual spans for custom operations. Requires `dspy-o11y`.
```ruby
DSPy::Context.with_span(operation: 'custom.retrieval', 'retrieval.source' => 'pinecone') do |span|
results = pinecone_client.query(embedding)
  span&.set_attribute('retrieval.count', results.size)
results
end
```
Pass semantic attributes as keyword arguments alongside `operation:`. The block receives an OpenTelemetry span object (or `nil` when observability is disabled). The span automatically nests under the current parent span and records `duration.ms`, `langfuse.observation.startTime`, and `langfuse.observation.endTime`.
Assign a Langfuse observation type to custom spans:
```ruby
DSPy::Context.with_span(
operation: 'evaluate.batch',
**DSPy::ObservationType::Evaluator.langfuse_attributes,
'batch.size' => examples.length
) do |span|
run_evaluation(examples)
end
```
Scores reported inside a `with_span` block automatically inherit the current trace context.
## Module Stack Metadata
When `DSPy::Module#forward` runs, the context layer maintains a module stack. Every event includes:
```ruby
{
module_path: [
{ id: "root_uuid", class: "DeepSearch", label: nil },
{ id: "planner_uuid", class: "DSPy::Predict", label: "planner" }
],
module_root: { id: "root_uuid", class: "DeepSearch", label: nil },
module_leaf: { id: "planner_uuid", class: "DSPy::Predict", label: "planner" },
module_scope: {
ancestry_token: "root_uuid>planner_uuid",
depth: 2
}
}
```
| Key | Meaning |
|---|---|
| `module_path` | Ordered array of `{id, class, label}` entries from root to leaf |
| `module_root` | The outermost module in the current call chain |
| `module_leaf` | The innermost (currently executing) module |
| `module_scope.ancestry_token` | Stable string of joined UUIDs representing the nesting path |
| `module_scope.depth` | Integer depth of the current module in the stack |
Labels are set via `module_scope_label=` on a module instance or derived automatically from named predictors. Use this metadata to power Langfuse filters, scoped metrics, or custom event routing.
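For example, a routing predicate over the metadata shape above (pure Ruby; in practice you would call it inside a `DSPy.events.subscribe` block):

```ruby
# Routing predicate over the module stack metadata shown above:
# true only for events emitted by a nested DSPy::Predict call.
def nested_predict_event?(attrs)
  scope = attrs[:module_scope] || {}
  leaf  = attrs[:module_leaf]  || {}
  scope[:depth].to_i > 1 && leaf[:class] == "DSPy::Predict"
end

attrs = {
  module_leaf: { id: "planner_uuid", class: "DSPy::Predict", label: "planner" },
  module_scope: { ancestry_token: "root_uuid>planner_uuid", depth: 2 }
}
nested_predict_event?(attrs) # => true
```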
## Dedicated Export Worker
The `DSPy::Observability::AsyncSpanProcessor` (from `dspy-o11y`) keeps telemetry export off the hot path:
- Runs on a `Concurrent::SingleThreadExecutor` -- LLM workflows never compete with OTLP networking.
- Buffers finished spans in a `Thread::Queue` (max size configurable via `DSPY_TELEMETRY_QUEUE_SIZE`).
- Drains spans in batches of `DSPY_TELEMETRY_BATCH_SIZE` (default 100). When the queue reaches batch size, an immediate async export fires.
- A background timer thread triggers periodic export every `DSPY_TELEMETRY_EXPORT_INTERVAL` seconds (default 60).
- Applies exponential backoff (`0.1 * 2^attempt` seconds) on export failures, up to `DEFAULT_MAX_RETRIES` (3).
- On shutdown, flushes all remaining spans within `DSPY_TELEMETRY_SHUTDOWN_TIMEOUT` seconds, then terminates the executor.
- Drops the oldest span when the queue is full, logging `'observability.span_dropped'`.
No application code interacts with the processor directly. Configure it entirely through environment variables.
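The documented backoff schedule can be computed directly from the formula above:

```ruby
# Backoff delays for the documented formula (0.1 * 2^attempt seconds),
# over the default 3 retries.
backoff = ->(attempt) { 0.1 * (2**attempt) }
delays = (0...3).map(&backoff)
# delays == [0.1, 0.2, 0.4]
```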
## Built-in Events Reference
| Event Name | Emitted By | Key Attributes |
|---|---|---|
| `lm.tokens` | `DSPy::LM` | `gen_ai.system`, `gen_ai.request.model`, `input_tokens`, `output_tokens`, `total_tokens` |
| `chain_of_thought.reasoning_complete` | `DSPy::ChainOfThought` | `dspy.signature`, `cot.reasoning_steps`, `cot.reasoning_length`, `cot.has_reasoning` |
| `react.iteration_complete` | `DSPy::ReAct` | `iteration`, `thought`, `action`, `observation` |
| `codeact.iteration_complete` | `dspy-code_act` gem | `iteration`, `code_executed`, `execution_result` |
| `optimization.trial_complete` | Teleprompters (MIPROv2) | `trial_number`, `score` |
| `score.create` | `DSPy.score` | `score_name`, `score_value`, `score_data_type`, `trace_id` |
| `span.start` | `DSPy::Context.with_span` | `trace_id`, `span_id`, `parent_span_id`, `operation` |
## Best Practices
- Use dot-separated string names for events. Follow OpenTelemetry `gen_ai.*` conventions for LLM attributes.
- Always call `unsubscribe` (or `unsubscribe_module_events` for scoped subscriptions) when a tracker is no longer needed to prevent memory leaks.
- Call `DSPy.events.clear_listeners` in test teardown to avoid cross-contamination.
- Wrap risky listener logic in a rescue block. The event system isolates listener failures, but explicit rescue prevents silent swallowing of domain errors.
- Prefer module-scoped `subscribe` for agent internals. Reserve global `DSPy.events.subscribe` for infrastructure-level concerns.
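The rescue-wrapping advice can be sketched as a small helper (`make_safe_listener` is hypothetical, not part of DSPy.rb):

```ruby
# Hypothetical helper: wraps listener logic so a domain error is logged
# instead of disappearing inside the event system's failure isolation.
def make_safe_listener(&listener)
  lambda do |name, attrs|
    listener.call(name, attrs)
  rescue StandardError => e
    warn "listener failed on #{name}: #{e.class}: #{e.message}"
    nil
  end
end

safe = make_safe_listener { |_name, _attrs| raise "exporter down" }
safe.call('lm.tokens', {}) # logs the failure and returns nil instead of raising
```

Pass the wrapped lambda to a subscription with `&`, e.g. `DSPy.events.subscribe('llm.*', &make_safe_listener { |name, attrs| exporter.record(name, attrs) })` (where `exporter` is whatever sink you use).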


@@ -1,603 +0,0 @@
# DSPy.rb Optimization
## MIPROv2
MIPROv2 (Multi-prompt Instruction Proposal with Retrieval Optimization) is the primary instruction tuner in DSPy.rb. It proposes new instructions and few-shot demonstrations per predictor, evaluates them on mini-batches, and retains candidates that improve the metric. It ships as a separate gem to keep the Gaussian Process dependency tree out of apps that do not need it.
### Installation
```ruby
# Gemfile
gem "dspy"
gem "dspy-miprov2"
```
Bundler auto-requires `dspy/miprov2`. No additional `require` statement is needed.
### AutoMode presets
Use `DSPy::Teleprompt::MIPROv2::AutoMode` for preconfigured optimizers:
```ruby
light = DSPy::Teleprompt::MIPROv2::AutoMode.light(metric: metric) # 6 trials, greedy
medium = DSPy::Teleprompt::MIPROv2::AutoMode.medium(metric: metric) # 12 trials, adaptive
heavy = DSPy::Teleprompt::MIPROv2::AutoMode.heavy(metric: metric) # 18 trials, Bayesian
```
| Preset | Trials | Strategy | Use case |
|----------|--------|------------|-----------------------------------------------------|
| `light` | 6 | `:greedy` | Quick wins on small datasets or during prototyping. |
| `medium` | 12 | `:adaptive`| Balanced exploration vs. runtime for most pilots. |
| `heavy` | 18 | `:bayesian`| Highest accuracy targets or multi-stage programs. |
### Manual configuration with dry-configurable
`DSPy::Teleprompt::MIPROv2` includes `Dry::Configurable`. Configure at the class level (defaults for all instances) or instance level (overrides class defaults).
**Class-level defaults:**
```ruby
DSPy::Teleprompt::MIPROv2.configure do |config|
config.optimization_strategy = :bayesian
config.num_trials = 30
config.bootstrap_sets = 10
end
```
**Instance-level overrides:**
```ruby
optimizer = DSPy::Teleprompt::MIPROv2.new(metric: metric)
optimizer.configure do |config|
config.num_trials = 15
config.num_instruction_candidates = 6
config.bootstrap_sets = 5
config.max_bootstrapped_examples = 4
config.max_labeled_examples = 16
config.optimization_strategy = :adaptive # :greedy, :adaptive, :bayesian
config.early_stopping_patience = 3
config.init_temperature = 1.0
config.final_temperature = 0.1
config.minibatch_size = nil # nil = auto
config.auto_seed = 42
end
```
The `optimization_strategy` setting accepts symbols (`:greedy`, `:adaptive`, `:bayesian`) and coerces them internally to `DSPy::Teleprompt::OptimizationStrategy` T::Enum values.
The old `config:` constructor parameter is removed. Passing `config:` raises `ArgumentError`.
### Auto presets via configure
Instead of `AutoMode`, set the preset through the configure block:
```ruby
optimizer = DSPy::Teleprompt::MIPROv2.new(metric: metric)
optimizer.configure do |config|
config.auto_preset = DSPy::Teleprompt::AutoPreset.deserialize("medium")
end
```
### Compile and inspect
```ruby
program = DSPy::Predict.new(MySignature)
result = optimizer.compile(
program,
trainset: train_examples,
valset: val_examples
)
optimized_program = result.optimized_program
puts "Best score: #{result.best_score_value}"
```
The `result` object exposes:
- `optimized_program` -- ready-to-use predictor with updated instruction and demos.
- `optimization_trace[:trial_logs]` -- per-trial record of instructions, demos, and scores.
- `metadata[:optimizer]` -- `"MIPROv2"`, useful when persisting experiments from multiple optimizers.
### Multi-stage programs
MIPROv2 generates dataset summaries for each predictor and proposes per-stage instructions. For a ReAct agent with `thought_generator` and `observation_processor` predictors, the optimizer handles credit assignment internally. The metric only needs to evaluate the final output.
### Bootstrap sampling
During the bootstrap phase MIPROv2:
1. Generates dataset summaries from the training set.
2. Bootstraps few-shot demonstrations by running the baseline program.
3. Proposes candidate instructions grounded in the summaries and bootstrapped examples.
4. Evaluates each candidate on mini-batches drawn from the validation set.
Control the bootstrap phase with `bootstrap_sets`, `max_bootstrapped_examples`, and `max_labeled_examples`.
### Bayesian optimization
When `optimization_strategy` is `:bayesian` (or when using the `heavy` preset), MIPROv2 fits a Gaussian Process surrogate over past trial scores to select the next candidate. This replaces random search with informed exploration, reducing the number of trials needed to find high-scoring instructions.
---
## GEPA
GEPA (Genetic-Pareto Reflective Prompt Evolution) is a feedback-driven optimizer. It runs the program on a small batch, collects scores and textual feedback, and asks a reflection LM to rewrite the instruction. Improved candidates are retained on a Pareto frontier.
### Installation
```ruby
# Gemfile
gem "dspy"
gem "dspy-gepa"
```
The `dspy-gepa` gem depends on the `gepa` core optimizer gem automatically.
### Metric contract
GEPA metrics return `DSPy::Prediction` with both a numeric score and a feedback string. Do not return a plain boolean.
```ruby
metric = lambda do |example, prediction|
expected = example.expected_values[:label]
predicted = prediction.label
score = predicted == expected ? 1.0 : 0.0
feedback = if score == 1.0
"Correct (#{expected}) for: \"#{example.input_values[:text][0..60]}\""
else
"Misclassified (expected #{expected}, got #{predicted}) for: \"#{example.input_values[:text][0..60]}\""
end
DSPy::Prediction.new(score: score, feedback: feedback)
end
```
Keep the score in `[0, 1]`. Always include a short feedback message explaining what happened -- GEPA hands this text to the reflection model so it can reason about failures.
### Feedback maps
`feedback_map` targets individual predictors inside a composite module. Each entry receives keyword arguments and returns a `DSPy::Prediction`:
```ruby
feedback_map = {
'self' => lambda do |predictor_output:, predictor_inputs:, module_inputs:, module_outputs:, captured_trace:|
expected = module_inputs.expected_values[:label]
predicted = predictor_output.label
DSPy::Prediction.new(
score: predicted == expected ? 1.0 : 0.0,
feedback: "Classifier saw \"#{predictor_inputs[:text][0..80]}\" -> #{predicted} (expected #{expected})"
)
end
}
```
For single-predictor programs, key the map with `'self'`. For multi-predictor chains, add entries per component so the reflection LM sees localized context at each step. Omit `feedback_map` entirely if the top-level metric already covers the basics.
### Configuring the teleprompter
```ruby
teleprompter = DSPy::Teleprompt::GEPA.new(
metric: metric,
reflection_lm: DSPy::ReflectionLM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']),
feedback_map: feedback_map,
config: {
max_metric_calls: 600,
minibatch_size: 6,
skip_perfect_score: false
}
)
```
Key configuration knobs:
| Knob | Purpose |
|----------------------|-------------------------------------------------------------------------------------------|
| `max_metric_calls` | Hard budget on evaluation calls. Set to at least the validation set size plus a few minibatches. |
| `minibatch_size` | Examples per reflective replay batch. Smaller = cheaper iterations, noisier scores. |
| `skip_perfect_score` | Set `true` to stop early when a candidate reaches score `1.0`. |
### Minibatch sizing
| Goal | Suggested size | Rationale |
|-------------------------------------------------|----------------|------------------------------------------------------------|
| Explore many candidates within a tight budget | 3--6 | Cheap iterations, more prompt variants, noisier metrics. |
| Stable metrics when each rollout is costly | 8--12 | Smoother scores, fewer candidates unless budget is raised. |
| Investigate specific failure modes | 3--4 then 8+ | Start with breadth, increase once patterns emerge. |
### Compile and evaluate
```ruby
program = DSPy::Predict.new(MySignature)
result = teleprompter.compile(program, trainset: train, valset: val)
optimized_program = result.optimized_program
test_metrics = evaluate(optimized_program, test)
```
The `result` object exposes:
- `optimized_program` -- predictor with updated instruction and few-shot examples.
- `best_score_value` -- validation score for the best candidate.
- `metadata` -- candidate counts, trace hashes, and telemetry IDs.
### Reflection LM
Swap `DSPy::ReflectionLM` for any callable object that accepts the reflection prompt hash and returns a string. The default reflection signature extracts the new instruction from triple backticks in the response.
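For example, a decorator that satisfies the contract (hypothetical class; any object responding to `call` with a prompt hash and returning a string works):

```ruby
# Hypothetical decorator: wraps any reflection callable, logging response
# size while returning the inner response unchanged.
class LoggingReflectionLM
  def initialize(inner)
    @inner = inner
  end

  def call(prompt_hash)
    response = @inner.call(prompt_hash)
    warn "reflection response: #{response.length} chars"
    response
  end
end

base = ->(_prompt) { "```\nClassify tersely; cite the decisive phrase.\n```" }
wrapped = LoggingReflectionLM.new(base)
wrapped.call({}) # returns the inner response unchanged
```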
### Experiment tracking
Plug `GEPA::Logging::ExperimentTracker` into a persistence layer:
```ruby
tracker = GEPA::Logging::ExperimentTracker.new
tracker.with_subscriber { |event| MyModel.create!(payload: event) }
teleprompter = DSPy::Teleprompt::GEPA.new(
metric: metric,
reflection_lm: reflection_lm,
experiment_tracker: tracker,
config: { max_metric_calls: 900 }
)
```
The tracker emits Pareto update events, merge decisions, and candidate evolution records as JSONL.
### Pareto frontier
GEPA maintains a diverse candidate pool and samples from the Pareto frontier instead of mutating only the top-scoring program. This balances exploration and prevents the search from collapsing onto a single lineage.
Enable the merge proposer after multiple strong lineages emerge:
```ruby
config: {
max_metric_calls: 900,
enable_merge_proposer: true
}
```
Premature merges eat budget without meaningful gains. Gate merge on having several validated candidates first.
### Advanced options
- `acceptance_strategy:` -- plug in bespoke Pareto filters or early-stop heuristics.
- Telemetry spans emit via `GEPA::Telemetry`. Enable global observability with `DSPy.configure { |c| c.observability = true }` to stream spans to an OpenTelemetry exporter.
---
## Evaluation Framework
`DSPy::Evals` provides batch evaluation of predictors against test datasets with built-in and custom metrics.
### Basic usage
```ruby
metric = proc do |example, prediction|
prediction.answer == example.expected_values[:answer]
end
evaluator = DSPy::Evals.new(predictor, metric: metric)
result = evaluator.evaluate(
test_examples,
display_table: true,
display_progress: true
)
puts "Pass rate: #{(result.pass_rate * 100).round(1)}%"
puts "Passed: #{result.passed_examples}/#{result.total_examples}"
```
### DSPy::Example
Convert raw data into `DSPy::Example` instances before passing to optimizers or evaluators. Each example carries `input_values` and `expected_values`:
```ruby
examples = rows.map do |row|
DSPy::Example.new(
input_values: { text: row[:text] },
expected_values: { label: row[:label] }
)
end
train, val, test = split_examples(examples, train_ratio: 0.6, val_ratio: 0.2, seed: 42)
```
Hold back a test set from the optimization loop. Optimizers work on train/val; only the test set proves generalization.
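The `split_examples` helper used above is not part of DSPy.rb; a minimal deterministic version might look like this (the exact slicing behavior is an assumption):

```ruby
# Deterministic three-way split: seeded shuffle, then slice by ratio.
# Whatever remains after train and val becomes the held-out test set.
def split_examples(examples, train_ratio:, val_ratio:, seed:)
  shuffled = examples.shuffle(random: Random.new(seed))
  n_train = (shuffled.size * train_ratio).floor
  n_val = (shuffled.size * val_ratio).floor
  [
    shuffled[0, n_train],
    shuffled[n_train, n_val],
    shuffled.drop(n_train + n_val)
  ]
end
```

Seeding the shuffle keeps the split reproducible across optimization runs, so test-set scores stay comparable.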
### Built-in metrics
```ruby
# Exact match -- prediction must exactly equal expected value
metric = DSPy::Metrics.exact_match(field: :answer, case_sensitive: true)
# Contains -- prediction must contain expected substring
metric = DSPy::Metrics.contains(field: :answer, case_sensitive: false)
# Numeric difference -- numeric output within tolerance
metric = DSPy::Metrics.numeric_difference(field: :answer, tolerance: 0.01)
# Composite AND -- all sub-metrics must pass
metric = DSPy::Metrics.composite_and(
DSPy::Metrics.exact_match(field: :answer),
DSPy::Metrics.contains(field: :reasoning)
)
```
### Custom metrics
```ruby
quality_metric = lambda do |example, prediction|
return false unless prediction
score = 0.0
score += 0.5 if prediction.answer == example.expected_values[:answer]
score += 0.3 if prediction.explanation && prediction.explanation.length > 50
score += 0.2 if prediction.confidence && prediction.confidence > 0.8
score >= 0.7
end
evaluator = DSPy::Evals.new(predictor, metric: quality_metric)
```
Access prediction fields with dot notation (`prediction.answer`), not hash notation.
### Observability hooks
Register callbacks without editing the evaluator:
```ruby
DSPy::Evals.before_example do |payload|
example = payload[:example]
DSPy.logger.info("Evaluating example #{example.id}") if example.respond_to?(:id)
end
DSPy::Evals.after_batch do |payload|
result = payload[:result]
Langfuse.event(
name: 'eval.batch',
metadata: {
total: result.total_examples,
passed: result.passed_examples,
score: result.score
}
)
end
```
Available hooks: `before_example`, `after_example`, `before_batch`, `after_batch`.
### Langfuse score export
Enable `export_scores: true` to emit `score.create` events for each evaluated example and a batch score at the end:
```ruby
evaluator = DSPy::Evals.new(
predictor,
metric: metric,
export_scores: true,
score_name: 'qa_accuracy' # default: 'evaluation'
)
result = evaluator.evaluate(test_examples)
# Emits per-example scores + overall batch score via DSPy::Scores::Exporter
```
Scores attach to the current trace context automatically and flow to Langfuse asynchronously.
### Evaluation results
```ruby
result = evaluator.evaluate(test_examples)
result.score # Overall score (0.0 to 1.0)
result.passed_count # Examples that passed
result.failed_count # Examples that failed
result.error_count # Examples that errored
result.results.each do |r|
r.passed # Boolean
r.score # Numeric score
r.error # Error message if the example errored
end
```
### Integration with optimizers
```ruby
metric = proc do |example, prediction|
expected = example.expected_values[:answer].to_s.strip.downcase
predicted = prediction.answer.to_s.strip.downcase
!expected.empty? && predicted.include?(expected)
end
optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.medium(metric: metric)
result = optimizer.compile(
DSPy::Predict.new(QASignature),
trainset: train_examples,
valset: val_examples
)
evaluator = DSPy::Evals.new(result.optimized_program, metric: metric)
test_result = evaluator.evaluate(test_examples, display_table: true)
puts "Test accuracy: #{(test_result.pass_rate * 100).round(2)}%"
```
---
## Storage System
`DSPy::Storage` persists optimization results, tracks history, and manages multiple versions of optimized programs.
### ProgramStorage (low-level)
```ruby
storage = DSPy::Storage::ProgramStorage.new(storage_path: "./dspy_storage")
# Save
saved = storage.save_program(
result.optimized_program,
result,
metadata: {
signature_class: 'ClassifyText',
optimizer: 'MIPROv2',
examples_count: examples.size
}
)
puts "Stored with ID: #{saved.program_id}"
# Load
saved = storage.load_program(program_id)
predictor = saved.program
score = saved.optimization_result[:best_score_value]
# List
storage.list_programs.each do |p|
puts "#{p[:program_id]} -- score: #{p[:best_score]} -- saved: #{p[:saved_at]}"
end
```
### StorageManager (recommended)
```ruby
manager = DSPy::Storage::StorageManager.new
# Save with tags
saved = manager.save_optimization_result(
result,
tags: ['production', 'sentiment-analysis'],
description: 'Optimized sentiment classifier v2'
)
# Find programs
programs = manager.find_programs(
optimizer: 'MIPROv2',
min_score: 0.85,
tags: ['production']
)
recent = manager.find_programs(
max_age_days: 7,
signature_class: 'ClassifyText'
)
# Get best program for a signature
best = manager.get_best_program('ClassifyText')
predictor = best.program
```
Global shorthand:
```ruby
DSPy::Storage::StorageManager.save(result, metadata: { version: '2.0' })
DSPy::Storage::StorageManager.load(program_id)
DSPy::Storage::StorageManager.best('ClassifyText')
```
### Checkpoints
Create and restore checkpoints during long-running optimizations:
```ruby
# Save a checkpoint
manager.create_checkpoint(
current_result,
'iteration_50',
metadata: { iteration: 50, current_score: 0.87 }
)
# Restore
restored = manager.restore_checkpoint('iteration_50')
program = restored.program
# Auto-checkpoint every N iterations
if iteration % 10 == 0
manager.create_checkpoint(current_result, "auto_checkpoint_#{iteration}")
end
```
### Import and export
Share programs between environments:
```ruby
storage = DSPy::Storage::ProgramStorage.new
# Export
storage.export_programs(['abc123', 'def456'], './export_backup.json')
# Import
imported = storage.import_programs('./export_backup.json')
puts "Imported #{imported.size} programs"
```
### Optimization history
```ruby
history = manager.get_optimization_history
history[:summary][:total_programs]
history[:summary][:avg_score]
history[:optimizer_stats].each do |optimizer, stats|
puts "#{optimizer}: #{stats[:count]} programs, best: #{stats[:best_score]}"
end
history[:trends][:improvement_percentage]
```
### Program comparison
```ruby
comparison = manager.compare_programs(id_a, id_b)
comparison[:comparison][:score_difference]
comparison[:comparison][:better_program]
comparison[:comparison][:age_difference_hours]
```
### Storage configuration
```ruby
config = DSPy::Storage::StorageManager::StorageConfig.new
config.storage_path = Rails.root.join('dspy_storage')
config.auto_save = true
config.save_intermediate_results = false
config.max_stored_programs = 100
manager = DSPy::Storage::StorageManager.new(config: config)
```
### Cleanup
Remove old programs. Cleanup retains the best-performing and most recent programs using a weighted score (70% performance, 30% recency):
```ruby
deleted_count = manager.cleanup_old_programs
```
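The retention rule reduces to simple arithmetic; a sketch of the weighting follows, assuming both inputs are normalized to 0..1 (the exact normalization DSPy.rb uses is an assumption):

```ruby
# 70% weight on the program's best score, 30% on recency.
def retention_score(best_score, recency)
  0.7 * best_score + 0.3 * recency
end

# A slightly worse but fresher program can outrank a stale high scorer.
retention_score(0.80, 1.0).round(2) # => 0.86
retention_score(0.90, 0.0).round(2) # => 0.63
```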
### Storage events
The storage system emits structured log events for monitoring:
- `dspy.storage.save_start`, `dspy.storage.save_complete`, `dspy.storage.save_error`
- `dspy.storage.load_start`, `dspy.storage.load_complete`, `dspy.storage.load_error`
- `dspy.storage.delete`, `dspy.storage.export`, `dspy.storage.import`, `dspy.storage.cleanup`
### File layout
```
dspy_storage/
programs/
abc123def456.json
789xyz012345.json
history.json
```
---
## API rules
- Call predictors with `.call()`, not `.forward()`.
- Access prediction fields with dot notation (`result.answer`), not hash notation (`result[:answer]`).
- GEPA metrics return `DSPy::Prediction.new(score:, feedback:)`, not a boolean.
- MIPROv2 metrics may return `true`/`false`, a numeric score, or `DSPy::Prediction`.

# DSPy.rb LLM Providers
## Adapter Architecture
DSPy.rb ships provider SDKs as separate adapter gems. Install only the adapters the project needs. Each adapter gem depends on the official SDK for its provider and auto-loads when present -- no explicit `require` necessary.
```ruby
# Gemfile
gem 'dspy' # core framework (no provider SDKs)
gem 'dspy-openai' # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic' # Claude
gem 'dspy-gemini' # Gemini
gem 'dspy-ruby_llm' # RubyLLM unified adapter (12+ providers)
```
---
## Per-Provider Adapters
### dspy-openai
Covers any endpoint that speaks the OpenAI chat-completions protocol: OpenAI itself, OpenRouter, and Ollama.
**SDK dependency:** `openai ~> 0.17`
```ruby
# OpenAI
lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
# OpenRouter -- access 200+ models behind a single key
lm = DSPy::LM.new('openrouter/x-ai/grok-4-fast:free',
api_key: ENV['OPENROUTER_API_KEY']
)
# Ollama -- local models, no API key required
lm = DSPy::LM.new('ollama/llama3.2')
# Remote Ollama instance
lm = DSPy::LM.new('ollama/llama3.2',
base_url: 'https://my-ollama.example.com/v1',
api_key: 'optional-auth-token'
)
```
All three sub-adapters share the same request handling, structured-output support, and error reporting. Swap providers without changing higher-level DSPy code.
For OpenRouter models that lack native structured-output support, disable it explicitly:
```ruby
lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
api_key: ENV['OPENROUTER_API_KEY'],
structured_outputs: false
)
```
### dspy-anthropic
Provides the Claude adapter. Install it for any `anthropic/*` model id.
**SDK dependency:** `anthropic ~> 1.12`
```ruby
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
api_key: ENV['ANTHROPIC_API_KEY']
)
```
Structured outputs default to tool-based JSON extraction (`structured_outputs: true`). Set `structured_outputs: false` to use enhanced-prompting extraction instead.
```ruby
# Tool-based extraction (default, most reliable)
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
api_key: ENV['ANTHROPIC_API_KEY'],
structured_outputs: true
)
# Enhanced prompting extraction
lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514',
api_key: ENV['ANTHROPIC_API_KEY'],
structured_outputs: false
)
```
### dspy-gemini
Provides the Gemini adapter. Install it for any `gemini/*` model id.
**SDK dependency:** `gemini-ai ~> 4.3`
```ruby
lm = DSPy::LM.new('gemini/gemini-2.5-flash',
api_key: ENV['GEMINI_API_KEY']
)
```
**Environment variable:** `GEMINI_API_KEY` (also accepts `GOOGLE_API_KEY`).
---
## RubyLLM Unified Adapter
The `dspy-ruby_llm` gem provides a single adapter that routes to 12+ providers through [RubyLLM](https://rubyllm.com). Use it when a project talks to multiple providers or needs access to Bedrock, VertexAI, DeepSeek, or Mistral without dedicated adapter gems.
**SDK dependency:** `ruby_llm ~> 1.3`
### Model ID Format
Prefix every model id with `ruby_llm/`:
```ruby
lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
lm = DSPy::LM.new('ruby_llm/claude-sonnet-4-20250514')
lm = DSPy::LM.new('ruby_llm/gemini-2.5-flash')
```
The adapter detects the provider from RubyLLM's model registry automatically. For models not in the registry, pass `provider:` explicitly:
```ruby
lm = DSPy::LM.new('ruby_llm/llama3.2', provider: 'ollama')
lm = DSPy::LM.new('ruby_llm/anthropic/claude-3-opus',
api_key: ENV['OPENROUTER_API_KEY'],
provider: 'openrouter'
)
```
### Using Existing RubyLLM Configuration
When RubyLLM is already configured globally, omit the `api_key:` argument. DSPy reuses the global config automatically:
```ruby
RubyLLM.configure do |config|
config.openai_api_key = ENV['OPENAI_API_KEY']
config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
end
# No api_key needed -- picks up the global config
DSPy.configure do |c|
c.lm = DSPy::LM.new('ruby_llm/gpt-4o-mini')
end
```
When an `api_key:` (or any of `base_url:`, `timeout:`, `max_retries:`) is passed, DSPy creates a **scoped context** instead of reusing the global config.
### Cloud-Hosted Providers (Bedrock, VertexAI)
Configure RubyLLM globally first, then reference the model:
```ruby
# AWS Bedrock
RubyLLM.configure do |c|
c.bedrock_api_key = ENV['AWS_ACCESS_KEY_ID']
c.bedrock_secret_key = ENV['AWS_SECRET_ACCESS_KEY']
c.bedrock_region = 'us-east-1'
end
lm = DSPy::LM.new('ruby_llm/anthropic.claude-3-5-sonnet', provider: 'bedrock')
# Google VertexAI
RubyLLM.configure do |c|
c.vertexai_project_id = 'your-project-id'
c.vertexai_location = 'us-central1'
end
lm = DSPy::LM.new('ruby_llm/gemini-pro', provider: 'vertexai')
```
### Supported Providers Table
| Provider | Example Model ID | Notes |
|-------------|--------------------------------------------|---------------------------------|
| OpenAI | `ruby_llm/gpt-4o-mini` | Auto-detected from registry |
| Anthropic | `ruby_llm/claude-sonnet-4-20250514` | Auto-detected from registry |
| Gemini | `ruby_llm/gemini-2.5-flash` | Auto-detected from registry |
| DeepSeek | `ruby_llm/deepseek-chat` | Auto-detected from registry |
| Mistral | `ruby_llm/mistral-large` | Auto-detected from registry |
| Ollama | `ruby_llm/llama3.2` | Use `provider: 'ollama'` |
| AWS Bedrock | `ruby_llm/anthropic.claude-3-5-sonnet` | Configure RubyLLM globally |
| VertexAI | `ruby_llm/gemini-pro` | Configure RubyLLM globally |
| OpenRouter | `ruby_llm/anthropic/claude-3-opus` | Use `provider: 'openrouter'` |
| Perplexity | `ruby_llm/llama-3.1-sonar-large` | Use `provider: 'perplexity'` |
| GPUStack | `ruby_llm/model-name` | Use `provider: 'gpustack'` |
---
## Rails Initializer Pattern
Configure DSPy inside an `after_initialize` block so Rails credentials and environment are fully loaded:
```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
  next if Rails.env.test? # skip in test -- use VCR cassettes instead (`return` would raise LocalJumpError inside the block)
DSPy.configure do |config|
config.lm = DSPy::LM.new(
'openai/gpt-4o-mini',
api_key: Rails.application.credentials.openai_api_key,
structured_outputs: true
)
config.logger = if Rails.env.production?
Dry.Logger(:dspy, formatter: :json) do |logger|
logger.add_backend(stream: Rails.root.join("log/dspy.log"))
end
else
Dry.Logger(:dspy) do |logger|
logger.add_backend(level: :debug, stream: $stdout)
end
end
end
end
```
Key points:
- Wrap in `after_initialize` so `Rails.application.credentials` is available.
- Return early in the test environment. Rely on VCR cassettes for deterministic LLM responses.
- Set `structured_outputs: true` (the default) for provider-native JSON extraction.
- Use `Dry.Logger` with `:json` formatter in production for structured log parsing.
---
## Fiber-Local LM Context
`DSPy.with_lm` sets a temporary language-model override scoped to the current Fiber. Every predictor call inside the block uses the override; outside the block the previous LM takes effect again.
```ruby
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
powerful = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
classifier = Classifier.new
# Uses the global LM
result = classifier.call(text: "Hello")
# Temporarily switch to the fast model
DSPy.with_lm(fast) do
result = classifier.call(text: "Hello") # uses gpt-4o-mini
end
# Temporarily switch to the powerful model
DSPy.with_lm(powerful) do
result = classifier.call(text: "Hello") # uses claude-sonnet-4
end
```
### LM Resolution Hierarchy
DSPy resolves the active language model in this order:
1. **Instance-level LM** -- set directly on a module instance via `configure`
2. **Fiber-local LM** -- set via `DSPy.with_lm`
3. **Global LM** -- set via `DSPy.configure`
Instance-level configuration always wins, even inside a `DSPy.with_lm` block:
```ruby
classifier = Classifier.new
classifier.configure { |c| c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY']) }
fast = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
DSPy.with_lm(fast) do
classifier.call(text: "Test") # still uses claude-sonnet-4 (instance-level wins)
end
```
### configure_predictor for Fine-Grained Agent Control
Complex agents (`ReAct`, `CodeAct`, `DeepResearch`, `DeepSearch`) contain internal predictors. Use `configure` for a blanket override and `configure_predictor` to target a specific sub-predictor:
```ruby
agent = DSPy::ReAct.new(MySignature, tools: tools)
# Set a default LM for the agent and all its children
agent.configure { |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) }
# Override just the reasoning predictor with a more capable model
agent.configure_predictor('thought_generator') do |c|
c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-20250514', api_key: ENV['ANTHROPIC_API_KEY'])
end
result = agent.call(question: "Summarize the report")
```
Both methods support chaining:
```ruby
agent
.configure { |c| c.lm = cheap_model }
.configure_predictor('thought_generator') { |c| c.lm = expensive_model }
```
#### Available Predictors by Agent Type
| Agent | Internal Predictors |
|----------------------|------------------------------------------------------------------|
| `DSPy::ReAct` | `thought_generator`, `observation_processor` |
| `DSPy::CodeAct` | `code_generator`, `observation_processor` |
| `DSPy::DeepResearch` | `planner`, `synthesizer`, `qa_reviewer`, `reporter` |
| `DSPy::DeepSearch` | `seed_predictor`, `search_predictor`, `reader_predictor`, `reason_predictor` |
#### Propagation Rules
- Configuration propagates recursively to children and grandchildren.
- Children with an already-configured LM are **not** overwritten by a later parent `configure` call.
- Configure the parent first, then override specific children.
---
## Feature-Flagged Model Selection
Use a `FeatureFlags` module backed by ENV vars to centralize model selection. Each tool or agent reads its model from the flags, falling back to a global default.
```ruby
module FeatureFlags
module_function
def default_model
ENV.fetch('DSPY_DEFAULT_MODEL', 'openai/gpt-4o-mini')
end
def default_api_key
ENV.fetch('DSPY_DEFAULT_API_KEY') { ENV.fetch('OPENAI_API_KEY', nil) }
end
def model_for(tool_name)
env_key = "DSPY_MODEL_#{tool_name.upcase}"
ENV.fetch(env_key, default_model)
end
def api_key_for(tool_name)
env_key = "DSPY_API_KEY_#{tool_name.upcase}"
ENV.fetch(env_key, default_api_key)
end
end
```
### Per-Tool Model Override
Override an individual tool's model without touching application code:
```bash
# .env
DSPY_DEFAULT_MODEL=openai/gpt-4o-mini
DSPY_DEFAULT_API_KEY=sk-...
# Override the classifier to use Claude
DSPY_MODEL_CLASSIFIER=anthropic/claude-sonnet-4-20250514
DSPY_API_KEY_CLASSIFIER=sk-ant-...
# Override the summarizer to use Gemini
DSPY_MODEL_SUMMARIZER=gemini/gemini-2.5-flash
DSPY_API_KEY_SUMMARIZER=...
```
Wire each agent to its flag at initialization:
```ruby
class ClassifierAgent < DSPy::Module
def initialize
super
model = FeatureFlags.model_for('classifier')
api_key = FeatureFlags.api_key_for('classifier')
@predictor = DSPy::Predict.new(ClassifySignature)
configure { |c| c.lm = DSPy::LM.new(model, api_key: api_key) }
end
def forward(text:)
@predictor.call(text: text)
end
end
```
This pattern keeps model routing declarative and avoids scattering `DSPy::LM.new` calls across the codebase.
---
## Compatibility Matrix
Feature support across direct adapter gems. All features listed assume `structured_outputs: true` (the default).
| Feature | OpenAI | Anthropic | Gemini | Ollama | OpenRouter | RubyLLM |
|----------------------|--------|-----------|--------|----------|------------|-------------|
| Structured Output | Native JSON mode | Tool-based extraction | Native JSON schema | OpenAI-compatible JSON | Varies by model | Via `with_schema` |
| Vision (Images) | File + URL | File + Base64 | File + Base64 | Limited | Varies | Delegates to underlying provider |
| Image URLs | Yes | No | No | No | Varies | Depends on provider |
| Tool Calling | Yes | Yes | Yes | Varies | Varies | Yes |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes |
**Notes:**
- **Structured Output** is enabled by default on every adapter. Set `structured_outputs: false` to fall back to enhanced-prompting extraction.
- **Vision / Image URLs:** Only OpenAI supports passing a URL directly. For Anthropic and Gemini, load images from file or Base64:
```ruby
DSPy::Image.from_url("https://example.com/img.jpg") # OpenAI only
DSPy::Image.from_file("path/to/image.jpg") # all providers
DSPy::Image.from_base64(data, mime_type: "image/jpeg") # all providers
```
- **RubyLLM** delegates to the underlying provider, so feature support matches the provider column in the table.
### Choosing an Adapter Strategy
| Scenario | Recommended Adapter |
|-------------------------------------------|--------------------------------|
| Single provider (OpenAI, Claude, or Gemini) | Dedicated gem (`dspy-openai`, `dspy-anthropic`, `dspy-gemini`) |
| Multi-provider with per-agent model routing | `dspy-ruby_llm` |
| AWS Bedrock or Google VertexAI | `dspy-ruby_llm` |
| Local development with Ollama | `dspy-openai` (Ollama sub-adapter) or `dspy-ruby_llm` |
| OpenRouter for cost optimization | `dspy-openai` (OpenRouter sub-adapter) |
### Current Recommended Models
| Provider | Model ID | Use Case |
|-----------|---------------------------------------|-----------------------|
| OpenAI | `openai/gpt-4o-mini` | Fast, cost-effective |
| Anthropic | `anthropic/claude-sonnet-4-20250514` | Balanced reasoning |
| Gemini | `gemini/gemini-2.5-flash` | Fast, cost-effective |
| Ollama | `ollama/llama3.2` | Local, zero API cost |

# DSPy.rb Toolsets
## Tools::Base
`DSPy::Tools::Base` is the base class for single-purpose tools. Each subclass exposes one operation to an LLM agent through a `call` method.
### Defining a Tool
Set the tool's identity with the `tool_name` and `tool_description` class-level DSL methods. Define the `call` instance method with a Sorbet `sig` declaration so DSPy.rb can generate the JSON schema the LLM uses to invoke the tool.
```ruby
class WeatherLookup < DSPy::Tools::Base
extend T::Sig
tool_name "weather_lookup"
tool_description "Look up current weather for a given city"
sig { params(city: String, units: T.nilable(String)).returns(String) }
def call(city:, units: nil)
# Fetch weather data and return a string summary
"72F and sunny in #{city}"
end
end
```
Key points:
- Inherit from `DSPy::Tools::Base`, not `DSPy::Tool`.
- Use `tool_name` (class method) to set the name the LLM sees. Without it, the class name is lowercased as a fallback.
- Use `tool_description` (class method) to set the human-readable description surfaced in the tool schema.
- The `call` method should use **keyword arguments**. Positional arguments are supported, but keyword arguments produce better schemas.
- Always attach a Sorbet `sig` to `call`. Without a signature, the generated schema has empty properties and the LLM cannot determine parameter types.
### Schema Generation
`call_schema_object` introspects the Sorbet signature on `call` and returns a hash representing the JSON Schema `parameters` object:
```ruby
WeatherLookup.call_schema_object
# => {
# type: "object",
# properties: {
# city: { type: "string", description: "Parameter city" },
# units: { type: "string", description: "Parameter units (optional)" }
# },
# required: ["city"]
# }
```
`call_schema` wraps this in the full LLM tool-calling format:
```ruby
WeatherLookup.call_schema
# => {
# type: "function",
# function: {
# name: "call",
# description: "Call the WeatherLookup tool",
# parameters: { ... }
# }
# }
```
### Using Tools with ReAct
Pass tool instances in an array to `DSPy::ReAct`:
```ruby
agent = DSPy::ReAct.new(
MySignature,
tools: [WeatherLookup.new, AnotherTool.new]
)
result = agent.call(question: "What is the weather in Berlin?")
puts result.answer
```
Access output fields with dot notation (`result.answer`), not hash access (`result[:answer]`).
---
## Tools::Toolset
`DSPy::Tools::Toolset` groups multiple related methods into a single class. Each exposed method becomes an independent tool from the LLM's perspective.
### Defining a Toolset
```ruby
class DatabaseToolset < DSPy::Tools::Toolset
extend T::Sig
toolset_name "db"
tool :query, description: "Run a read-only SQL query"
tool :insert, description: "Insert a record into a table"
tool :delete, description: "Delete a record by ID"
sig { params(sql: String).returns(String) }
def query(sql:)
# Execute read query
end
sig { params(table: String, data: T::Hash[String, String]).returns(String) }
def insert(table:, data:)
# Insert record
end
sig { params(table: String, id: Integer).returns(String) }
def delete(table:, id:)
# Delete record
end
end
```
### DSL Methods
**`toolset_name(name)`** -- Set the prefix for all generated tool names. If omitted, the class name minus `Toolset` suffix is lowercased (e.g., `DatabaseToolset` becomes `database`).
```ruby
toolset_name "db"
# tool :query produces a tool named "db_query"
```
**`tool(method_name, tool_name:, description:)`** -- Expose a method as a tool.
- `method_name` (Symbol, required) -- the instance method to expose.
- `tool_name:` (String, optional) -- override the default `<toolset_name>_<method_name>` naming.
- `description:` (String, optional) -- description shown to the LLM. Defaults to a humanized version of the method name.
```ruby
tool :word_count, tool_name: "text_wc", description: "Count lines, words, and characters"
# Produces a tool named "text_wc" instead of "text_word_count"
```
### Converting to a Tool Array
Call `to_tools` on the class (not an instance) to get an array of `ToolProxy` objects compatible with `DSPy::Tools::Base`:
```ruby
agent = DSPy::ReAct.new(
AnalyzeText,
tools: DatabaseToolset.to_tools
)
```
Each `ToolProxy` wraps one method, delegates `call` to the underlying toolset instance, and generates its own JSON schema from the method's Sorbet signature.
### Shared State
All tool proxies from a single `to_tools` call share one toolset instance. Store shared state (connections, caches, configuration) in the toolset's `initialize`:
```ruby
class ApiToolset < DSPy::Tools::Toolset
extend T::Sig
toolset_name "api"
tool :get, description: "Make a GET request"
tool :post, description: "Make a POST request"
sig { params(base_url: String).void }
def initialize(base_url:)
@base_url = base_url
@client = HTTP.persistent(base_url)
end
sig { params(path: String).returns(String) }
def get(path:)
@client.get("#{@base_url}#{path}").body.to_s
end
sig { params(path: String, body: String).returns(String) }
def post(path:, body:)
@client.post("#{@base_url}#{path}", body: body).body.to_s
end
end
```
---
## Type Safety
Sorbet signatures on tool methods drive both JSON schema generation and automatic type coercion of LLM responses.
### Basic Types
```ruby
sig { params(
text: String,
count: Integer,
score: Float,
enabled: T::Boolean,
threshold: Numeric
).returns(String) }
def analyze(text:, count:, score:, enabled:, threshold:)
# ...
end
```
| Sorbet Type | JSON Schema |
|------------------|----------------------------------------------------|
| `String` | `{"type": "string"}` |
| `Integer` | `{"type": "integer"}` |
| `Float` | `{"type": "number"}` |
| `Numeric` | `{"type": "number"}` |
| `T::Boolean` | `{"type": "boolean"}` |
| `T::Enum` | `{"type": "string", "enum": [...]}` |
| `T::Struct` | `{"type": "object", "properties": {...}}` |
| `T::Array[Type]` | `{"type": "array", "items": {...}}` |
| `T::Hash[K, V]` | `{"type": "object", "additionalProperties": {...}}`|
| `T.nilable(Type)`| `{"type": [original, "null"]}` |
| `T.any(T1, T2)` | `{"oneOf": [{...}, {...}]}` |
| `T.class_of(X)` | `{"type": "string"}` |
### T::Enum Parameters
Define a `T::Enum` and reference it in a tool signature. DSPy.rb generates a JSON Schema `enum` constraint and automatically deserializes the LLM's string response into the correct enum instance.
```ruby
class Priority < T::Enum
enums do
Low = new('low')
Medium = new('medium')
High = new('high')
Critical = new('critical')
end
end
class Status < T::Enum
enums do
Pending = new('pending')
InProgress = new('in-progress')
Completed = new('completed')
end
end
sig { params(priority: Priority, status: Status).returns(String) }
def update_task(priority:, status:)
"Updated to #{priority.serialize} / #{status.serialize}"
end
```
The generated schema constrains the parameter to valid values:
```json
{
"priority": {
"type": "string",
"enum": ["low", "medium", "high", "critical"]
}
}
```
**Case-insensitive matching**: When the LLM returns `"HIGH"` or `"High"` instead of `"high"`, DSPy.rb first tries an exact `try_deserialize`, then falls back to a case-insensitive lookup. This prevents failures caused by LLM casing variations.
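The fallback amounts to a two-step lookup. Sketched over plain serialized strings (DSPy.rb operates on the enum's values; this helper is illustrative, not the library's code):

```ruby
# Exact match first, then a case-insensitive scan over serialized values.
def coerce_enum_value(serialized_values, raw)
  serialized_values.find { |v| v == raw } ||
    serialized_values.find { |v| v.casecmp?(raw) }
end

coerce_enum_value(%w[low medium high critical], "high")   # => "high"
coerce_enum_value(%w[low medium high critical], "HIGH")   # => "high"
coerce_enum_value(%w[low medium high critical], "urgent") # => nil
```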
### T::Struct Parameters
Use `T::Struct` for complex nested objects. DSPy.rb generates nested JSON Schema properties and recursively coerces the LLM's hash response into struct instances.
```ruby
class TaskMetadata < T::Struct
prop :id, String
prop :priority, Priority
prop :tags, T::Array[String]
prop :estimated_hours, T.nilable(Float), default: nil
end
class TaskRequest < T::Struct
prop :title, String
prop :description, String
prop :status, Status
prop :metadata, TaskMetadata
prop :assignees, T::Array[String]
end
sig { params(task: TaskRequest).returns(String) }
def create_task(task:)
"Created: #{task.title} (#{task.status.serialize})"
end
```
The LLM sees the full nested object schema and DSPy.rb reconstructs the struct tree from the JSON response, including enum fields inside nested structs.
### Nilable Parameters
Mark optional parameters with `T.nilable(...)` and provide a default value of `nil` in the method signature. These parameters are excluded from the JSON Schema `required` array.
```ruby
sig { params(
query: String,
max_results: T.nilable(Integer),
filter: T.nilable(String)
).returns(String) }
def search(query:, max_results: nil, filter: nil)
# query is required; max_results and filter are optional
end
```
### Collections
Typed arrays and hashes generate precise item/value schemas:
```ruby
sig { params(
tags: T::Array[String],
priorities: T::Array[Priority],
config: T::Hash[String, T.any(String, Integer, Float)]
).returns(String) }
def configure(tags:, priorities:, config:)
# Array elements and hash values are validated and coerced
end
```
### Union Types
`T.any(...)` generates a `oneOf` JSON Schema. When one of the union members is a `T::Struct`, DSPy.rb uses the `_type` discriminator field to select the correct struct class during coercion.
```ruby
sig { params(value: T.any(String, Integer, Float)).returns(String) }
def handle_flexible(value:)
# Accepts multiple types
end
```
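The struct-selection step can be sketched with plain `Struct` stand-ins (an illustration of the `_type` mechanism, not DSPy.rb's coercion code; the class names are hypothetical):

```ruby
CircleParams = Struct.new(:radius)
RectParams   = Struct.new(:width, :height)

# Pick the candidate class whose demodulized name matches the payload's
# _type discriminator field; returns nil when nothing matches.
def resolve_union_struct(candidates, payload)
  candidates.find { |klass| klass.name.split("::").last == payload["_type"] }
end

resolve_union_struct([CircleParams, RectParams], { "_type" => "RectParams" })
# => RectParams
```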
---
## Built-in Toolsets
### TextProcessingToolset
`DSPy::Tools::TextProcessingToolset` provides Unix-style text analysis and manipulation operations. Toolset name prefix: `text`.
| Tool Name | Method | Description |
|-----------------------------------|-------------------|--------------------------------------------|
| `text_grep` | `grep` | Search for patterns with optional case-insensitive and count-only modes |
| `text_wc` | `word_count` | Count lines, words, and characters |
| `text_rg` | `ripgrep` | Fast pattern search with context lines |
| `text_extract_lines` | `extract_lines` | Extract a range of lines by number |
| `text_filter_lines` | `filter_lines` | Keep or reject lines matching a regex |
| `text_unique_lines` | `unique_lines` | Deduplicate lines, optionally preserving order |
| `text_sort_lines` | `sort_lines` | Sort lines alphabetically or numerically |
| `text_summarize_text` | `summarize_text` | Produce a statistical summary (counts, averages, frequent words) |
Usage:
```ruby
agent = DSPy::ReAct.new(
AnalyzeText,
tools: DSPy::Tools::TextProcessingToolset.to_tools
)
result = agent.call(text: log_contents, question: "How many error lines are there?")
puts result.answer
```
### GitHubCLIToolset
`DSPy::Tools::GitHubCLIToolset` wraps the `gh` CLI for read-oriented GitHub operations. Toolset name prefix: `github`.
| Tool Name | Method | Description |
|------------------------|-------------------|---------------------------------------------------|
| `github_list_issues` | `list_issues` | List issues filtered by state, labels, assignee |
| `github_list_prs` | `list_prs` | List pull requests filtered by state, author, base|
| `github_get_issue` | `get_issue` | Retrieve details of a single issue |
| `github_get_pr` | `get_pr` | Retrieve details of a single pull request |
| `github_api_request` | `api_request` | Make an arbitrary GET request to the GitHub API |
| `github_traffic_views` | `traffic_views` | Fetch repository traffic view counts |
| `github_traffic_clones`| `traffic_clones` | Fetch repository traffic clone counts |
This toolset uses `T::Enum` parameters (`IssueState`, `PRState`, `ReviewState`) for state filters, demonstrating enum-based tool signatures in practice.
```ruby
agent = DSPy::ReAct.new(
RepoAnalysis,
tools: DSPy::Tools::GitHubCLIToolset.to_tools
)
```
---
## Testing
### Unit Testing Individual Tools
Test `DSPy::Tools::Base` subclasses by instantiating and calling `call` directly:
```ruby
RSpec.describe WeatherLookup do
subject(:tool) { described_class.new }
it "returns weather for a city" do
result = tool.call(city: "Berlin")
expect(result).to include("Berlin")
end
it "exposes the correct tool name" do
expect(tool.name).to eq("weather_lookup")
end
it "generates a valid schema" do
schema = described_class.call_schema_object
expect(schema[:required]).to include("city")
expect(schema[:properties]).to have_key(:city)
end
end
```
### Unit Testing Toolsets
Test toolset methods directly on an instance. Verify tool generation with `to_tools`:
```ruby
RSpec.describe DatabaseToolset do
subject(:toolset) { described_class.new }
it "executes a query" do
result = toolset.query(sql: "SELECT 1")
expect(result).to be_a(String)
end
it "generates tools with correct names" do
tools = described_class.to_tools
names = tools.map(&:name)
expect(names).to contain_exactly("db_query", "db_insert", "db_delete")
end
it "generates tool descriptions" do
tools = described_class.to_tools
query_tool = tools.find { |t| t.name == "db_query" }
expect(query_tool.description).to eq("Run a read-only SQL query")
end
end
```
### Mocking Predictions Inside Tools
When a tool calls a DSPy predictor internally, stub the predictor to isolate tool logic from LLM calls:
```ruby
class SmartSearchTool < DSPy::Tools::Base
extend T::Sig
tool_name "smart_search"
tool_description "Search with query expansion"
sig { void }
def initialize
@expander = DSPy::Predict.new(QueryExpansionSignature)
end
sig { params(query: String).returns(String) }
def call(query:)
expanded = @expander.call(query: query)
perform_search(expanded.expanded_query)
end
private
def perform_search(query)
# actual search logic
end
end
RSpec.describe SmartSearchTool do
subject(:tool) { described_class.new }
before do
expansion_result = double("result", expanded_query: "expanded test query")
allow_any_instance_of(DSPy::Predict).to receive(:call).and_return(expansion_result)
end
it "expands the query before searching" do
allow(tool).to receive(:perform_search).with("expanded test query").and_return("found 3 results")
result = tool.call(query: "test")
expect(result).to eq("found 3 results")
end
end
```
### Testing Enum Coercion
Verify that string values from LLM responses deserialize into the correct enum instances:
```ruby
RSpec.describe "enum coercion" do
  it "deserializes string values into enum instances" do
    # Sorbet's T::Enum.deserialize maps serialized strings back to instances
    expect(IssueState.deserialize("open")).to eq(IssueState::Open)
  end
  it "accepts enum instances in tool calls" do
    toolset = GitHubCLIToolset.new
    result = toolset.list_issues(state: IssueState::Open)
    expect(result).to be_a(String)
  end
end
```
---
## Constraints
- All exposed tool methods should use **keyword arguments**. Positional parameters still generate schemas, but keyword arguments produce more reliable LLM interactions.
- Each exposed method becomes a **separate, independent tool**. Method chaining or multi-step sequences within a single tool call are not supported.
- Shared state across tool proxies is scoped to a single `to_tools` call. Separate `to_tools` invocations create separate toolset instances.
- Methods without a Sorbet `sig` produce an empty parameter schema. The LLM will not know what arguments to pass.


@@ -0,0 +1,221 @@
---
name: fastapi-style
description: This skill should be used when writing Python and FastAPI code following opinionated best practices. It applies when building APIs, creating Pydantic models, working with SQLAlchemy, or any FastAPI application. Triggers on FastAPI code generation, API design, refactoring requests, code review, or when discussing async Python patterns. Embodies thin routers, rich Pydantic models, dependency injection, async-first design, and the "explicit is better than implicit" philosophy.
---
<objective>
Apply opinionated FastAPI conventions to Python API code. This skill provides comprehensive domain expertise for building maintainable, performant FastAPI applications following established patterns from production codebases.
</objective>
<essential_principles>
## Core Philosophy
"Explicit is better than implicit. Simple is better than complex."
**The FastAPI Way:**
- Thin routers, rich Pydantic models with validation
- Dependency injection for everything
- Async-first with SQLAlchemy 2.0
- Type hints everywhere - let the tools help you
- Settings via pydantic-settings, not raw env vars
- Database-backed solutions where possible
**What to deliberately avoid:**
- Flask patterns (global request context)
- Django ORM in FastAPI (use SQLAlchemy 2.0)
- Synchronous database calls (use async)
- Manual JSON serialization (Pydantic handles it)
- Global state (use dependency injection)
- `*` imports (explicit imports only)
- Circular imports (proper module structure)
**Development Philosophy:**
- Type everything - mypy should pass
- Fail fast with descriptive errors
- Write-time validation over read-time checks
- Database constraints complement Pydantic validation
- Tests are documentation
</essential_principles>
<intake>
What are you working on?
1. **Routers** - Route organization, dependency injection, response models
2. **Models** - Pydantic schemas, SQLAlchemy models, validation patterns
3. **Database** - SQLAlchemy 2.0 async, Alembic migrations, transactions
4. **Testing** - pytest, httpx TestClient, fixtures, async testing
5. **Security** - OAuth2, JWT, permissions, CORS, rate limiting
6. **Background Tasks** - Celery, ARQ, or FastAPI BackgroundTasks
7. **Code Review** - Review code against FastAPI best practices
8. **General Guidance** - Philosophy and conventions
**Specify a number or describe your task.**
</intake>
<routing>
| Response | Reference to Read |
|----------|-------------------|
| 1, router, route, endpoint | [routers.md](./references/routers.md) |
| 2, model, pydantic, schema, sqlalchemy | [models.md](./references/models.md) |
| 3, database, db, alembic, migration, transaction | [database.md](./references/database.md) |
| 4, test, testing, pytest, fixture | [testing.md](./references/testing.md) |
| 5, security, auth, oauth, jwt, permission | [security.md](./references/security.md) |
| 6, background, task, celery, arq, queue | [background_tasks.md](./references/background_tasks.md) |
| 7, review | Read all references, then review code |
| 8, general task | Read relevant references based on context |
**After reading relevant references, apply patterns to the user's code.**
</routing>
<quick_reference>
## Project Structure
```
app/
├── main.py              # FastAPI app creation, middleware
├── config.py            # Settings via pydantic-settings
├── dependencies.py      # Shared dependencies
├── database.py          # Database session, engine
├── models/              # SQLAlchemy models
│   ├── __init__.py
│   ├── base.py          # Base model class
│   └── user.py
├── schemas/             # Pydantic models
│   ├── __init__.py
│   └── user.py
├── routers/             # API routers
│   ├── __init__.py
│   └── users.py
├── services/            # Business logic (if needed)
├── utils/               # Shared utilities
└── tests/
    ├── conftest.py      # Fixtures
    └── test_users.py
```
## Naming Conventions
**Pydantic Schemas:**
- `UserCreate` - input for creation
- `UserUpdate` - input for updates (all fields Optional)
- `UserRead` - output representation
- `UserInDB` - internal with hashed password
**SQLAlchemy Models:** Singular nouns (`User`, `Item`, `Order`)
**Routers:** Plural resource names (`users.py`, `items.py`)
**Dependencies:** Verb phrases (`get_current_user`, `get_db_session`)
## Type Hints
```python
# Always type function signatures
async def get_user(
user_id: int,
db: AsyncSession = Depends(get_db),
) -> User:
...
# Use Annotated for dependency injection
from typing import Annotated
CurrentUser = Annotated[User, Depends(get_current_user)]
DBSession = Annotated[AsyncSession, Depends(get_db)]
```
## Response Patterns
```python
# Explicit response_model
@router.get("/users/{user_id}", response_model=UserRead)
async def get_user(user_id: int, db: DBSession) -> User:
...
# Status codes
@router.post("/users", status_code=status.HTTP_201_CREATED)
async def create_user(...) -> UserRead:
...
# Multiple response types
@router.get("/users/{user_id}", responses={404: {"model": ErrorResponse}})
```
## Error Handling
```python
from fastapi import HTTPException, status
# Specific exceptions
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="User not found",
)
# Custom exception handlers
@app.exception_handler(ValidationError)
async def validation_exception_handler(request, exc):
return JSONResponse(status_code=422, content={"detail": exc.errors()})
```
## Dependency Injection
```python
# Simple dependency
async def get_db() -> AsyncGenerator[AsyncSession, None]:
async with async_session() as session:
yield session
# Parameterized dependency
def get_pagination(
skip: int = Query(0, ge=0),
limit: int = Query(100, ge=1, le=1000),
) -> dict:
return {"skip": skip, "limit": limit}
# Class-based dependency
class CommonQueryParams:
def __init__(self, q: str | None = None, skip: int = 0, limit: int = 100):
self.q = q
self.skip = skip
self.limit = limit
```
</quick_reference>
<reference_index>
## Domain Knowledge
All detailed patterns in `references/`:
| File | Topics |
|------|--------|
| [routers.md](./references/routers.md) | Route organization, dependency injection, response models, middleware, versioning |
| [models.md](./references/models.md) | Pydantic schemas, SQLAlchemy models, validation, serialization, mixins |
| [database.md](./references/database.md) | SQLAlchemy 2.0 async, Alembic migrations, transactions, connection pooling |
| [testing.md](./references/testing.md) | pytest, httpx TestClient, fixtures, async testing, mocking patterns |
| [security.md](./references/security.md) | OAuth2, JWT, permissions, CORS, rate limiting, secrets management |
| [background_tasks.md](./references/background_tasks.md) | FastAPI BackgroundTasks, Celery, ARQ, task patterns |
</reference_index>
<success_criteria>
Code follows FastAPI best practices when:
- Routers are thin, focused on HTTP concerns only
- Pydantic models handle all validation and serialization
- SQLAlchemy 2.0 async patterns used correctly
- Dependencies injected, not imported as globals
- Type hints on all function signatures
- Settings via pydantic-settings
- Tests use pytest with async support
- Error handling is explicit and informative
- Security follows OAuth2/JWT standards
- Background tasks use appropriate tool for the job
</success_criteria>
<credits>
Based on FastAPI best practices from the official documentation, real-world production patterns, and the Python community's collective wisdom.
**Key Resources:**
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [SQLAlchemy 2.0 Documentation](https://docs.sqlalchemy.org/)
- [Pydantic V2 Documentation](https://docs.pydantic.dev/)
</credits>


@@ -45,6 +45,7 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
**Required sections:**
- **Problem Statement** - What is broken, missing, or needs improvement?
- **Assessment (Pressure Test)** - For code review findings: verification results and engineering judgment
- **Findings** - Investigation results, root cause, key discoveries
- **Proposed Solutions** - Multiple options with pros/cons, effort, risk
- **Recommended Action** - Clear plan (filled during triage)
@@ -56,6 +57,12 @@ Each todo is a markdown file with YAML frontmatter and structured sections. Use
- **Resources** - Links to errors, tests, PRs, documentation
- **Notes** - Additional context or decisions
**Assessment section fields (for code review findings):**
- Assessment: Clear & Correct | Unclear | Likely Incorrect | YAGNI
- Recommended Action: Fix now | Clarify | Push back | Skip
- Verified: Code, Tests, Usage, Prior Decisions (Yes/No with details)
- Technical Justification: Why this finding is valid or should be skipped
**YAML frontmatter fields:**
```yaml
---


@@ -19,6 +19,22 @@ What is broken, missing, or needs improvement? Provide clear context about why t
- Email service is missing proper error handling for rate-limit scenarios
- Documentation doesn't cover the new authentication flow
## Assessment (Pressure Test)
*(For findings from code review or automated agents)*
| Criterion | Result |
|-----------|--------|
| **Assessment** | Clear & Correct / Unclear / Likely Incorrect / YAGNI |
| **Recommended Action** | Fix now / Clarify / Push back / Skip |
| **Verified Code?** | Yes/No - [what was checked] |
| **Verified Tests?** | Yes/No - [existing coverage] |
| **Verified Usage?** | Yes/No - [how code is used] |
| **Prior Decisions?** | Yes/No - [any intentional design] |
**Technical Justification:**
[If pushing back or marking YAGNI, provide specific technical reasoning. Reference codebase constraints, requirements, or trade-offs.]
## Findings
Investigation results, root cause analysis, and key discoveries.


@@ -0,0 +1,369 @@
---
name: python-package-writer
description: This skill should be used when writing Python packages following production-ready patterns and philosophy. It applies when creating new Python packages, refactoring existing packages, designing package APIs, or when clean, minimal, well-tested Python library code is needed. Triggers on requests like "create a package", "write a Python library", "design a package API", or mentions of PyPI publishing.
---
# Python Package Writer
Write Python packages following battle-tested patterns from production-ready libraries. Emphasis on simplicity, minimal dependencies, comprehensive testing, and modern packaging standards (pyproject.toml, type hints, pytest).
## Core Philosophy
**Simplicity over cleverness.** Zero or minimal dependencies. Explicit code over magic. Framework integration without framework coupling. Every pattern serves production use cases.
## Package Structure (src layout)
The modern recommended layout with proper namespace isolation:
```
package-name/
├── pyproject.toml           # All metadata and configuration
├── README.md
├── LICENSE
├── src/
│   └── package_name/        # Actual package code
│       ├── __init__.py      # Entry point, exports, version
│       ├── core.py          # Core functionality
│       ├── models.py        # Data models (Pydantic/dataclasses)
│       ├── exceptions.py    # Custom exceptions
│       └── py.typed         # PEP 561 marker (ships inside the package, not at the repo root)
└── tests/
    ├── conftest.py          # Pytest fixtures
    ├── test_core.py
    └── test_models.py
```
## Entry Point Structure
Every package follows this pattern in `src/package_name/__init__.py`:
```python
"""Package description - one line."""
# Public API exports
from package_name.core import Client, process_data
from package_name.models import Config, Result
from package_name.exceptions import PackageError, ValidationError
__version__ = "1.0.0"
__all__ = [
"Client",
"process_data",
"Config",
"Result",
"PackageError",
"ValidationError",
]
```
## pyproject.toml Configuration
Modern packaging with all metadata in one file:
```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "package-name"
version = "1.0.0"
description = "Brief description of what the package does"
readme = "README.md"
license = "MIT"
requires-python = ">=3.10"
authors = [
{ name = "Your Name", email = "you@example.com" }
]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Typing :: Typed",
]
keywords = ["keyword1", "keyword2"]
# Zero or minimal runtime dependencies
dependencies = []
[project.optional-dependencies]
dev = [
"pytest>=8.0",
"pytest-cov>=4.0",
"ruff>=0.4",
"mypy>=1.0",
]
# Optional integrations
fastapi = ["fastapi>=0.100", "pydantic>=2.0"]
[project.urls]
Homepage = "https://github.com/username/package-name"
Documentation = "https://package-name.readthedocs.io"
Repository = "https://github.com/username/package-name"
Changelog = "https://github.com/username/package-name/blob/main/CHANGELOG.md"
[tool.hatch.build.targets.wheel]
packages = ["src/package_name"]
[tool.ruff]
target-version = "py310"
line-length = 88
[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP", "B", "C4", "SIM"]
[tool.mypy]
python_version = "3.10"
strict = true
warn_return_any = true
warn_unused_ignores = true
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-ra -q"
[tool.coverage.run]
source = ["src/package_name"]
branch = true
```
## Configuration Pattern
Use module-level configuration with dataclasses or simple attributes:
```python
# src/package_name/config.py
from dataclasses import dataclass, field
from os import environ
from typing import Any
@dataclass
class Config:
"""Package configuration with sensible defaults."""
timeout: int = 30
retries: int = 3
api_key: str | None = field(default=None)
debug: bool = False
def __post_init__(self) -> None:
# Environment variable fallbacks
if self.api_key is None:
self.api_key = environ.get("PACKAGE_API_KEY")
# Module-level singleton (optional)
_config: Config | None = None
def get_config() -> Config:
"""Get or create the global config instance."""
global _config
if _config is None:
_config = Config()
return _config
def configure(**kwargs: Any) -> Config:
"""Configure the package with custom settings."""
global _config
_config = Config(**kwargs)
return _config
```
## Error Handling
Simple hierarchy with informative messages:
```python
# src/package_name/exceptions.py
class PackageError(Exception):
"""Base exception for all package errors."""
pass
class ConfigError(PackageError):
"""Invalid configuration."""
pass
class ValidationError(PackageError):
"""Data validation failed."""
def __init__(self, message: str, field: str | None = None) -> None:
self.field = field
super().__init__(message)
class APIError(PackageError):
"""External API error."""
def __init__(self, message: str, status_code: int | None = None) -> None:
self.status_code = status_code
super().__init__(message)
# Validate early with ValueError
def process(data: bytes) -> str:
if not data:
raise ValueError("Data cannot be empty")
if len(data) > 1_000_000:
raise ValueError(f"Data too large: {len(data)} bytes (max 1MB)")
return data.decode("utf-8")
```
## Type Hints
Always use type hints with modern syntax (Python 3.10+):
```python
# Use built-in generics, not typing module
from collections.abc import Callable, Iterator, Mapping, Sequence
def process_items(
items: list[str],
transform: Callable[[str], str] | None = None,
*,
batch_size: int = 100,
) -> Iterator[str]:
"""Process items with optional transformation."""
for item in items:
if transform:
yield transform(item)
else:
yield item
# Use | for unions, not Union
def get_value(key: str) -> str | None:
return _cache.get(key)
# Use Self for return type annotations (Python 3.11+)
from typing import Self
class Client:
def configure(self, **kwargs: str) -> Self:
# Update configuration
return self
```
## Testing (pytest)
```python
# tests/conftest.py
import pytest
from package_name import Config, configure
@pytest.fixture
def config() -> Config:
"""Fresh config for each test."""
return configure(timeout=5, debug=True)
@pytest.fixture
def sample_data() -> bytes:
"""Sample input data."""
return b"test data content"
# tests/test_core.py
import pytest
from package_name import process_data, PackageError
class TestProcessData:
"""Tests for process_data function."""
def test_basic_functionality(self, sample_data: bytes) -> None:
result = process_data(sample_data)
assert result == "test data content"
def test_empty_input_raises_error(self) -> None:
with pytest.raises(ValueError, match="cannot be empty"):
process_data(b"")
def test_with_transform(self, sample_data: bytes) -> None:
result = process_data(sample_data, transform=str.upper)
assert result == "TEST DATA CONTENT"
class TestConfig:
"""Tests for configuration."""
def test_defaults(self) -> None:
config = Config()
assert config.timeout == 30
assert config.retries == 3
def test_env_fallback(self, monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("PACKAGE_API_KEY", "test-key")
config = Config()
assert config.api_key == "test-key"
```
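For table-driven cases, `pytest.mark.parametrize` keeps the suite flat. A sketch using a trimmed copy of the `process` validator from the error-handling section (the size check is omitted for brevity):

```python
import pytest


def process(data: bytes) -> str:
    """Trimmed copy of the validator shown earlier."""
    if not data:
        raise ValueError("Data cannot be empty")
    return data.decode("utf-8")


@pytest.mark.parametrize(
    ("data", "expected"),
    [
        (b"hello", "hello"),
        (b"caf\xc3\xa9", "caf\u00e9"),  # UTF-8 bytes decode to "café"
    ],
)
def test_process_valid(data: bytes, expected: str) -> None:
    assert process(data) == expected


@pytest.mark.parametrize("bad", [b""])
def test_process_rejects(bad: bytes) -> None:
    with pytest.raises(ValueError, match="empty"):
        process(bad)
```

Each tuple becomes its own test case in the report, so a single failing input is pinpointed without hand-written loops.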
## FastAPI Integration
Optional FastAPI integration pattern:
```python
# src/package_name/fastapi.py
"""FastAPI integration - only import if FastAPI is installed."""
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from fastapi import FastAPI
from package_name.config import get_config
@asynccontextmanager
async def lifespan(app: "FastAPI") -> AsyncIterator[None]:
    """Run startup/shutdown hooks via the lifespan protocol.
    Preferred over the deprecated @app.on_event("startup"/"shutdown") hooks.
    """
    config = get_config()
    # Initialize connections, caches, etc. before the app starts serving
    yield
    # Cleanup resources after shutdown
# Usage in FastAPI app:
# from package_name.fastapi import lifespan
# app = FastAPI(lifespan=lifespan)
```
## Anti-Patterns to Avoid
- `__getattr__` magic (use explicit imports)
- Global mutable state (use configuration objects)
- `*` imports in `__init__.py` (explicit `__all__`)
- Many runtime dependencies
- Committing `.venv/` or `__pycache__/`
- Not including `py.typed` marker
- Using `setup.py` (use `pyproject.toml`)
- Mixing src layout and flat layout
- `print()` for debugging (use logging)
- Bare `except:` clauses
## Reference Files
For deeper patterns, see:
- **[references/package-structure.md](./references/package-structure.md)** - Directory layouts, module organization
- **[references/pyproject-config.md](./references/pyproject-config.md)** - Complete pyproject.toml examples
- **[references/testing-patterns.md](./references/testing-patterns.md)** - pytest patterns, fixtures, CI setup
- **[references/type-hints.md](./references/type-hints.md)** - Modern typing patterns
- **[references/fastapi-integration.md](./references/fastapi-integration.md)** - FastAPI/Pydantic integration
- **[references/publishing.md](./references/publishing.md)** - PyPI publishing, CI/CD
- **[references/resources.md](./references/resources.md)** - Links to exemplary Python packages