Building a Resilient AI Client in Ruby with Stoplight and RubyLLM
Calling external AI providers (like OpenAI, Gemini, or Claude) is a common task in modern web applications. However, these services can be slow, return errors, or become completely unavailable, which can lead to cascading failures that bring your entire application down. To prevent this, you can use the Circuit Breaker pattern, which isolates failing services and even allows a graceful failover to a backup provider. In Ruby, the stoplight gem provides a straightforward way to implement this pattern and build a resilient client; in this article we will use it together with ruby_llm to handle AI providers. To see this implementation in action and explore the details, check out the repository containing the code examples used in this article.

What is the Circuit Breaker Pattern?
The Circuit Breaker pattern monitors calls to a service. If failures reach a certain threshold, it "trips" (opens the circuit), preventing further calls to that service for a set "cool-off" period. This gives the failing service time to recover and prevents your application from wasting resources on requests that are likely to fail. A circuit breaker has three states:
- Closed (green light): The normal state. Requests pass through, and failures are recorded.
- Open (red light): After the failure threshold is met, the circuit opens. All requests are immediately rejected without attempting to call the service.
- Half-Open (yellow light): After the cool-off period, the circuit moves to this state. It allows a limited number of test requests to pass through. If they succeed, the circuit moves back to Closed; if they fail, it returns to Open.
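Before reaching for the gem, the state machine itself can be sketched in a few lines of plain Ruby. This toy class (the name ToyBreaker and its API are our own illustration, not part of stoplight) only exists to show the three transitions described above:

```ruby
# A toy circuit breaker illustrating the three states above.
# Not the stoplight gem's implementation -- just the idea in plain Ruby.
class ToyBreaker
  def initialize(threshold: 3, cool_off: 60, clock: -> { Time.now.to_i })
    @threshold = threshold # consecutive failures needed to trip the circuit
    @cool_off  = cool_off  # seconds to stay Open before allowing a probe
    @clock     = clock     # injectable clock, handy for testing
    @failures  = 0
    @opened_at = nil
  end

  # "green" (Closed), "red" (Open), or "yellow" (Half-Open)
  def color
    return "green" if @opened_at.nil?
    @clock.call - @opened_at >= @cool_off ? "yellow" : "red"
  end

  def run
    raise "circuit open" if color == "red" # reject immediately while Open
    begin
      result = yield
      @failures  = 0   # success (Closed, or a Half-Open probe): close the circuit
      @opened_at = nil
      result
    rescue StandardError
      @failures += 1
      @opened_at = @clock.call if @failures >= @threshold # trip (or re-trip)
      raise
    end
  end
end
```

A failing call increments the counter; once the threshold is hit the breaker turns red, rejects calls for the cool-off window, then lets one probe through in the yellow state to decide whether to close again.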
Basic Example
To use the ruby_llm gem, we first need to set the API keys in a config file. In a Rails app, you can also generate this file with rails generate ruby_llm:install.

```ruby
# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
  config.openai_api_key = ENV['OPENAI_API_KEY']
  config.gemini_api_key = ENV['GEMINI_API_KEY']
  # You could add Anthropic, etc. here
end
```
Here's the simplest way to use stoplight after installing it with Bundler or manually:

```ruby
# Create a light
# It will "trip" after 3 failures in a row
light = Stoplight("my-ai-service", threshold: 3) do
  # Code that might fail, e.g., an API call
  MyAIProvider.new.call
end

# Run the code
begin
  light.run
rescue Stoplight::Error::RedLight
  # The circuit is open!
  puts "Service is unavailable, falling back."
end
```
This code creates a simple circuit. If MyAIProvider.new.call fails 3 times, the light turns "red" (Open) and will immediately raise a Stoplight::Error::RedLight on subsequent calls.

For a real application, you need to persist the state of your circuits. If you keep the state only in memory, an application restart would reset all circuits, potentially flooding a service that is still down. We configure stoplight to use Redis in production as a persistent data store.

```ruby
# config/initializers/stoplight.rb
Stoplight.configure do |config|
  # We can disable notifiers if we don't want external alerts
  config.error_notifier = ->(_) { }
  config.notifiers = []

  if Rails.env.test? || Rails.env.development?
    # Use in-memory data store for dev/test
    config.data_store = Stoplight::DataStore::Memory.new
  else
    # Use Redis for production
    config.data_store = Stoplight::DataStore::Redis.new(Redis.new, warn_on_clock_skew: false)
  end
end
```
By using Redis, the state of all circuit breakers (Closed, Open, or Half-Open) is maintained even if your application restarts.
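The difference is easy to picture with a toy failure counter (our own illustration, not stoplight's API): any instance reading a shared store agrees on the circuit's state, while a freshly restarted process with an in-memory store has forgotten that the circuit was ever open.

```ruby
# Illustration only: why circuit state belongs in a shared store.
# A Hash stands in for Redis; a fresh Hash stands in for a restarted
# process that kept its circuit state in memory.
shared_store = Hash.new(0) # e.g. Redis in production

def record_failure(store, name)
  store[name] += 1
end

def circuit_open?(store, name, threshold: 3)
  store[name] >= threshold
end

# Three failures recorded by one app instance...
3.times { record_failure(shared_store, "ai_models:gpt-4o") }

# ...are visible to every instance reading the shared store:
circuit_open?(shared_store, "ai_models:gpt-4o") # => true

# But a restarted process with a brand-new in-memory store has forgotten
# everything and would happily hammer the still-broken service:
circuit_open?(Hash.new(0), "ai_models:gpt-4o") # => false
```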
Implementing a Resilient AI Client with Failover
Let's build a client that uses stoplight for circuit breaking and ruby_llm to orchestrate calls and failover. Since ruby_llm wraps provider-specific errors into its own error class, our list of tracked errors is much cleaner. First, let's define our core settings for the circuit breakers.

```ruby
# These settings will be used to create our circuits
module AIProviderSettings
  # If 3 consecutive failures occur, trip the circuit
  CIRCUIT_THRESHOLD = 3

  # Stay open for 30 minutes before moving to Half-Open
  FAILURE_COOLDOWN_S = 1800 # 30 minutes

  # Only these errors will count as "failures"
  # ruby_llm conveniently wraps provider errors (like OpenAI::Error)
  # into a standard RubyLLM::Error.
  TRACKING_ERRORS = [
    RubyLLM::Error,
    Faraday::Error,
    Timeout::Error,
    Net::ReadTimeout,
    Net::OpenTimeout
  ]
end
```
Next, we'll define our model priority list. We'll try the most powerful model first, then a fast one, and finally a standard one. ruby_llm can work with any of them.

```ruby
# Our priority list for failover
# We'll try 'gpt-4o' first, then 'gemini-2.5-flash', then 'gpt-4o-mini'
MODEL_PRIORITY = ['gpt-4o', 'gemini-2.5-flash', 'gpt-4o-mini']
```
Now, let's create our failover method. This method takes a ruby_llm chat object (which holds the conversation history) and the new prompt. It loops through our priority list, using stoplight to check each model.

```ruby
def ask_with_failover(chat, prompt)
  last_error = nil

  MODEL_PRIORITY.each do |model_name|
    light = Stoplight("ai_models:#{model_name}",
      threshold: AIProviderSettings::CIRCUIT_THRESHOLD,
      cool_off_time: AIProviderSettings::FAILURE_COOLDOWN_S,
      tracked_errors: AIProviderSettings::TRACKING_ERRORS,
      data_store: ($STOPLIGHT_DATA_STORE || Stoplight::DataStore::Memory.new)
    )

    begin
      return light.run do
        chat.with_model(model_name)
        chat.ask(prompt)
      end
    rescue Stoplight::Error::RedLight => e
      last_error = e
      next
    rescue *AIProviderSettings::TRACKING_ERRORS => e
      last_error = e
      next
    end
  end

  raise StandardError, "All AI models failed. Last error: #{last_error&.message}"
end
```
How to use it:
```ruby
# Create a new chat session
# The default model here doesn't matter, as our method overrides it
chat = RubyLLM.chat

# First question: might use gpt-4o
response1 = ask_with_failover(chat, "What is the Circuit Breaker pattern?")
puts response1.content
puts "-> Used model: #{response1.model_id}"

# Second question: imagine gpt-4o is now down.
# The circuit will trip, and it will fall back to gemini-2.5-flash
# The chat history is preserved!
response2 = ask_with_failover(chat, "How does it relate to 'stoplight'?")
puts response2.content
puts "-> Used model: #{response2.model_id}"
```
You can extend this approach to other ruby_llm features, such as extracting information from documents, generating images, transcribing audio, and analyzing video.

Advantages of this Approach
- Resilience: Prevents cascading failures from a single failing model.
- Better User Experience: Your application can gracefully degrade from a premium model to a standard one instead of returning an error.
- Gives Services Time to Recover: By stopping requests to an open circuit, you give the failing model time to recover.
- Clean Abstraction: ruby_llm handles all the different SDKs, and stoplight handles all the stateful error logic.
- Preserves Context: By using chat.with_model, the failover is seamless, and the new model still gets the entire conversation history.
When to Use this Pattern
Use a Circuit Breaker when:
- You are calling any external, third-party API that you don't control (especially LLMs).
- You want to provide multiple levels of service (e.g., a premium model that falls back to a cheaper one).
- You need to prevent a single component's failure from taking down the entire system.
Conclusion
This implementation leverages the
stoplight gem for robust circuit breaking and ruby_llm for clean AI abstraction. By intelligently monitoring failures, tripping circuits, and providing a failover mechanism between different models, it transforms a fragile integration into a stable and fault-tolerant one.