Rails Makes LLMs Boring (And That's Good): A Service Layer Implementation

Posted on: February 26, 2025 at 04:44 PM

In our previous article on Rails and AI, we explored how Rails’ conventions create an ideal foundation for AI-powered applications. Now let’s dive deeper into the LLM service layer that powers our D&D character generator, showcasing how Rails’ pragmatic approach handles complex AI integrations without the overhead of microservices or intricate event buses.

Many teams assume that integrating AI requires complex architectural changes—splitting applications into microservices, implementing event-driven messaging patterns, or adopting specialized frameworks. This perceived complexity often leads to overengineered solutions. Rails, with its strong service object patterns and modular design principles, provides everything needed to architect maintainable, well-structured LLM integrations while keeping the development model simple and familiar.

Service Layer Foundation

Following the pragmatic development approach we discussed earlier, we begin with a mock service that establishes our interface and enables rapid development. This approach lets us develop and test our character generator’s AI features without waiting for external API integration.

# app/services/llm/providers/mock.rb
module Llm
  module Providers
    class Mock < Base
      # Main chat interface that processes incoming messages and returns appropriate mock responses
      # @param messages [Array<Hash>] Array of message objects with 'role' and 'content' keys
      # @param system_prompt [String, nil] Optional system prompt to guide response generation
      # @return [Hash] Mock response matching the expected schema
      def chat(messages:, system_prompt: nil)
        # Log incoming request for debugging and monitoring
        log_request(:chat, messages: messages, system_prompt: system_prompt)
        
        # Extract the last user message which contains our request
        # We only care about the most recent message for determining response type
        last_message = messages.last
        return {} unless last_message['role'] == 'user'
        
        # Parse the request type from the message content
        # This determines which mock response or generator to use
        request = last_message['content']
        
        # Pattern match against request content to determine response type
        # Each type maps to a specific mock response or generator method
        response = case request
                  when /background/i
                    load_mock_response('character_background.json')
                  when /traits/i
                    load_mock_response('character_traits.json')
                  when /equipment/i
                    suggest_equipment
                  when /spells/i
                    suggest_spells
                  else
                    { error: 'Unknown request type' }
                  end
        
        # Log the response for debugging and monitoring
        log_response(:chat, response)
        response
      end
      
      private
      
      # Loads mock response data from JSON fixtures
      # @param filename [String] Name of the JSON fixture file to load
      # @raise [ProviderError] If file cannot be loaded or parsed
      def load_mock_response(filename)
        # Construct path to fixture file using Rails conventions
        path = Rails.root.join('test', 'fixtures', 'files', 'mock_responses', filename)
        JSON.parse(File.read(path))
      rescue StandardError => e
        # Log error and raise with helpful message
        log_error(:load_mock_response, e)
        raise Llm::Service::ProviderError, "Failed to load mock response: #{e.message}"
      end
    end
  end
end

This mock implementation offers several advantages:

  1. Fast Development: We can build and iterate on features immediately, starting simple while leaving room for the interface to evolve.
  2. Clear Interface: The service establishes the contract that real providers will follow, as the console sketch after this list illustrates.
  3. Test Reliability: Our tests can use known responses rather than unpredictable LLM outputs.
  4. Cost Saving: Developers can work without API keys or usage costs.
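
To get a feel for that contract, here is a console-style sketch of exercising the mock directly. The empty config hash and example messages are illustrative, and we assume the logging helpers it calls are defined on the base class.

# Rails console sketch -- exercising the mock provider directly
provider = Llm::Providers::Mock.new({})

provider.chat(
  messages: [{ 'role' => 'user', 'content' => 'Generate a character background' }]
)
# => parsed contents of character_background.json

provider.chat(
  messages: [{ 'role' => 'user', 'content' => 'What is the weather like?' }]
)
# => { error: 'Unknown request type' }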

Prompt Service and Templates

Before our service can chat with LLMs, it needs well-crafted prompts. Our PromptService manages the templates and contextual data that shape our AI conversations. This is a critical abstraction in our architecture—prompt engineering is often one of the most challenging and iterative aspects of working with LLMs, requiring frequent refinement as we learn what produces the best results.

By extracting prompt management into a dedicated service, we gain multiple benefits: we can separately version and test prompts, enable non-developers to contribute to prompt crafting, cache frequently used templates for performance, and maintain provider-specific variations when different LLMs require different instructions. This separation of concerns follows Rails’ convention over configuration philosophy while providing flexibility where it matters most.


# app/services/llm/prompt_service.rb
module Llm
  class PromptService
    class << self
      # Factory method for generating prompts with context
      # @param request_type [Symbol] Type of prompt to generate
      # @param provider [Symbol] LLM provider to generate for
      # @param context [Hash] Additional context for template rendering
      def generate(request_type:, provider:, **context)
        new.generate(request_type: request_type, provider: provider, **context)
      end
    end

    # Main prompt generation method that loads, validates, and renders templates
    # @param request_type [Symbol] Type of prompt to generate
    # @param provider [Symbol] LLM provider to generate for
    # @param context [Hash] Additional context for template rendering
    # @return [Hash] Rendered prompt with system and user messages
    def generate(request_type:, provider:, **context)
      # Load and validate the appropriate template
      template = load_template(request_type, provider)
      validate_template!(template)
      # Render the template with provided context
      render_template(template, provider, context)
    end

    private

    # Loads template from cache or disk with fallback to default
    # @param request_type [Symbol] Type of prompt template to load
    # @param provider [Symbol] Provider-specific template to look for
    def load_template(request_type, provider)
      # Try to load from cache first to avoid disk reads
      Rails.cache.fetch(cache_key(request_type, provider)) do
        load_template_from_disk(request_type, provider)
      end
    rescue TemplateNotFoundError => e
      # Fallback to default template if provider-specific one isn't found
      Rails.logger.warn "[PromptService] #{e.message}, using default"
      load_default_template(request_type)
    end

    # Validates template structure and required fields
    # @param template [Hash] Template to validate
    # @raise [ValidationError] If template is invalid
    def validate_template!(template)
      raise ValidationError, "Template must be a hash" unless template.is_a?(Hash)
      raise ValidationError, "Missing system_prompt" unless template['system_prompt'].is_a?(String)
      raise ValidationError, "Missing user_prompt" unless template['user_prompt'].is_a?(String)

      # Skip schema validation for string responses
      return if template['response_format'] == 'string'

      # Validate schema structure if present
      if template['schema']
        validate_schema!(template['schema'])
      end
    end

    # Renders template with provided context using Mustache
    # @param template [Hash] Template to render
    # @param provider [Symbol] Provider context for rendering
    # @param context [Hash] Variables for template interpolation
    def render_template(template, provider, context)
      # Create a copy to avoid modifying the cached version
      rendered = template.deep_dup

      # Render both prompts using Mustache templating
      rendered['system_prompt'] = Mustache.render(template['system_prompt'], context)
      rendered['user_prompt'] = Mustache.render(template['user_prompt'], context)

      rendered
    end
  end
end

We load YAML templates, cache them for speed, and validate their structure—ensuring our AI dungeon master has a solid script to work from. YAML keeps templates human-readable and editable outside code—perfect for tweaking prompts without redeploying the application. This approach also makes it easy to maintain multiple prompt variations for different providers or use cases.
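
The private helpers referenced above (load_template_from_disk, load_default_template, cache_key, and the schema check) are omitted from the listing. A minimal sketch of how they might look, assuming templates live under config/prompts/<provider>/ with the default/ directory as a fallback and TemplateNotFoundError defined as a nested error class, is:

# Sketch of the omitted PromptService helpers (paths and error class are assumptions)
def load_template_from_disk(request_type, provider)
  path = Rails.root.join('config', 'prompts', provider.to_s, "#{request_type}.yml")
  raise TemplateNotFoundError, "No template at #{path}" unless File.exist?(path)

  YAML.safe_load(File.read(path))
end

def load_default_template(request_type)
  path = Rails.root.join('config', 'prompts', 'default', "#{request_type}.yml")
  YAML.safe_load(File.read(path))
end

def cache_key(request_type, provider)
  "llm/prompts/#{provider}/#{request_type}"
end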

Externalizing prompts into YAML templates provides several significant advantages over hardcoding them:

  1. Non-technical Collaboration: Product managers, subject matter experts, and content designers can directly contribute to and refine prompts without developer intervention.
  2. Rapid Iteration: Prompt engineering requires experimentation and fine-tuning. Separating templates from code allows for quick adjustments without deployment cycles.
  3. Environment Customization: Different environments (development, staging, production) can use different prompt strategies with the same codebase.
  4. Provider-Specific Optimizations: We can maintain optimized versions for each LLM provider, leveraging their unique capabilities or working around limitations.
  5. Version Control: Changes to prompts are tracked in git separately from code changes, making reviews and rollbacks more manageable.

This thoughtful separation of concerns exemplifies Rails’ philosophy—make the common case easy and provide flexibility where it matters most.

A typical YAML template looks like this:

# config/prompts/default/character_background.yml
system_prompt: |
  You are a D&D character background generator. Create a compelling backstory
  for a {{level}}-level {{class_type}} that fits with their {{alignment}} alignment.
  The story should explain how they acquired their skills and abilities.

user_prompt: |
  Generate a background story for {{name}}, a {{race}} {{class_type}} with 
  the following abilities:
  {{#abilities}}
  - {{name}}: {{score}}
  {{/abilities}}

schema:
  type: object
  required:
    - background
    - personality_traits
  properties:
    background:
      type: string
      description: A detailed background story for the character
    personality_traits:
      type: array
      items:
        type: string
      minItems: 2
      maxItems: 4
      description: List of personality traits that define the character
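
Rendering this template through the PromptService looks roughly like the console sketch below. The character values are made up, the abilities array feeds the Mustache section in the user prompt, and we assume the mock provider falls back to this default template; output is abbreviated.

# Console sketch -- rendering the background template with character context
prompt = Llm::PromptService.generate(
  request_type: 'character_background',
  provider: :mock,
  name: 'Thorin',
  race: 'Dwarf',
  class_type: 'Fighter',
  level: 3,
  alignment: 'Lawful Good',
  abilities: [{ name: 'Strength', score: 16 }, { name: 'Constitution', score: 14 }]
)

prompt['system_prompt'] # => "You are a D&D character background generator. Create a compelling backstory for a 3-level Fighter..."
prompt['user_prompt']   # => "Generate a background story for Thorin, a Dwarf Fighter with the following abilities:\n- Strength: 16\n- Constitution: 14\n"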

The Service Layer

With our mock foundation and prompt service in place, let’s look at how the main service layer orchestrates LLM interactions. This service abstracts the communication with LLM providers, handling retry logic, error management, and providing a consistent interface for our application.

First, let’s establish our error hierarchy and constants for resilient interactions:

# app/services/llm/service.rb
module Llm
  class Service
    # Define error hierarchy for specific error handling
    class Error < StandardError; end
    class ConfigurationError < Error; end
    class ProviderError < Error; end
    class RateLimitError < ProviderError; end
    
    # Constants for retry behavior
    MAX_RETRIES = 3 # Maximum number of retry attempts
    RETRY_DELAY = 1 # Base delay in seconds between retries

This error hierarchy enables targeted exception handling. Rather than catching all errors together, we can respond differently to configuration issues versus provider-specific problems. The retry constants balance persistence with respect for API limits.
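
For example, a caller can branch on these classes rather than rescuing everything at once. This is a hypothetical caller with messages already in scope, not code from the application:

# Hypothetical caller-side handling built on the error hierarchy
begin
  Llm::Service.chat(messages: messages)
rescue Llm::Service::ConfigurationError => e
  # Misconfiguration (missing API key, unknown provider) is a deploy-time problem
  Rails.logger.error "LLM misconfigured: #{e.message}"
  raise
rescue Llm::Service::ProviderError => e
  # Provider failure after the service's own retries, so degrade gracefully
  Rails.logger.warn "LLM request failed: #{e.message}"
  nil
end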

Next, we implement convenient class methods that provide a clean interface to the service:

    class << self
      # Basic chat interface for simple requests
      # @param messages [Array<Hash>] Array of conversation messages
      # @param system_prompt [String, nil] Optional system context
      def chat(messages:, system_prompt: nil)
        new.chat(messages: messages, system_prompt: system_prompt)
      end
      
      # Schema-validated chat for structured responses
      # @param messages [Array<Hash>] Array of conversation messages
      # @param system_prompt [String, nil] Optional system context
      # @param schema [Hash] JSON schema for response validation
      def chat_with_schema(messages:, system_prompt: nil, schema:)
        new.chat_with_schema(messages: messages, system_prompt: system_prompt, schema: schema)
      end
      
      # Connection test method for health checks
      # @return [Boolean] True if connection successful
      def test_connection
        new.test_connection
      end
    end
    
    def initialize
      # Initialize with appropriate provider from factory
      @provider = Llm::Factory.create_provider
    end

These class methods follow the same pattern we saw in the PromptService—they provide a clean, stateless interface to our service while delegating to instance methods for the actual work. The initialize method leverages our factory pattern, allowing us to dynamically select the appropriate provider based on configuration.

Our service offers two primary interaction methods—chat and chat_with_schema—which reflect two distinct needs when working with LLMs:

  1. Basic text interactions: Sometimes we simply need free-form text responses, like generating creative content or conversational replies. The chat method handles these scenarios with minimal overhead.

  2. Structured data extraction: For cases where we need predictable, structured outputs that our application can reliably process (like generating D&D character traits as JSON), the chat_with_schema method enforces response validation against a JSON schema. This is crucial for integrating AI outputs into database models and ensuring type safety.

This dual approach provides flexibility without sacrificing reliability—we can use the simpler method when appropriate, but have structured validation when our application requires consistent data formats.

    # Instance method implementation of basic chat
    def chat(messages:, system_prompt: nil)
      Rails.logger.info "[Llm::Service] Sending chat request with #{messages.length} messages"
      
      # Use retry mechanism for resilience
      with_retries do
        @provider.chat(messages: messages, system_prompt: system_prompt)
      end
    rescue StandardError => e
      # Log error and wrap in our error type
      Rails.logger.error "[Llm::Service] Error in chat: #{e.class} - #{e.message}"
      raise ProviderError, "Failed to process chat request: #{e.message}"
    end
    
    def chat_with_schema(messages:, system_prompt: nil, schema:)
      Rails.logger.info "[Llm::Service] Sending schema-validated chat request"
      Rails.logger.debug "[Llm::Service] Using schema: #{schema.inspect}"
      
      with_retries do
        @provider.chat_with_schema(
          messages: messages,
          system_prompt: system_prompt,
          schema: schema
        )
      end
    rescue StandardError => e
      Rails.logger.error "[Llm::Service] Error in schema request: #{e.message}"
      raise ProviderError, "Failed to process structured chat: #{e.message}"
    end
    
    def test_connection
      Rails.logger.info "[Llm::Service] Testing connection to provider"
      
      with_retries do
        chat(messages: [{ role: 'user', content: 'test' }])
        true
      end
    rescue StandardError => e
      Rails.logger.error "[Llm::Service] Connection test failed: #{e.message}"
      false
    end

Each method wraps provider interactions with our with_retries mechanism and comprehensive logging. The test_connection method might appear simple, but it serves multiple purposes in our development lifecycle: validating API credentials during setup, verifying connectivity in CI/CD pipelines, and serving as a health check endpoint in production.
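
As one concrete use, test_connection could back a lightweight health check. This is a hypothetical controller and route, not part of the generator app:

# Hypothetical health-check endpoint built on Llm::Service.test_connection
# config/routes.rb would add:  get '/health/llm', to: 'health#llm'
class HealthController < ApplicationController
  def llm
    if Llm::Service.test_connection
      head :ok
    else
      head :service_unavailable
    end
  end
end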

Finally, the retry mechanism that makes our service resilient:

    private
    
    # Implements retry logic with exponential backoff
    # @yield Block to retry on failure
    def with_retries
      retries = 0
      begin
        yield
      rescue RateLimitError => e
        # Special handling for rate limits with exponential backoff
        retries += 1
        if retries <= MAX_RETRIES
          sleep_time = RETRY_DELAY * (2 ** (retries - 1)) # Exponential backoff
          Rails.logger.warn "[Llm::Service] Rate limited, retry #{retries}/#{MAX_RETRIES} in #{sleep_time}s"
          sleep sleep_time
          retry
        else
          Rails.logger.error "[Llm::Service] Max retries exceeded"
          raise
        end
      rescue StandardError => e
        # General error retry with fixed delay
        retries += 1
        if retries <= MAX_RETRIES
          Rails.logger.warn "[Llm::Service] Error occurred, retrying"
          sleep RETRY_DELAY
          retry
        else
          Rails.logger.error "[Llm::Service] Max retries exceeded"
          raise
        end
      end
    end
  end
end

The with_retries method employs two different strategies: exponential backoff for rate limits (increasing the delay between retries) and fixed delay for general errors. This distinction is important—rate limits often require progressively longer waits to allow provider resources to recover, while transient network issues might resolve quickly.

We set MAX_RETRIES to 3, striking a balance between persistence and pragmatism; with a one-second base delay, a rate-limited request waits 1s, 2s, and then 4s before giving up. Too few retries and we might abandon salvageable requests; too many and we could delay user feedback or waste resources on unsalvageable requests.

This service layer abstracts all the complexity of LLM interactions, providing our application with a reliable, consistent interface regardless of which provider we’re using. The combination of factory pattern, retry logic, and error handling creates a robust foundation that can weather the unpredictable nature of external API services.
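
Putting the two entry points side by side, calling code looks roughly like this (the messages and schema are illustrative):

# Sketch of the two public entry points
free_text = Llm::Service.chat(
  messages: [{ 'role' => 'user', 'content' => 'Describe a misty mountain tavern.' }],
  system_prompt: 'You are a vivid D&D narrator.'
)

structured = Llm::Service.chat_with_schema(
  messages: [{ 'role' => 'user', 'content' => 'Suggest personality traits for a gnome wizard.' }],
  schema: {
    'type' => 'object',
    'required' => ['personality_traits'],
    'properties' => {
      'personality_traits' => { 'type' => 'array', 'items' => { 'type' => 'string' } }
    }
  }
)
structured['personality_traits'] # => an array of strings, shaped by the schema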

Provider Interface

Like a character class that establishes core abilities, our base provider defines the interface that all LLM implementations must follow:

# app/services/llm/providers/base.rb
module Llm
  module Providers
    class Base
      attr_reader :config

      def initialize(config)
        @config = config
      end

      # Core interface methods that all providers must implement
      def chat(messages:, system_prompt: nil)
        raise NotImplementedError, "#{self.class} must implement #chat"
      end

      def chat_with_schema(messages:, system_prompt: nil, schema:, provider_config: nil)
        raise NotImplementedError, "#{self.class} must implement #chat_with_schema"
      end

      def test_connection
        raise NotImplementedError, "#{self.class} must implement #test_connection"
      end

      protected

      # Helper for validating required configuration keys
      def validate_config!(*required_keys)
        missing_keys = required_keys.select { |key| config[key].nil? }
        return if missing_keys.empty?

        raise Llm::Service::ConfigurationError,
              "Missing required configuration keys: #{missing_keys.join(', ')}"
      end
    end
  end
end

This base class sets clear expectations for all provider implementations. The validate_config! method ensures that providers fail fast if they’re missing required configuration, while the interface methods establish a contract that all providers must fulfill.
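
The log_request, log_response, and log_error helpers called by the Mock provider presumably also live on this base class; they are omitted here. To illustrate the contract, a bare-bones subclass might look like the following sketch, which is a hypothetical example rather than the real Anthropic or OpenAI implementation:

# Hypothetical provider sketch showing how the Base contract is fulfilled
module Llm
  module Providers
    class Example < Base
      def chat(messages:, system_prompt: nil)
        validate_config!(:api_key, :model) # fail fast on missing configuration
        # ...call the provider's HTTP API here and return a parsed Hash...
        { 'content' => 'stubbed response' }
      end

      def chat_with_schema(messages:, system_prompt: nil, schema:, provider_config: nil)
        validate_config!(:api_key, :model)
        # ...request a structured response and validate it against `schema`...
        {}
      end

      def test_connection
        chat(messages: [{ 'role' => 'user', 'content' => 'ping' }]).any?
      end
    end
  end
end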

Factory Pattern and Configuration

The factory pattern allows us to switch LLM providers at runtime without changing our application code:

# app/services/llm/factory.rb
module Llm
  class Factory
    class << self
      def create_provider
        provider_name = Rails.configuration.llm.provider
        config = Rails.configuration.llm.providers[provider_name]

        provider_class = case provider_name.to_sym
                         when :anthropic
                           Llm::Providers::Anthropic
                         when :openai
                           Llm::Providers::Openai
                         when :mock
                           Llm::Providers::Mock
                         else
                           raise Llm::Service::ConfigurationError, "Unknown provider: #{provider_name}"
                         end

        provider_class.new(config)
      rescue StandardError => e
        Rails.logger.error "[Llm::Factory] Provider creation failed: #{e.message}"
        raise Llm::Service::ConfigurationError, "Provider initialization failed: #{e.message}"
      end
    end
  end
end

Our configuration system pairs with the factory to provide environment-specific defaults:

# config/initializers/llm.rb
Rails.application.configure do
  config.llm = ActiveSupport::OrderedOptions.new

  # Environment-aware provider selection - like choosing the right tool for the job
  default_provider = case Rails.env
                     when 'test'
                       :mock
                     else
                       :anthropic
                     end

  # Allow override through environment variables
  config.llm.provider = (ENV['LLM_PROVIDER'] || default_provider).to_sym

  # Configuration for each supported provider
  config.llm.providers = {
    anthropic: {
      api_key: ENV['ANTHROPIC_API_KEY'],
      model: ENV['ANTHROPIC_MODEL'] || 'claude-3-5-sonnet-20241022',
      max_tokens: (ENV['ANTHROPIC_MAX_TOKENS'] || 4096).to_i,
      temperature: (ENV['ANTHROPIC_TEMPERATURE'] || 0.7).to_f
    },
    openai: {
      api_key: ENV['OPENAI_API_KEY'],
      model: ENV['OPENAI_MODEL'] || 'gpt-4-turbo-preview',
      max_tokens: (ENV['OPENAI_MAX_TOKENS'] || 4096).to_i,
      temperature: (ENV['OPENAI_TEMPERATURE'] || 0.7).to_f
    },
    mock: {
      # Mock provider doesn't need configuration
    }
  }
end

This configuration approach offers several benefits:

  1. Environment-Based Defaults: Different environments use appropriate providers.
  2. Environment Variable Overrides: Easy switching without code changes—a key advantage of the factory pattern (see the console sketch after this list).
  3. Sensible Defaults: Each configuration has reasonable defaults that can be overridden when needed.
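
In practice, switching providers is just an environment variable set before boot (the initializer reads it once at startup). A quick console check might look like this, assuming Anthropic credentials are configured:

# Console sketch -- verifying which provider the factory will build
Rails.configuration.llm.provider   # => :anthropic (or :mock when LLM_PROVIDER=mock at boot)
Llm::Factory.create_provider.class # => Llm::Providers::Anthropic
Llm::Service.test_connection       # => true when credentials and connectivity are in order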

Testing Strategy

Our testing approach ensures the reliability of our LLM integration through both integration and unit tests:

# test/integration/llm_service_test.rb
class LlmServiceIntegrationTest < ActionDispatch::IntegrationTest
  setup do
    @character = characters(:warrior)  # Using a fixture for consistency
  end

  test "generates character traits using LLM service" do
    # Mock response that the service should return
    mock_response = {
      'traits' => [
        { 'trait' => 'Brave', 'description' => 'Always ready to face danger' },
        { 'trait' => 'Loyal', 'description' => 'Stands by their companions' }
      ]
    }

    # Mock the prompt service for consistent testing
    prompt = { 'user_prompt' => 'Generate traits', 'system_prompt' => nil }
    Llm::PromptService.expects(:generate)
                      .with(
                        request_type: 'character_traits',
                        provider: Rails.configuration.llm.provider,
                        name: @character.name,
                        race: @character.race,
                        class_type: @character.class_type,
                        level: @character.level,
                        alignment: @character.alignment,
                        background: @character.background
                      )
                      .returns(prompt)

    # Mock the LLM service to return our predefined response
    Llm::Service.any_instance.expects(:chat_with_schema)
                .returns(mock_response)

    # Verify traits were updated
    assert_changes -> { @character.personality_traits } do
      @character.generate_traits
    end
  end
end

Our unit tests verify specific service behaviors in isolation:

# test/services/llm/service_test.rb
module Llm
  class ServiceTest < ActiveSupport::TestCase
    setup do
      @mock_provider = mock('provider')
      Llm::Factory.stubs(:create_provider).returns(@mock_provider)
      @service = Llm::Service.new
      @messages = [{ 'role' => 'user', 'content' => 'Test' }]
    end

    test "retries on rate limit errors" do
      mock_response = { 'background' => 'Success after retry' }

      sequence = sequence('retry_sequence')

      @mock_provider.expects(:chat)
                    .raises(Llm::Service::RateLimitError.new("Rate limited"))
                    .in_sequence(sequence)

      @mock_provider.expects(:chat)
                    .returns(mock_response)
                    .in_sequence(sequence)

      response = @service.chat(messages: @messages)
      assert_equal mock_response, response
    end

    test "validates schema in chat_with_schema" do
      schema = {
        'type' => 'object',
        'required' => ['background'],
        'properties' => {
          'background' => { 'type' => 'string' }
        }
      }

      @mock_provider.expects(:chat_with_schema)
                    .with(messages: @messages, system_prompt: nil, schema: schema)
                    .returns({ 'background' => 'Test background' })

      response = @service.chat_with_schema(messages: @messages, schema: schema)
      assert_equal 'Test background', response['background']
    end
  end
end

Together, these tests confirm that our schema validation enforces structure, which is vital for consistent AI outputs and error handling. They cover both the happy path (successful LLM calls) and error conditions (rate limits, validation failures), helping us maintain reliability as the codebase evolves.

Image Generation Integration

Our LLM service architecture extends to image generation for character portraits. This service follows the same patterns but with specialized abilities tailored for visual content creation.

Image generation presents unique challenges compared to text generation. While the overall service structure feels familiar, we need additional features like domain-specific prompt enhancement, external API integration, and file handling for the generated images. Let’s see how we address these challenges:

# app/services/image_generation/service.rb
require 'open-uri' # URI.open below relies on open-uri from the standard library

module ImageGeneration
  class Service
    # Class-specific visual elements to enhance image generation prompts
    # @type [Hash<String, Array<String>>] Mapping of class names to visual elements
    CLASS_DETAILS = {
      'Wizard' => ['magical energy surrounding their hands', 'arcane symbols floating nearby'],
      'Fighter' => ['battle-worn armor details', 'warrior\'s confident stance'],
      'Cleric' => ['holy symbols', 'divine light effects', 'religious vestments'],
      # Other classes omitted for brevity
    }.freeze
    
    attr_reader :character, :prompt_type
    
    # Factory method for generating character portraits
    # @param character [Character] The character to generate a portrait for
    # @return [CharacterPortrait] The generated portrait record
    def self.generate(character:)
      new(character: character).generate
    end

Our implementation starts with two key architectural decisions:

  1. We maintain a dictionary of class-specific visual elements (CLASS_DETAILS) to enhance prompts based on character class. This domain knowledge dramatically improves image quality by giving the AI model specific elements to include that match D&D conventions.

  2. We implement the same factory method pattern seen in our LLM service, providing a clean, simple interface for callers.

The main generation method handles the complete lifecycle of image creation, from prompt construction to saving the final image:

    # Main portrait generation method
    # @return [CharacterPortrait] The generated and saved portrait
    def generate
      logger.info "[ImageGeneration::Service] Generating portrait for character #{character.id}"
      
      # Validate character has all required attributes before proceeding
      validate_character!
      
      # Get the prompt from PromptService - reusing our LLM prompt infrastructure
      # This ensures consistent prompt structure and caching
      prompt = Llm::PromptService.generate(
        request_type: 'character_image',
        provider: provider_name,
        **character_details
      )
      
      # The rendered template is a hash; the image API expects plain text,
      # so we send the rendered user prompt
      image_prompt = prompt['user_prompt']
      logger.debug "[ImageGeneration::Service] Using prompt: #{image_prompt}"
      
      # Call Fal.ai's Recraft model for portrait generation
      # We chose fal-ai for its realistic portraits and simple API
      result = HTTP.auth("Key #{ENV['FAL_API_KEY']}")
                   .post("https://fal.run/fal-ai/recraft-v3",
                     json: {
                       prompt: image_prompt,
                       image_size: "portrait_4_3",
                       style: "realistic_image"
                     })
      
      response = JSON.parse(result.body.to_s)
      
      # Handle API errors with clear error messages
      unless result.status.success?
        raise Error, "Failed to generate image: #{response['error'] || 'Unknown error'}"
      end

Notice how we reuse our PromptService for generating image prompts. This demonstrates a key benefit of our architecture—components designed for one AI modality (text) can be repurposed for another (images) with minimal changes.

We’re using Fal.ai’s Recraft model instead of better-known services like DALL-E or Midjourney. This deliberate choice leverages Recraft’s strengths in realistic character portraits and simple API, showing how our factory pattern enables easy provider swapping based on specific needs.

The second half of the method handles the result processing and persistence:

      # Create and attach the image to our character's portrait collection
      # New portraits are automatically selected, unselecting previous ones
      portrait = character.character_portraits.new(
        selected: true,
        generation_prompt: image_prompt
      )
      
      # Download and attach the image using ActiveStorage
      # This handles file storage and variant generation
      downloaded_image = URI.open(response['images'].first['url'])
      portrait.image.attach(
        io: downloaded_image,
        filename: "portrait_#{Time.current.to_i}.png",
        content_type: 'image/png'
      )
      
      portrait.save!
      portrait
    end

Here we see Rails’ ActiveStorage seamlessly integrating with our AI service, illustrating how Rails’ ecosystem handles the full pipeline from generation to storage. Rather than just retrieving a URL, we create a complete CharacterPortrait record that stores the image itself, preserves the generation prompt for future use, and handles portrait selection automatically.

Finally, our validation and helper methods ensure data integrity:

    private
    
    # Validates that character has all required attributes for portrait generation
    # @raise [Error] If any required attributes are missing
    def validate_character!
      missing_attributes = []
      missing_attributes << "race" if character.race.blank?
      missing_attributes << "class_type" if character.class_type.blank?
      missing_attributes << "level" if character.level.blank?
      
      if missing_attributes.any?
        raise Error, "Cannot generate portrait: Missing #{missing_attributes.join(', ')}"
      end
    end
    
    # Builds rich prompt context from character attributes
    # @return [Hash] Character details for prompt generation
    def character_details
      {
        name: character.name,
        race: character.race,
        class_type: character.class_type,
        level: character.level,
        alignment: character.alignment,
        class_type_details: CLASS_DETAILS[character.class_type]&.join(", ")
      }
    end
  end
end

The validate_character! method provides early validation, preventing API calls with insufficient data that would likely result in poor images. This fail-fast approach improves the user experience by providing immediate, specific error messages rather than generic API failures.

The character_details method constructs a rich context object for prompt generation, enhancing basic attributes with class-specific visual details. This exemplifies how domain knowledge can significantly improve AI-generated content through thoughtful prompt engineering.

This image generation service demonstrates how our service architecture can flex to handle different AI modalities while maintaining consistent patterns. By reusing components where appropriate (PromptService) and adding specialized functionality where needed (ActiveStorage integration), we create a cohesive system that’s both powerful and maintainable.

Connecting to Our Character Model: Where the Magic Happens

Now that we’ve established our service layer architecture, let’s bring everything together by seeing how these services integrate with our domain models. The real test of any infrastructure is how cleanly it can be used in practice—and this is where Rails’ conventions truly shine.

Here’s how our Character model leverages the service layer to generate AI content:

# app/models/character.rb
class Character < ApplicationRecord
  # Associations for managing character portraits
  has_many :character_portraits, dependent: :destroy
  has_one :selected_portrait, -> { where(selected: true) },
          class_name: 'CharacterPortrait'
  
  # Generates a background story using our LLM service
  # @return [Boolean] True if background was generated and saved successfully
  def generate_background
    # Get the prompt from our PromptService
    prompt = Llm::PromptService.generate(
      request_type: 'character_background',
      provider: Rails.configuration.llm.provider,
      name: name,
      race: race,
      class_type: class_type,
      level: level,
      alignment: alignment
    )
    
    # Send the request to our LLM service with schema validation
    response = Llm::Service.chat_with_schema(
      messages: [{ 'role' => 'user', 'content' => prompt['user_prompt'] }],
      system_prompt: prompt['system_prompt'],
      schema: prompt['schema']
    )
    
    # Update the character with the generated content
    update!(
      background: response['background'],
      personality_traits: response['personality_traits']
    )
  end
  
  # Generates a portrait using the ImageGeneration service
  # @return [CharacterPortrait] The generated portrait
  def generate_portrait
    ImageGeneration::Service.generate(character: self)
  end
end

Notice how clean and domain-focused these methods are. The generate_background method knows nothing about API keys, rate limiting, or retry strategies. It simply requests a character background and updates the model with the results. Each line tells a clear story:

  1. First, we get a prompt tailored to this specific character by passing its attributes to our PromptService
  2. Next, we send that prompt to our LLM service using schema validation to ensure structured results
  3. Finally, we update the character record with the generated content

This approach keeps our model focused on its domain responsibilities rather than AI implementation details. The generate_portrait method is even simpler—a single line that delegates to our image generation service. This clean separation of concerns exemplifies how well-designed service layers can simplify domain models.
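
From a console, the whole pipeline reduces to a couple of calls, assuming a persisted character with the required attributes and configured API keys:

# Console sketch -- generating AI content for an existing character
character = Character.last
character.generate_background          # fills background and personality_traits via the LLM service
portrait = character.generate_portrait # returns a CharacterPortrait with an attached image
portrait.image.attached?               # => true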

For production applications, handling AI generation synchronously isn’t ideal, especially for longer outputs like character backgrounds. We’re planning to enhance this with background job processing:

# Future implementation - app/models/character.rb
def generate_background_async
  # Enqueue background job and update status
  GenerateBackgroundJob.perform_later(id)
  update!(generation_status: :pending)
end

# app/jobs/generate_background_job.rb
class GenerateBackgroundJob < ApplicationJob
  # @param character_id [Integer] ID of the character to generate background for
  def perform(character_id)
    character = Character.find(character_id)
    character.update!(generation_status: :generating)
    
    begin
      # Generate the background using our existing service
      prompt = Llm::PromptService.generate(
        request_type: 'character_background',
        provider: Rails.configuration.llm.provider,
        name: character.name,
        # Other character attributes...
      )
      
      response = Llm::Service.chat_with_schema(
        messages: [{ 'role' => 'user', 'content' => prompt['user_prompt'] }],
        system_prompt: prompt['system_prompt'],
        schema: prompt['schema']
      )
      
      # Update character with generated content
      character.update!(
        background: response['background'],
        personality_traits: response['personality_traits'],
        generation_status: :completed
      )
      
      # Broadcast update to UI using Turbo Streams
      Turbo::StreamsChannel.broadcast_replace_to(
        character,
        target: "character_#{character.id}_background",
        partial: "characters/background",
        locals: { character: character }
      )
    rescue => e
      # Handle errors by updating status and showing error in UI
      character.update!(generation_status: :failed)
      Rails.logger.error("Background generation failed: #{e.message}")
      
      Turbo::StreamsChannel.broadcast_replace_to(
        character,
        target: "character_#{character.id}_background",
        partial: "characters/background_error",
        locals: { character: character }
      )
    end
  end
end

This asynchronous approach adds three important capabilities:

  1. Status tracking: The character model tracks generation status (pending, generating, completed, failed), providing clear visibility into the process
  2. Background processing: Using ActiveJob moves generation out of the request cycle, preventing timeouts on slower LLM responses
  3. Real-time updates: Turbo Streams broadcasts updates directly to the UI when generation completes or fails

This implementation maintains all the benefits of our service architecture while adding production-ready features for a better user experience. The job still uses our existing service layer, demonstrating how well our design accommodates different execution contexts.
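
The generation_status values imply a small supporting column and enum on Character. A sketch of what that might look like, with placeholder file names and timestamps and assuming Rails 7’s enum syntax:

# Hypothetical supporting pieces for the async flow
# db/migrate/XXXXXXXXXXXXXX_add_generation_status_to_characters.rb
class AddGenerationStatusToCharacters < ActiveRecord::Migration[7.1]
  def change
    add_column :characters, :generation_status, :integer, default: 0, null: false
  end
end

# app/models/character.rb (addition)
enum :generation_status, { pending: 0, generating: 1, completed: 2, failed: 3 }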

What’s particularly elegant about this approach is how it leverages Rails’ ecosystem. ActiveJob handles the background processing, ActiveRecord manages the character state, and Turbo Streams delivers real-time updates—all working together with our custom service layer in a cohesive system.

Conclusion

Our implementation of the LLM service layer demonstrates how Rails’ conventions naturally support complex AI integrations.

Key wins include a modular provider system, robust retry logic, and cached prompts—all built on Rails’ predictable patterns. This service layer’s modularity, clear interfaces, and Rails-aligned conventions make it both powerful and easy to maintain—proof that “boring” can be brilliant.

By following established Rails patterns like service objects, factories, and configuration management, we’ve created a system that can handle AI interactions reliably while remaining flexible for future development.

In our next article, we’ll delve into specific LLM provider implementations, focusing on how we handle structured responses, rate limiting, and token streaming with providers like Claude and GPT.

If you’re eager to see these concepts in action, the complete source code for our D&D character generator is available on GitHub. The repository includes all the components we’ve discussed, along with comprehensive documentation and setup instructions.