When managing multiple servers, it’s important to have a reliable load balancer in place. In many cases, you may want to prioritize one server, using it as the primary, while having a backup server only take over when the primary is unavailable. This approach is called a “preference-based failover.”

In this blog post, we’ll go through the steps to configure Caddy as a reverse proxy load balancer with a preference-based approach. The setup will prioritize Server A and switch to Server B only if Server A becomes unreachable, using Caddy’s built-in health checks.

Use Case

I like to run Ollama, an LLM serving platform, on two machines in my home lab. One machine is a beefy setup with a powerful GPU, which is great for handling intensive AI workloads, but it’s not always on. The second machine is slower and doesn’t have a GPU, but it’s always running and available.

To make managing these two machines easier, I want a solution where I can refer to both machines using a single, shorthand name (like ollama.example.com). The setup should automatically route requests to the beefy GPU machine when it’s available, but fall back to the slower machine when the primary machine is offline. Caddy’s reverse proxy with load balancing and health checks provides an ideal solution for this use case.

Why Caddy?

Caddy is an excellent choice for this type of setup because:

  • It’s simple to configure with a human-readable Caddyfile.
  • The Caddyfile can easily be backed up into version control.
  • Built-in support for reverse proxying and load balancing.
  • Automatic HTTPS provisioning using Let’s Encrypt.
  • Advanced health checks and load balancing policies.

The Scenario

In this example, you have two servers:

  • Server A: 192.168.1.5:11434
  • Server B: 192.168.1.6:11434

The goal is to have Caddy reverse proxy traffic to Server A by default, and fail over to Server B only when Server A is not responding.

Caddyfile Configuration

Here’s how you can configure your Caddyfile for this preference-based load balancing approach:

# This snippet goes inside an existing site block in your Caddyfile.
@ollama host ollama.example.com
handle @ollama {
        reverse_proxy {
                to http://192.168.1.5:11434 http://192.168.1.6:11434
                health_uri /
                health_interval 10s
                health_timeout 2s
                health_status 200
                fail_duration 30s
                lb_policy first
        }
}

Explanation of the Configuration

  1. Reverse Proxy Targets: to http://192.168.1.5:11434 http://192.168.1.6:11434 Specifies the two servers (A and B) that traffic will be proxied to. Server A is listed first, and that order matters because the first policy (item 4) picks upstreams in list order.

  2. Health Checks:

    • health_uri: / This is the URI that Caddy will request to check the health of the server.
    • health_interval: 10s Caddy will check the health of each server every 10 seconds.
    • health_timeout: 2s Each health check request has a timeout of 2 seconds.
    • health_status: 200 The expected HTTP status code for a healthy server.
  3. Failover Timing:

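    • fail_duration: 30s Turns on passive health checks, which watch real proxied requests instead of sending probes. Caddy remembers each failed request to a server for 30 seconds, and while the number of remembered failures reaches max_fails (1 by default), that server is treated as down.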
  4. Load Balancing Policy:

    • lb_policy first The first policy sends traffic to the first available server in the list, in the order the servers were written. This is what creates the preference for Server A. If Server A is down, traffic will fail over to Server B.
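
To make the active health-check settings more concrete, here is a minimal Python sketch of the kind of probe they describe: a GET to the health URI, a timeout, and a comparison against the expected status. This is not Caddy’s implementation. The probe function and its parameter names are hypothetical and only mirror the health_uri, health_timeout, and health_status values from the Caddyfile above.

```python
import urllib.error
import urllib.request


def probe(base_url, health_uri="/", timeout=2.0, expected_status=200):
    """Return True if GET base_url + health_uri answers with expected_status.

    Illustrative only: the defaults mirror health_uri, health_timeout,
    and health_status from the Caddyfile. This is not how Caddy is written.
    """
    url = base_url.rstrip("/") + health_uri
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == expected_status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        return exc.code == expected_status
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return False
```

Calling probe("http://192.168.1.5:11434") and probe("http://192.168.1.6:11434") from a machine on the same network is a quick way to confirm that both backends really answer / with a 200 before relying on the configuration.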

How It Works

  • Server A (192.168.1.5) is the preferred server, and all traffic will be directed to it by default.
  • Caddy will regularly check the health of Server A using the specified health check settings (interval, timeout, URI).
  • If Server A is down or becomes unresponsive, Caddy will route traffic to Server B (192.168.1.6).
  • After Server A recovers and passes a health check again, Caddy automatically routes traffic back to it.
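
The selection behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Caddy’s code: pick_first_available and the health snapshots are hypothetical names made up for illustration, and real health state comes from Caddy’s own checks.

```python
def pick_first_available(upstreams, healthy):
    """Return the first upstream currently marked healthy, or None if none are."""
    for upstream in upstreams:
        if healthy.get(upstream, False):
            return upstream
    return None


# Order matters: Server A first, then Server B.
UPSTREAMS = ["192.168.1.5:11434", "192.168.1.6:11434"]

# Hypothetical health snapshots over time: A up, A down, A back up.
timeline = [
    {"192.168.1.5:11434": True, "192.168.1.6:11434": True},
    {"192.168.1.5:11434": False, "192.168.1.6:11434": True},
    {"192.168.1.5:11434": True, "192.168.1.6:11434": True},
]

for step, healthy in enumerate(timeline, start=1):
    print(step, pick_first_available(UPSTREAMS, healthy))
# prints:
# 1 192.168.1.5:11434
# 2 192.168.1.6:11434
# 3 192.168.1.5:11434
```

Because the list is always scanned from the top, Server B is chosen only while Server A is marked unhealthy, and Server A is chosen again as soon as it is marked healthy.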

Testing Your Setup

You can verify this setup by:

  1. Temporarily stopping Server A to simulate a failure.
  2. Checking the logs or using tools like curl to see that requests are being routed to Server B.
  3. Restarting Server A and watching traffic automatically revert to it.
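
If you would rather script the check than run curl by hand, a small polling loop is enough to see whether the proxy keeps answering while Server A goes down and comes back. The sketch below is one possible way to do it. fetch_status and watch are hypothetical helper names, and the URL you pass in (scheme, host, and path) is an assumption that depends on how your site block is set up.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status of a GET to url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The proxy answered, but with an error status.
        return exc.code
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return None


def watch(url, count, delay=1.0, fetch=fetch_status, sleep=time.sleep):
    """Poll url `count` times, print each result, and return the statuses."""
    seen = []
    for i in range(count):
        status = fetch(url)
        seen.append(status)
        print(time.strftime("%H:%M:%S"), status)
        if i < count - 1:
            sleep(delay)
    return seen
```

For example, watch("https://ollama.example.com/", count=60) polls for roughly a minute. The status alone does not show which backend answered, so pair it with Caddy’s logs from step 2 to confirm that Server B is the one serving requests while Server A is stopped.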

Conclusion

This setup is ideal when you want a highly reliable system with failover support but still prefer one server over another in normal operation. With just a few lines of configuration, Caddy provides powerful load balancing features and automatic health checks, making it easy to create a robust and resilient infrastructure.

By combining lb_policy first with health checks, you can ensure that your preferred server handles traffic under normal conditions while a backup server stands ready to take over if the preferred one goes down.