Understanding Load Balancing in Distributed Systems

Load balancing in distributed systems operates at multiple levels, from DNS-based global load balancing to application-level request distribution. Before diving into specific strategies, let’s understand the key objectives and challenges.

Key Objectives of Load Balancing

  1. Even Distribution: Spread workload evenly across available resources
  2. High Availability: Ensure service continuity even when some components fail
  3. Scalability: Accommodate growing workloads by adding resources
  4. Efficiency: Optimize resource utilization
  5. Latency Reduction: Minimize response times for end users

Load Balancing Layers

Load balancing can be implemented at different layers of the system:

┌─────────────────────────────────────────────────────────┐
│                  Global Load Balancing                  │
│                  (DNS, GeoDNS, Anycast)                 │
└───────────────────────────┬─────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────┐
│                 Regional Load Balancing                 │
│                 (L4/L7 Load Balancers)                  │
└───────────────────────────┬─────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────┐
│                  Local Load Balancing                   │
│            (Service Mesh, Client-Side Balancing)        │
└─────────────────────────────────────────────────────────┘

Load Balancing Algorithms

The choice of load balancing algorithm significantly impacts system performance and resource utilization. Let’s explore the most common algorithms and their use cases.

1. Round Robin

Round Robin is one of the simplest load balancing algorithms, distributing requests sequentially across the server pool.

Implementation Example: Nginx Round Robin Configuration

http {
    upstream backend {
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
    }
    
    server {
        listen 80;
        
        location / {
            proxy_pass http://backend;
        }
    }
}
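The rotation itself is trivial to express in code. A minimal Python sketch of the same behavior, using the hypothetical backend names from the Nginx example:

```python
from itertools import cycle

# Hypothetical backend pool, mirroring the Nginx upstream block above.
servers = [
    "backend1.example.com",
    "backend2.example.com",
    "backend3.example.com",
]
pool = cycle(servers)

def next_server():
    """Return the next backend in strict rotation, wrapping around."""
    return next(pool)

# Six requests walk through the pool exactly twice.
assignments = [next_server() for _ in range(6)]
```

Every server receives the same share of requests regardless of how long each request takes to process, which is exactly the limitation discussed below.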

When to Use Round Robin

  • When servers have similar capabilities and resources
  • For simple deployments with relatively uniform request patterns
  • As a starting point before implementing more complex algorithms

Limitations

  • Doesn’t account for server load or capacity differences
  • Doesn’t consider connection duration or request complexity
  • May lead to uneven distribution with varying request processing times

2. Weighted Round Robin

Weighted Round Robin extends the basic Round Robin by assigning weights to servers based on their capacity or performance.

Implementation Example: HAProxy Weighted Round Robin

global
    log 127.0.0.1 local0
    maxconn 4096
    
defaults
    log global
    mode http
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    
frontend http-in
    bind *:80
    default_backend servers
    
backend servers
    balance roundrobin
    server server1 192.168.1.10:80 weight 5 check
    server server2 192.168.1.11:80 weight 3 check
    server server3 192.168.1.12:80 weight 2 check
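Naively repeating a server in proportion to its weight would send bursts of consecutive requests to the heaviest server. Production balancers instead interleave picks; the sketch below implements the "smooth" weighted round robin scheme (the approach Nginx uses internally), with the same 5/3/2 weights as the HAProxy example:

```python
class SmoothWeightedRR:
    """Smooth weighted round robin: each pick raises every server's
    running score by its configured weight, selects the highest score,
    then subtracts the weight total from the winner. Heavy servers are
    chosen more often but interleaved rather than bursted."""

    def __init__(self, weights):
        self.weights = dict(weights)              # server -> configured weight
        self.current = {s: 0 for s in weights}    # server -> running score
        self.total = sum(weights.values())

    def pick(self):
        for server, weight in self.weights.items():
            self.current[server] += weight
        winner = max(self.current, key=self.current.get)
        self.current[winner] -= self.total
        return winner

# Weights mirror the HAProxy example above (5/3/2).
lb = SmoothWeightedRR({"server1": 5, "server2": 3, "server3": 2})
picks = [lb.pick() for _ in range(10)]
```

Over any window of ten picks, server1 is chosen five times, server2 three times, and server3 twice, matching the configured ratio.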

When to Use Weighted Round Robin

  • When servers have different capacities or performance characteristics
  • In heterogeneous environments with varying instance types
  • When gradually introducing new servers or phasing out old ones

Limitations

  • Static weights don’t adapt to changing server conditions
  • Requires manual tuning as system evolves
  • Doesn’t account for actual server load

3. Least Connections

The Least Connections algorithm directs traffic to the server with the fewest active connections, assuming that fewer connections indicate more available capacity.

Implementation Example: Nginx Least Connections

http {
    upstream backend {
        least_conn;
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
    }
    
    server {
        listen 80;
        
        location / {
            proxy_pass http://backend;
        }
    }
}
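Unlike Round Robin, this algorithm is stateful: the balancer must track connections as they open and close. A minimal Python sketch, with hypothetical server names, showing both halves of that bookkeeping:

```python
class LeastConnections:
    """Route each new request to the server with the fewest active
    connections. Callers must report back when a connection closes."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # server -> open connections

    def acquire(self):
        """Pick the least-loaded server and count the new connection."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a connection to `server` has finished."""
        self.active[server] -= 1

lb = LeastConnections(["backend1", "backend2", "backend3"])
first = lb.acquire()   # all idle; ties break on pool order
second = lb.acquire()  # a different server, since `first` now has 1 active
lb.release(first)      # the first connection finishes
third = lb.acquire()   # `first` is the least loaded again
```

Note that the balancer reacts to connection counts, not to actual CPU or memory load on the servers, which is the root of the limitations listed below.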

When to Use Least Connections

  • When request processing times vary significantly
  • For workloads with long-lived connections
  • When servers have similar processing capabilities

Limitations

  • Connection count doesn’t always correlate with server load
  • Doesn’t account for connection complexity or resource usage
  • May not be optimal for very short-lived connections

4. Weighted Least Connections

Weighted Least Connections combines the Least Connections approach with server weighting to account for different server capacities.
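The combined rule can be sketched by normalizing each server's connection count by its weight: a weight-5 server must carry five times the concurrency of a weight-1 server before the two are considered equally loaded. A minimal Python sketch with hypothetical names:

```python
class WeightedLeastConnections:
    """Pick the server with the lowest active-connections-to-weight
    ratio, so higher-weight servers absorb proportionally more
    concurrent connections."""

    def __init__(self, weights):
        self.weights = dict(weights)            # server -> capacity weight
        self.active = {s: 0 for s in weights}   # server -> open connections

    def acquire(self):
        """Pick the server with the best load/capacity ratio."""
        server = min(
            self.active,
            key=lambda s: self.active[s] / self.weights[s],
        )
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a connection to `server` has finished."""
        self.active[server] -= 1

# With no releases, six concurrent connections split 5:1 by weight.
lb = WeightedLeastConnections({"big": 5, "small": 1})
picks = [lb.acquire() for _ in range(6)]
```

When all connections stay open, the distribution converges to the weight ratio; as connections close, the ratio term lets the algorithm adapt to actual concurrency, which plain Weighted Round Robin cannot do.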