Rate Limiting vs. Throttling

While often used interchangeably, rate limiting and throttling have subtle differences:

  • Rate Limiting: Typically involves rejecting requests that exceed a threshold
  • Throttling: May involve delaying requests (queuing) rather than rejecting them outright

Here’s a simple throttling implementation that delays requests instead of rejecting them:

package main

import (
	"fmt"
	"sync"
	"time"
)

// ThrottledLimiter implements a throttling rate limiter
type ThrottledLimiter struct {
	mu           sync.Mutex
	lastRequest  time.Time
	minInterval  time.Duration
}

// NewThrottledLimiter creates a new throttling rate limiter
func NewThrottledLimiter(requestsPerSecond int) *ThrottledLimiter {
	return &ThrottledLimiter{
		minInterval: time.Second / time.Duration(requestsPerSecond),
	}
}

// Wait blocks until the request can be processed according to the rate limit
func (l *ThrottledLimiter) Wait() {
	l.mu.Lock()
	defer l.mu.Unlock()
	
	now := time.Now()
	
	// If this is not the first request, calculate how long to wait
	if !l.lastRequest.IsZero() {
		elapsed := now.Sub(l.lastRequest)
		if elapsed < l.minInterval {
			sleepDuration := l.minInterval - elapsed
			time.Sleep(sleepDuration)
			now = time.Now() // Update now after sleeping
		}
	}
	
	l.lastRequest = now
}

func main() {
	// Create a throttled limiter: 2 requests per second
	limiter := NewThrottledLimiter(2)
	
	// Simulate 5 requests
	for i := 1; i <= 5; i++ {
		start := time.Now()
		limiter.Wait()
		elapsed := time.Since(start)
		fmt.Printf("Request %d: waited %v\n", i, elapsed)
	}
}

This throttling approach ensures a consistent rate by introducing delays between requests, which can be useful for scenarios where dropping requests is undesirable.
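
A side effect of holding the mutex while sleeping is that concurrent callers queue behind one another, which is often exactly what you want from a throttler. Below is a rough sketch of that behavior, assuming the ThrottledLimiter type from the listing above is available in the same package (the goroutine count and rate are arbitrary):

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	// Reuses the ThrottledLimiter from the listing above: 2 requests per second.
	limiter := NewThrottledLimiter(2)

	var wg sync.WaitGroup
	start := time.Now()

	// Four goroutines share one limiter; because Wait holds the mutex while
	// sleeping, they are released roughly one interval (~500ms) apart.
	for i := 1; i <= 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			limiter.Wait()
			fmt.Printf("Goroutine %d proceeded after %v\n",
				id, time.Since(start).Round(10*time.Millisecond))
		}(i)
	}

	wg.Wait()
}

Note that waiters are released in whatever order they happen to acquire the mutex, so strict first-come, first-served ordering is not guaranteed.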

Advanced Rate Limiting Algorithms

While simple counters can work for basic use cases, more sophisticated algorithms provide better fairness, accuracy, and performance characteristics. Let’s explore the most widely used advanced rate limiting algorithms.

Token Bucket Algorithm

The token bucket algorithm is one of the most popular rate limiting approaches due to its simplicity and effectiveness. It models rate limiting as a bucket that continuously fills with tokens at a fixed rate. Each request consumes a token, and when the bucket is empty, requests are rejected.

Here’s a comprehensive implementation:

package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket implements the token bucket algorithm for rate limiting
type TokenBucket struct {
	mu             sync.Mutex
	tokens         float64
	maxTokens      float64
	refillRate     float64 // tokens per second
	lastRefillTime time.Time
}

// NewTokenBucket creates a new token bucket rate limiter
func NewTokenBucket(maxTokens, refillRate float64) *TokenBucket {
	return &TokenBucket{
		tokens:         maxTokens,
		maxTokens:      maxTokens,
		refillRate:     refillRate,
		lastRefillTime: time.Now(),
	}
}

// refill adds tokens to the bucket based on elapsed time
func (tb *TokenBucket) refill() {
	now := time.Now()
	elapsed := now.Sub(tb.lastRefillTime).Seconds()
	
	// Calculate tokens to add based on elapsed time
	newTokens := elapsed * tb.refillRate
	
	// Update token count, capped at maxTokens
	tb.tokens = min(tb.tokens+newTokens, tb.maxTokens)
	tb.lastRefillTime = now
}

// Allow checks if a request can proceed and consumes a token if allowed
func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()
	
	tb.refill()
	
	if tb.tokens >= 1.0 {
		tb.tokens--
		return true
	}
	
	return false
}

// AllowN checks if N requests can proceed and consumes N tokens if allowed
func (tb *TokenBucket) AllowN(n float64) bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()
	
	tb.refill()
	
	if tb.tokens >= n {
		tb.tokens -= n
		return true
	}
	
	return false
}

// min returns the minimum of two float64 values
// (Go 1.21+ also provides a built-in min, but this helper keeps the example
// compatible with older toolchains)
func min(a, b float64) float64 {
	if a < b {
		return a
	}
	return b
}

func main() {
	// Create a token bucket: 5 max tokens, refill rate of 2 tokens per second
	limiter := NewTokenBucket(5, 2)
	
	// Simulate burst of 5 requests (should all succeed due to initial token count)
	for i := 1; i <= 5; i++ {
		allowed := limiter.Allow()
		fmt.Printf("Request %d: %v\n", i, allowed)
	}
	
	// The 6th request should be rejected (no tokens left)
	allowed := limiter.Allow()
	fmt.Printf("Request 6: %v\n", allowed)
	
	// Wait for some tokens to refill
	fmt.Println("Waiting for token refill...")
	time.Sleep(2 * time.Second) // Wait for ~4 tokens to be refilled
	
	// Try 4 more requests
	for i := 7; i <= 10; i++ {
		allowed := limiter.Allow()
		fmt.Printf("Request %d: %v\n", i, allowed)
	}
	
	// Demonstrate AllowN for a request that needs multiple tokens
	fmt.Println("Waiting for token refill...")
	time.Sleep(3 * time.Second)
	
	// Try to consume 3 tokens at once
	allowed = limiter.AllowN(3)
	fmt.Printf("Bulk request (3 tokens): %v\n", allowed)
}
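
In practice, the Allow method usually sits in front of request handling. Here is a minimal sketch of that wiring as HTTP middleware, assuming the TokenBucket type above is available in the same package; the route, port, and limits are illustrative:

package main

import (
	"fmt"
	"log"
	"net/http"
)

// rateLimitMiddleware rejects requests with 429 Too Many Requests whenever
// the shared TokenBucket has no tokens left.
func rateLimitMiddleware(tb *TokenBucket, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !tb.Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Illustrative limits: bursts of up to 10 requests, refilled at 5 tokens/second.
	limiter := NewTokenBucket(10, 5)

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})

	log.Fatal(http.ListenAndServe(":8080", rateLimitMiddleware(limiter, mux)))
}

In a real service you would typically keep one bucket per client (for example keyed by API key or IP address) rather than a single global bucket.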

Key advantages of the token bucket algorithm:

  1. Burst Handling: It naturally accommodates bursts up to the bucket capacity
  2. Smooth Rate Limiting: The continuous refill smooths the allowed rate over time instead of resetting it at hard window boundaries
  3. Flexibility: Can be adapted to charge different token costs per request type, as sketched below
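
To illustrate the third point, here is a small sketch that charges different request types different token costs via AllowN. The request types and their costs are made up for illustration, and the TokenBucket type is the one defined above:

package main

import "fmt"

// requestCosts maps hypothetical request types to token costs.
var requestCosts = map[string]float64{
	"read":   1,
	"search": 2,
	"export": 5,
}

// allowRequest charges a request according to its type, defaulting to a
// cost of one token for unknown types.
func allowRequest(tb *TokenBucket, requestType string) bool {
	cost, ok := requestCosts[requestType]
	if !ok {
		cost = 1
	}
	return tb.AllowN(cost)
}

func main() {
	limiter := NewTokenBucket(10, 2) // 10 tokens max, refill 2 tokens per second

	// The bucket starts full, so the first three requests (1+2+5 = 8 tokens)
	// succeed and the second export (5 more tokens) is rejected.
	for _, reqType := range []string{"read", "search", "export", "export"} {
		fmt.Printf("%-6s allowed: %v\n", reqType, allowRequest(limiter, reqType))
	}
}

Because AllowN takes a float64, costs can even be fractional, which is convenient when a single token represents a larger unit of work.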