Implementing Capacity Planning

Let’s explore how to implement capacity planning in practice.

Capacity Planning Process

A structured capacity planning process includes:

  1. Data Collection

    • Gather historical usage data
    • Collect business projections
    • Document system dependencies
    • Measure resource consumption
  2. Analysis and Forecasting

    • Identify trends and patterns
    • Generate demand forecasts
    • Model resource requirements
    • Create capacity plans
  3. Implementation

    • Provision resources according to plan
    • Configure auto-scaling policies
    • Implement capacity alerts
    • Document capacity decisions
  4. Monitoring and Adjustment

    • Track actual vs. forecast usage (see the sketch after this list)
    • Measure forecast accuracy
    • Adjust models based on observations
    • Update capacity plans regularly
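
For the "track actual vs. forecast" step, a small script comparing the two series is often enough to start. A minimal sketch, assuming hypothetical actuals.csv and forecast.csv files, each with timestamp and requests columns:

import pandas as pd

# Load actual and forecast series (hypothetical file names and columns)
actual = pd.read_csv('actuals.csv', parse_dates=['timestamp'], index_col='timestamp')
forecast = pd.read_csv('forecast.csv', parse_dates=['timestamp'], index_col='timestamp')

# Align on common timestamps and compute error metrics
joined = actual.join(forecast, lsuffix='_actual', rsuffix='_forecast').dropna()
errors = joined['requests_actual'] - joined['requests_forecast']
mape = (errors.abs() / joined['requests_actual']).mean() * 100

print(f"Mean absolute percentage error: {mape:.1f}%")
print(f"Bias (mean error): {errors.mean():.1f} requests/hour")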

Capacity Planning Tools

Several tools can assist with capacity planning:

  1. Monitoring Systems

    • Prometheus + Grafana
    • Datadog
    • New Relic
    • Dynatrace
  2. Forecasting Tools

    • Prophet (Facebook)
    • StatsModels (Python)
    • TensorFlow Time Series
    • Amazon Forecast
  3. Resource Modeling

    • Custom simulation tools
    • Queueing calculators (see the M/M/c sketch after this list)
    • Load testing frameworks (JMeter, Locust)
    • Cloud provider calculators
  4. Capacity Management

    • Kubernetes Cluster Autoscaler
    • AWS Auto Scaling
    • Terraform for infrastructure as code
    • Custom capacity management systems
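
Queueing calculators deserve special mention because they tie load to latency. For a service modeled as an M/M/c queue, the Erlang C formula gives the probability that a request has to wait, which you can invert to find the server count needed for a latency target. A minimal sketch using the standard Erlang B/C recurrences, with illustrative numbers:

import math

def erlang_c(servers, offered_load):
    """P(request waits) in an M/M/c queue; offered_load = arrival_rate / service_rate."""
    if offered_load >= servers:
        return 1.0  # unstable: the queue grows without bound
    # Erlang B via the standard recurrence, then convert to Erlang C
    b = 1.0
    for n in range(1, servers + 1):
        b = (offered_load * b) / (n + offered_load * b)
    rho = offered_load / servers
    return b / (1 - rho + rho * b)

def servers_needed(arrival_rate, service_rate, max_wait_prob=0.2):
    """Smallest server count keeping P(wait) at or under the target."""
    load = arrival_rate / service_rate
    c = max(1, math.ceil(load))
    while erlang_c(c, load) > max_wait_prob:
        c += 1
    return c

# Example: 1200 req/s arriving, each server handles 150 req/s
print(servers_needed(1200, 150))  # -> 12 servers for P(wait) <= 0.2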

Example: Capacity Planning for a Web Service

Let’s walk through a capacity planning example for a web service:

Step 1: Collect and analyze historical data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Load historical data
data = pd.read_csv('request_data.csv', parse_dates=['timestamp'])
data.set_index('timestamp', inplace=True)

# Resample to hourly data
hourly_data = data['requests'].resample('h').sum()

# Analyze seasonality (the multiplicative model requires strictly positive
# values; use model='additive' if any hour has zero requests)
result = seasonal_decompose(hourly_data, model='multiplicative', period=24*7)  # Weekly seasonality

# Plot components
fig, (ax1, ax2, ax3, ax4) = plt.subplots(4, 1, figsize=(12, 10))
result.observed.plot(ax=ax1, title='Observed')
result.trend.plot(ax=ax2, title='Trend')
result.seasonal.plot(ax=ax3, title='Seasonality')
result.resid.plot(ax=ax4, title='Residuals')
plt.tight_layout()
plt.savefig('seasonality_analysis.png')
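
The trend component also gives a quick read on underlying growth, which feeds directly into longer-range planning. A rough sketch, assuming the decomposition above, at least a couple of months of history, and consistently positive traffic:

# Annualized growth rate estimated from the extracted trend
trend = result.trend.dropna()
span_days = (trend.index[-1] - trend.index[0]).days
total_growth = trend.iloc[-1] / trend.iloc[0]
annualized_growth = total_growth ** (365 / span_days) - 1
print(f"Estimated annualized growth: {annualized_growth:.1%}")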

Step 2: Forecast future demand

from prophet import Prophet  # package renamed from 'fbprophet'

# Prepare data for Prophet
prophet_data = pd.DataFrame({
    'ds': hourly_data.index,
    'y': hourly_data.values
})

# Create and fit model
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    changepoint_prior_scale=0.05
)
model.fit(prophet_data)

# Make future dataframe
future = model.make_future_dataframe(periods=24*30, freq='h')  # Forecast 30 days

# Forecast
forecast = model.predict(future)

# Plot forecast
fig = model.plot(forecast)
plt.title('Request Forecast')
plt.ylabel('Requests per Hour')
plt.savefig('request_forecast.png')

# Extract peak forecast over the forecast horizon
# (the forecast frame also contains fitted history, so slice to future rows)
future_rows = forecast['ds'] > hourly_data.index.max()
peak_forecast = forecast.loc[future_rows, 'yhat_upper'].max()
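
Note that yhat_upper is the upper bound of Prophet's uncertainty interval, which is 80% by default, so planning against it already builds in a margin. For a more conservative peak estimate, widen the interval when constructing the model:

# Widening the uncertainty interval raises yhat_upper (more conservative)
conservative_model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    changepoint_prior_scale=0.05,
    interval_width=0.95  # default is 0.80
)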

Step 3: Model resource requirements

# Resource cost per unit of sustained load, measured in load tests.
# Units are chosen so multiplying by peak requests/hour yields totals.
resources_per_request = {
    'cpu_cores': 0.0002,  # CPU cores per request/hour of sustained load
    'memory_mb': 0.5,     # MB of memory per request/hour
    'disk_iops': 0.01,    # Disk IOPS per request/hour
    'network_mbps': 0.005 # Mbps per request/hour
}

# Calculate resource needs for peak forecast
peak_resources = {
    'cpu_cores': peak_forecast * resources_per_request['cpu_cores'],
    'memory_mb': peak_forecast * resources_per_request['memory_mb'],
    'disk_iops': peak_forecast * resources_per_request['disk_iops'],
    'network_mbps': peak_forecast * resources_per_request['network_mbps']
}

# Add headroom (50%)
headroom_factor = 1.5
capacity_plan = {k: v * headroom_factor for k, v in peak_resources.items()}

print("Capacity Plan:")
for resource, amount in capacity_plan.items():
    print(f"- {resource}: {amount:.2f}")

Step 4: Translate to infrastructure

import math

# Instance types and their resources
instance_types = {
    'small': {
        'cpu_cores': 2,
        'memory_mb': 4096,
        'cost_per_hour': 0.05
    },
    'medium': {
        'cpu_cores': 4,
        'memory_mb': 8192,
        'cost_per_hour': 0.10
    },
    'large': {
        'cpu_cores': 8,
        'memory_mb': 16384,
        'cost_per_hour': 0.20
    }
}

# Calculate instances needed (sized on CPU and memory; add similar
# checks if the service is disk- or network-bound)
def calculate_instances(capacity_plan, instance_type):
    specs = instance_types[instance_type]
    cpu_instances = math.ceil(capacity_plan['cpu_cores'] / specs['cpu_cores'])
    memory_instances = math.ceil(capacity_plan['memory_mb'] / specs['memory_mb'])
    return max(cpu_instances, memory_instances)

# Calculate for each instance type
instance_counts = {
    instance_type: calculate_instances(capacity_plan, instance_type)
    for instance_type in instance_types
}

# Calculate costs
instance_costs = {
    instance_type: count * instance_types[instance_type]['cost_per_hour'] * 24 * 30
    for instance_type, count in instance_counts.items()
}

# Find most cost-effective option
most_cost_effective = min(instance_costs, key=instance_costs.get)

print(f"Most cost-effective option: {instance_counts[most_cost_effective]} {most_cost_effective} instances")
print(f"Monthly cost: ${instance_costs[most_cost_effective]:.2f}")

Step 5: Implement capacity plan

# Kubernetes deployment with HPA
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service
spec:
  replicas: 10  # Initial capacity
  selector:
    matchLabels:
      app: web-service
  template:
    metadata:
      labels:
        app: web-service
    spec:
      containers:
      - name: web-service
        image: web-service:1.0.0
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service
  minReplicas: 5
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
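
The HPA bounds should trace back to the plan: minReplicas covers the baseline load, the initial replica count matches the expected steady state, and maxReplicas sits comfortably above the forecast peak plus headroom. One caution on the memory metric: many runtimes hold on to allocated memory rather than returning it to the OS, so memory utilization can stay high after load drops and block scale-down; CPU is usually the more responsive scaling signal.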

Advanced Capacity Planning Strategies

As your systems mature, consider these advanced strategies:

Multi-Region Capacity Planning

Planning capacity across multiple regions requires additional considerations:

  1. Regional Traffic Distribution: How traffic is distributed geographically
  2. Failover Scenarios: Capacity needed during regional failures
  3. Data Replication: Impact of data synchronization on capacity
  4. Latency Requirements: How latency affects regional deployment

Example Multi-Region Capacity Plan:

regions:
  us-east:
    normal_traffic_percentage: 40
    peak_rps: 5000
    instances:
      baseline: 20
      peak: 30
      failover: 50  # Can handle us-west failure
  us-west:
    normal_traffic_percentage: 30
    peak_rps: 3750
    instances:
      baseline: 15
      peak: 25
      failover: 45  # Can handle us-east failure
  eu-central:
    normal_traffic_percentage: 20
    peak_rps: 2500
    instances:
      baseline: 10
      peak: 15
      failover: 20  # Not a failover region
  ap-southeast:
    normal_traffic_percentage: 10
    peak_rps: 1250
    instances:
      baseline: 5
      peak: 10
      failover: 15  # Not a failover region
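
A plan like this is easy to sanity-check with a script that simulates a regional failure and asks whether the survivors, scaled to their failover counts, can still serve total peak traffic. A minimal sketch, assuming the plan above is saved as regions.yaml and that per-instance throughput can be inferred from each region's own peak sizing:

import yaml  # PyYAML

with open('regions.yaml') as f:
    plan = yaml.safe_load(f)['regions']

total_peak_rps = sum(region['peak_rps'] for region in plan.values())

for failed, failed_spec in plan.items():
    survivors = {name: spec for name, spec in plan.items() if name != failed}
    # Each survivor scales to its failover instance count; per-instance
    # throughput is inferred from its own peak sizing (peak_rps / peak instances)
    capacity = sum(
        spec['instances']['failover'] * spec['peak_rps'] / spec['instances']['peak']
        for spec in survivors.values()
    )
    status = 'OK' if capacity >= total_peak_rps else 'SHORT'
    print(f"{failed} down: survivors provide {capacity:.0f} rps "
          f"against {total_peak_rps} rps total peak -> {status}")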