Production Kubernetes Deployment

Moving from Docker Compose to Kubernetes felt overwhelming at first. The complexity seemed unnecessary for simple applications. But after managing production workloads for years, I understand why Kubernetes became the standard: it handles the operational complexity that emerges at scale.

Kubernetes Architecture and Core Concepts

Kubernetes orchestrates containers across multiple machines, providing features that Docker Compose can’t match: automatic failover, rolling updates, service discovery, and resource management across a cluster.

The key components you need to understand:

  • Pods: The smallest deployable units, usually containing one container
  • Deployments: Manage replica sets and rolling updates
  • Services: Provide stable network endpoints for pods
  • ConfigMaps and Secrets: Manage configuration and sensitive data
  • Ingress: Handle external traffic routing
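
A quick way to get familiar with these objects is kubectl itself; kubectl explain prints the schema documentation for any resource or nested field:

# Inspect the schema for a resource and drill into nested fields
kubectl explain deployment
kubectl explain deployment.spec.template.spec.containers

# List the objects of each kind in the current namespace
kubectl get pods,deployments,services,configmaps,ingresses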

Basic deployment example:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: myregistry.io/web-app:v1.2.0
        ports:
        - containerPort: 8080
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

This deployment creates three replicas with resource requests and limits plus two kinds of health checks: the liveness probe tells Kubernetes to restart a container that has stopped responding, while the readiness probe controls when a pod is allowed to receive traffic.
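
Assuming the manifest is saved as deployment.yaml, rolling it out and verifying it is straightforward:

# Apply the manifest and watch the rollout progress
kubectl apply -f deployment.yaml
kubectl rollout status deployment/web-app

# Confirm all three replicas are running and ready
kubectl get pods -l app=web-app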

Service Discovery and Load Balancing

Services provide stable endpoints for your pods:

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  selector:
    app: web-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  type: ClusterIP
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx  # replaces the deprecated kubernetes.io/ingress.class annotation
  tls:
  - hosts:
    - myapp.example.com
    secretName: web-app-tls
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app-service
            port:
              number: 80
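
Inside the cluster, the Service is reachable by its DNS name: web-app-service within the same namespace, or web-app-service.default.svc.cluster.local from anywhere. A throwaway pod is a quick way to check this; the curlimages/curl image is just one convenient choice:

# Call the service from a temporary pod that is removed when the command exits
kubectl run debug --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://web-app-service.default.svc.cluster.local/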

Configuration Management

Separate configuration from code using ConfigMaps and Secrets:

# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    log.level=INFO
    feature.new-ui=true
    cache.ttl=300
---
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  url: cG9zdGdyZXNxbDovL3VzZXI6cGFzc3dvcmRAZGI6NTQzMi9teWFwcA==
  username: dXNlcg==
  password: cGFzc3dvcmQ=
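
Note that Secret values are only base64-encoded, not encrypted (url above decodes to postgresql://user:password@db:5432/myapp), so keep Secret manifests out of version control. It is often simpler to create them imperatively and let kubectl do the encoding:

# Create the secret directly; kubectl base64-encodes each value
kubectl create secret generic db-credentials \
  --from-literal=url='postgresql://user:password@db:5432/myapp' \
  --from-literal=username=user \
  --from-literal=password=password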

Mount these in your deployment:

spec:
  template:
    spec:
      containers:
      - name: web
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
      volumes:
      - name: config-volume
        configMap:
          name: app-config
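
Each key in the ConfigMap becomes a file under the mount path, so the container sees its settings at /etc/config/app.properties. Once a pod is running, you can confirm the mount:

# Read the mounted file from a running pod in the deployment
kubectl exec deploy/web-app -- cat /etc/config/app.properties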

Resource Management and Autoscaling

Resource requests and limits keep one workload from starving the rest of the cluster, and a HorizontalPodAutoscaler adds or removes replicas as load changes:

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

The HPA scales the deployment between 3 and 20 replicas, targeting 70% CPU and 80% memory utilization, both measured against each pod's resource requests.
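
Resource metrics come from the metrics-server add-on, which must be installed in the cluster for the HPA to act. You can verify that metrics are flowing and watch the autoscaler's decisions directly:

# Confirm pod metrics are available, then watch the HPA react to load
kubectl top pods -l app=web-app
kubectl get hpa web-app-hpa --watch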

High Availability and Fault Tolerance

Design deployments to survive node failures:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 2
  template:
    metadata:
      labels:
        app: web-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web-app
              topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: myregistry.io/web-app:v1.2.0
        # ports, env, resources, and probes as in the earlier deployment
---
# Pod Disruption Budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: web-app

This configuration prefers spreading pods across different nodes (a soft rule, so scheduling still succeeds if the cluster cannot satisfy it) and guarantees that at least 4 pods stay available during voluntary disruptions such as node drains.
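
During a node drain, for example, the eviction API refuses to remove a pod whenever doing so would leave fewer than 4 available, and kubectl waits and retries until the budget allows each eviction:

# Drain a node for maintenance; evictions blocked by the PDB are retried
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data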

Monitoring Integration

Implement monitoring to understand your application’s behavior:

# service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-app-metrics
spec:
  # the selector matches labels on the Service (not the pods), and the
  # port below must be a named port defined on that Service
  selector:
    matchLabels:
      app: web-app
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
---
# prometheus-rule.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: web-app-alerts
spec:
  groups:
  - name: web-app
    rules:
    - alert: HighErrorRate
      expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: High error rate detected
        description: "Error rate is {{ $value }} errors per second"

Deployment Strategies

Implement safe deployment practices:

#!/bin/bash
# deploy.sh - Safe deployment script
set -euo pipefail

# Require an explicit tag; deploying :latest defeats version pinning
IMAGE_TAG=${1:?usage: deploy.sh <image-tag> [namespace]}
NAMESPACE=${2:-default}

echo "Deploying web-app:$IMAGE_TAG to namespace $NAMESPACE"

# Update the deployment
kubectl set image deployment/web-app web="myregistry.io/web-app:$IMAGE_TAG" -n "$NAMESPACE"

# Wait for rollout to complete; roll back on failure or timeout
if ! kubectl rollout status deployment/web-app -n "$NAMESPACE" --timeout=300s; then
    echo "Rollout failed, rolling back"
    kubectl rollout undo deployment/web-app -n "$NAMESPACE"
    exit 1
fi

# Verify deployment health
READY_REPLICAS=$(kubectl get deployment web-app -n "$NAMESPACE" -o jsonpath='{.status.readyReplicas}')
DESIRED_REPLICAS=$(kubectl get deployment web-app -n "$NAMESPACE" -o jsonpath='{.spec.replicas}')

if [ "$READY_REPLICAS" != "$DESIRED_REPLICAS" ]; then
    echo "Deployment failed: $READY_REPLICAS/$DESIRED_REPLICAS replicas ready"
    kubectl rollout undo deployment/web-app -n "$NAMESPACE"
    exit 1
fi

echo "Deployment successful"

Kubernetes provides the foundation for running containers at scale, but it requires careful planning and configuration. The complexity pays off when you need features like automatic scaling, rolling updates, and multi-region deployments. In the next part, I’ll cover CI/CD integration and how to automate your container deployment pipeline.