Introduction and Setup
My first production Docker deployment was a disaster. I thought running containers in production would be as simple as `docker run` with a few extra flags. Three hours into the deployment, our application was down, the database was corrupted, and I was frantically trying to figure out why containers kept restarting in an endless loop.
That painful experience taught me that production Docker deployments are fundamentally different from development. The stakes are higher, the complexity is greater, and the margin for error is essentially zero.
Why Production Is Different
Development Docker usage focuses on convenience and speed. You can restart containers, lose data, and experiment freely. Production deployment requires reliability, security, monitoring, and the ability to handle real user traffic without downtime.
The biggest lesson I’ve learned: production deployment isn’t about containers - it’s about building systems that happen to use containers. The container is just the packaging; the real work is in orchestration, networking, storage, monitoring, and operations.
Essential Infrastructure Components
Production Docker deployments need solid foundations. Here’s what I consider essential:
Container Orchestration: I use Kubernetes for most production deployments. It handles service discovery, load balancing, health management, and scaling automatically.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: myapp:v1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
```
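A Deployment by itself isn't routable; it needs a Service in front of it. Here's a minimal sketch of the `web-app-service` that the Ingress below points at (the name and ports are assumptions chosen to match that example):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app-service   # assumed name, matching the Ingress backend below
  namespace: production
spec:
  selector:
    app: web-app          # matches the Deployment's pod labels
  ports:
    - port: 80            # port the Ingress backend targets
      targetPort: 8080    # containerPort exposed by the pods
```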
Load Balancing and Ingress:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app-service
                port:
                  number: 80
```
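The `cert-manager.io/cluster-issuer` annotation assumes cert-manager is installed and that a ClusterIssuer named `letsencrypt-prod` exists. A minimal sketch of such an issuer, with a placeholder contact email, might look like this:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                 # placeholder; use a real contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # secret storing the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx                   # solve HTTP-01 challenges via the nginx ingress
```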
Security Foundation
Security must be configured from day one. I implement security at multiple layers:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
    - name: app
      image: myapp:v1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
```
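Network segmentation is another layer worth adding. As a sketch, assuming your CNI plugin enforces NetworkPolicy, a default-deny policy for the production namespace looks like this (you then allow specific traffic with additional, narrower policies):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}    # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress         # listing both types with no rules denies all traffic
```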
Monitoring Setup
I set up monitoring before deploying applications, not after:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
```
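That relabel rule only keeps pods that opt in through an annotation. A pod that should be scraped carries metadata like the sketch below; the port and path annotations are common conventions that need their own relabel rules (omitted above), and they assume the app actually serves metrics there:

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"     # matches the keep rule in the config above
    prometheus.io/port: "8080"       # assumed metrics port
    prometheus.io/path: "/metrics"   # assumed metrics path
```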
Deployment Pipeline
I use Git-driven, GitOps-style pipelines for production deployments because they provide auditability and rollback capabilities:
```yaml
name: Deploy to Production
on:
  push:
    tags: ['v*']
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build and Push Image
        run: |
          docker build -t ${{ secrets.REGISTRY }}/myapp:${{ github.sha }} .
          docker push ${{ secrets.REGISTRY }}/myapp:${{ github.sha }}
      - name: Security Scan
        run: |
          trivy image --exit-code 1 --severity HIGH,CRITICAL \
            ${{ secrets.REGISTRY }}/myapp:${{ github.sha }}
      - name: Deploy
        run: |
          kubectl set image deployment/myapp \
            myapp=${{ secrets.REGISTRY }}/myapp:${{ github.sha }}
          kubectl rollout status deployment/myapp --timeout=300s
```
Health Checks
Every production container needs proper health checks. Kubernetes ignores the Dockerfile HEALTHCHECK in favor of the liveness and readiness probes shown earlier, but the instruction still matters for plain Docker and Docker Compose deployments:
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
# node:18-alpine ships BusyBox wget but not curl, so use wget for the check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1
EXPOSE 3000
CMD ["node", "server.js"]
```
Common Pitfalls
I’ve made every production deployment mistake possible:
Insufficient resource limits. Containers without resource limits can consume all available CPU and memory, bringing down entire nodes.
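One guardrail for this, sketched below with illustrative values, is a namespace-level LimitRange so containers that omit resource declarations still receive defaults:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:            # applied as limits when a container declares none
        memory: "512Mi"
        cpu: "500m"
      defaultRequest:     # applied as requests when a container declares none
        memory: "256Mi"
        cpu: "250m"
```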
Missing health checks. Applications that don’t implement proper health checks can’t be managed effectively by orchestrators.
Inadequate monitoring. You can’t fix what you can’t see. Comprehensive monitoring is essential for production operations.
No rollback plan. Every deployment needs a tested rollback procedure for when things go wrong.
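At minimum, that means rehearsing something like the manually triggered workflow below; it assumes the same `myapp` Deployment and pre-configured kubectl access as the pipeline above:

```yaml
name: Roll Back Production
on:
  workflow_dispatch:    # triggered on demand from the Actions UI
jobs:
  rollback:
    runs-on: ubuntu-latest
    steps:
      - name: Roll Back Deployment
        run: |
          # Revert to the previous ReplicaSet revision and wait for it to settle.
          kubectl rollout undo deployment/myapp
          kubectl rollout status deployment/myapp --timeout=300s
```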
Development Workflow
I establish clear workflows that make production deployments predictable and safe:
- Feature Development: Work in feature branches with local Docker Compose
- Integration Testing: Deploy to development cluster
- Staging Validation: Deploy to staging for final validation
- Production Deployment: Automated deployment with monitoring
- Post-Deployment Verification: Confirm application health (a verification sketch follows this list)
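For that last step, a smoke test appended to the deploy job above might look like the following sketch; the hostname and `/health` path are assumptions carried over from the earlier examples:

```yaml
- name: Post-Deployment Verification
  run: |
    # Confirm the rollout settled, then hit the public health endpoint.
    kubectl rollout status deployment/myapp --timeout=300s
    curl --fail --retry 5 --retry-delay 10 https://api.example.com/health
```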
This foundation provides the reliability and operational capabilities needed for production Docker deployments. The key is building these capabilities before you need them, not after problems arise.
Next, we’ll explore core concepts including container orchestration, service discovery, and load balancing that make production deployments scalable and reliable.