DevOps

111 articles about DevOps development, tools, and best practices

Karpenter v1 vs Cluster Autoscaler: A Production Migration Story

May 30, 2026 #Kubernetes #AWS #DevOps

I’d been running Cluster Autoscaler on our production EKS cluster for years. It worked. It wasn’t exciting, it wasn’t cheap, but it …

AWS Organizations and Control Tower: Multi-Account Strategy

May 20, 2026 #AWS #Security #DevOps

Single-account AWS is a ticking time bomb. I don’t say that lightly. I’ve watched it blow up firsthand, and I’ve spent more hours …

Kubernetes eBPF: Next-Generation Observability and Security

May 20, 2026 #Kubernetes #Security #Observability

eBPF is the biggest shift in Linux observability since strace. I don’t say that lightly. I’ve spent years wiring up monitoring stacks, …

DevSecOps Pipeline: Integrating Security into Every Stage

May 19, 2026 #DevSecOps #Security #CI/CD

Security as a gate at the end of the pipeline is security theater. I’ve believed this for years, but it took watching a real incident unfold to …

Chaos Engineering on AWS: Fault Injection Simulator Guide

May 12, 2026 #AWS #SRE #DevOps

You don’t know your system is resilient until you’ve broken it on purpose.

I believed our payment processing service was fault tolerant. …

Kubernetes Gateway API: The Future of Ingress

May 5, 2026 #Kubernetes #Networking #DevOps

Gateway API is what Ingress should have been from day one.

I don’t say that lightly. I’ve spent years wrangling Kubernetes Ingress …

Infrastructure Drift Detection and Remediation

April 25, 2026 #Terraform #AWS #DevOps

If you’re not running scheduled terraform plan, you have drift. You just don’t know it yet.

I learned this the hard way. A colleague made …

AWS Architecture Guide: Production Patterns and Best Practices

April 24, 2026 #AWS #Cloud-Native #DevOps

Everything I’ve learned building on AWS since 2012, organized by domain.

Serverless

AWS Lambda Cold Starts: Causes, Measurement, and Mitigation …

Kubernetes Guide: From Basics to Production Operations

April 23, 2026 #Kubernetes #DevOps #Containers

This is the hub for everything I’ve written about Kubernetes. Whether you’re setting up your first cluster or optimizing a multi-tenant …

Kubernetes Multi-Cluster Management with Fleet and Rancher

April 22, 2026 #Kubernetes #DevOps #GitOps

I’ve been running Kubernetes in production for years now, and there’s a specific kind of pain that only hits you once you cross the …

Implementing SLOs and Error Budgets in Practice

April 11, 2026 #SRE #DevOps #Monitoring

99.99% availability sounds great until you realize that’s 4 minutes and 19 seconds of downtime per month. Four minutes. That’s barely …
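
The downtime figure above is plain arithmetic on the SLO target. A minimal sketch, assuming a 30-day month as the measurement window; the function name `error_budget` and its parameters are hypothetical, chosen only for illustration:

```python
from datetime import timedelta


def error_budget(slo_percent: float, window: timedelta = timedelta(days=30)) -> timedelta:
    """Return the downtime an availability SLO allows over a window.

    Assumption: the window defaults to a 30-day month; a real SLO
    may use a rolling 28-day or calendar-month window instead.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    # Fraction of the window that is allowed to be unavailable.
    allowed_fraction = 1 - slo_percent / 100
    return timedelta(seconds=window.total_seconds() * allowed_fraction)


budget = error_budget(99.99)
minutes, seconds = divmod(int(budget.total_seconds()), 60)
print(f"{minutes} min {seconds} s")  # prints "4 min 19 s"
```

Dropping one nine changes the picture considerably: `error_budget(99.9)` comes out to roughly 43 minutes over the same 30-day window.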

Python Packaging in 2026: uv, Poetry, and the Modern Ecosystem

April 8, 2026 #Python #Programming #DevOps

I mass-deleted requirements.txt files from a monorepo last month. Fourteen of them. Some had unpinned dependencies, some had pins from 2021, one had a …

Kubernetes Ingress Controllers: NGINX vs Traefik vs Istio Gateway

April 4, 2026 #Kubernetes #Networking #DevOps

NGINX Ingress is the Honda Civic of ingress controllers. Boring, reliable, gets the job done. I’ve deployed it on dozens of clusters and …

AWS Step Functions: Orchestrating Complex Workflows

April 1, 2026 #AWS #Serverless #Architecture

I deleted roughly 2,000 lines of orchestration code from our payment processing service last year. Replaced it with about 200 lines of Amazon States …

Terraform Testing: Unit, Integration, and End-to-End

March 28, 2026 #Terraform #DevOps #Testing

Most Terraform code has zero tests. That’s insane for something managing production infrastructure. We wouldn’t ship application code …

Distributed Tracing with OpenTelemetry: A Complete Guide

March 25, 2026 #Observability #DevOps #Distributed Systems

I spent four hours on a Tuesday night debugging a 30-second API call. Four hours. The call touched 12 services — auth, inventory, pricing, three …

Container Security Scanning in CI/CD Pipelines

March 21, 2026 #Security #Docker #CI/CD

If you’re not scanning container images before they hit production, it’s only a matter of time before something ugly shows up in your …

AWS EventBridge: Building Event-Driven Architectures

March 17, 2026 #AWS #Serverless #Architecture

EventBridge is the most underused AWS service. I’ll die on that hill. Teams will build these elaborate Rube Goldberg machines out of SNS topics, …

Kubernetes Operators: Building Custom Controllers in Go

March 10, 2026 #Kubernetes #Go #DevOps

Operator SDK vs kubebuilder — I pick kubebuilder every time. Operator SDK wraps kubebuilder anyway, adds a layer of abstraction that mostly just gets …

AWS CDK vs Terraform: A Practical Comparison in 2026

March 3, 2026 #AWS #Terraform #CDK

I use both. Terraform for multi-cloud, CDK when it’s pure AWS and the team knows TypeScript. That’s the short answer. But the long answer …

Platform Engineering: Building an Internal Developer Platform

February 28, 2026 #DevOps #Platform Engineering #Kubernetes

Platform engineering is DevOps done right. Or maybe it’s DevOps with a product mindset. Either way, it’s the recognition that telling …

Kubernetes Horizontal Pod Autoscaling with Custom Metrics

February 25, 2026 #Kubernetes #DevOps #Performance

CPU-based autoscaling is a lie for most web services. There, I said it.

I spent a painful week last year watching an HPA scale our API pods from 3 to …

Implementing Zero-Trust Networking on AWS

February 18, 2026 #AWS #Security #Networking

VPNs are not zero trust. Stop calling them that.

I can’t count how many times I’ve sat in architecture reviews where someone points at a …

AWS Cost Optimization: 15 Techniques That Actually Work

February 10, 2026 #AWS #DevOps #Cloud

I got a call from a startup founder last year. “Our AWS bill just hit $47,000 and we have twelve engineers.” They’d been running for …
