AWS Architecture Guide: Production Patterns and Best Practices
Everything I’ve learned building on AWS since 2012, organized by domain.
In-depth guides, insights, and best practices for modern software engineering
Everything I’ve learned building on AWS since 2012, organized by domain.
This is the hub for everything I’ve written about Kubernetes. Whether you’re setting up your first cluster or optimizing a multi-tenant …
Read Article →I’ve been running Kubernetes in production for years now, and there’s a specific kind of pain that only hits you once you cross the …
Read Article →Last year I ported an image processing pipeline from JavaScript to Rust compiled to WebAssembly. The JS version took 1.2 seconds to apply a chain of …
Read Article →Aurora Serverless v2 is what v1 should have been. I don’t say that lightly — I ran v1 in production for two years and spent more time fighting …
Read Article →99.99% availability sounds great until you realize that’s 4 minutes and 19 seconds of downtime per month. Four minutes. That’s barely …
Read Article →I mass-deleted requirements.txt files from a monorepo last month. Fourteen of them. Some had unpinned dependencies, some had pins from 2021, one had a …
NGINX Ingress is the Honda Civic of ingress controllers. Boring, reliable, gets the job done. I’ve deployed it on dozens of clusters and …
Read Article →I deleted roughly 2,000 lines of orchestration code from our payment processing service last year. Replaced it with about 200 lines of Amazon States …
Read Article →Most Terraform code has zero tests. That’s insane for something managing production infrastructure. We wouldn’t ship application code …
Read Article →I spent four hours on a Tuesday night debugging a 30-second API call. Four hours. The call touched 12 services — auth, inventory, pricing, three …
Read Article →If you’re not scanning container images before they hit production, it’s only a matter of time before something ugly shows up in your …
Read Article →EventBridge is the most underused AWS service. I’ll die on that hill. Teams will build these elaborate Rube Goldberg machines out of SNS topics, …
Read Article →Don’t optimize until you’ve profiled. I’ve watched teams rewrite entire modules that weren’t even the bottleneck. Weeks of …
Read Article →Operator SDK vs kubebuilder — I pick kubebuilder every time. Operator SDK wraps kubebuilder anyway, adds a layer of abstraction that mostly just gets …
Read Article →I got paged at 3am on a Tuesday because a Rust service I’d deployed two weeks earlier crashed hard. No graceful degradation, no useful error …
Read Article →I use both. Terraform for multi-cloud, CDK when it’s pure AWS and the team knows TypeScript. That’s the short answer. But the long answer …
Read Article →Platform engineering is DevOps done right. Or maybe it’s DevOps with a product mindset. Either way, it’s the recognition that telling …
Read Article →CPU-based autoscaling is a lie for most web services. There, I said it.
I spent a painful week last year watching an HPA scale our API pods from 3 to …
Read Article →Goroutines are cheap. Goroutine leaks are not.
I learned this the hard way at 2am on a Tuesday, staring at Grafana dashboards showing one of our …
Read Article →VPNs are not zero trust. Stop calling them that.
I can’t count how many times I’ve sat in architecture reviews where someone points at a …
Read Article →If you’re writing Python without type hints in 2026, you’re making life harder for everyone — including future you. I held out for a …
Read Article →I got a call from a startup founder last year. “Our AWS bill just hit $47,000 and we have twelve engineers.” They’d been running for …
Read Article →I’m going to say something that’ll upset people: if your developers have cluster-admin access in production, you’re running on …
Read Article →