GitOps with ArgoCD: From Zero to Production
ArgoCD won the GitOps war. I’ll say it. Flux is fine—it works, it’s CNCF graduated, it has its fans—but ArgoCD’s UI alone makes it worth choosing. When something’s out of sync at 2am, I don’t want to be parsing CLI output. I want to click on a resource tree and see exactly what drifted.
I’ve been running ArgoCD in production across multiple clusters for a couple of years now, and this is the guide I wish I’d had when I started. We’ll go from a fresh install to a production-grade setup with app-of-apps, RBAC, SSO, multi-cluster management, and sane sync policies.
But first, a war story.
The kubectl apply Incident
I joined a team that was deploying to Kubernetes by running kubectl apply from developer laptops. No, really. The “deployment process” was a Confluence page with a list of commands. Step 4 was literally “make sure you’re pointed at the right cluster.” You can guess what happened.
One Thursday afternoon, a developer applied a staging config to production. The service mesh configuration got overwritten, traffic routing broke, and three microservices started talking to staging databases. We didn’t notice for forty minutes because our monitoring was also partially broken from the config change. The blast radius was enormous—data inconsistency across services, customers seeing other customers’ draft orders, the whole nightmare.
The postmortem was brutal. Not because anyone got blamed—the team was good about that—but because the root cause was so obviously preventable. There was no source of truth for what should be running in production. No audit trail. No way to roll back except “does anyone remember what the YAML looked like before?” Someone had a local copy. Maybe.
That’s when I pushed hard for GitOps, and specifically ArgoCD. Within two weeks we had it running. Within a month, nobody was allowed to kubectl apply anything to production. The difference was night and day.
Installing ArgoCD
Let’s get ArgoCD running. I’m assuming you’ve got a Kubernetes cluster and kubectl access. If you’re following along with my CI/CD with GitHub Actions setup, this slots right in as the deployment layer.
Create the namespace and install:
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
That gives you the full install with the UI. For production, I always use the HA manifests instead:
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
Grab the CLI too—you’ll need it:
brew install argocd
Get the initial admin password:
argocd admin initial-password -n argocd
Change it immediately:
argocd login <your-argocd-server>
argocd account update-password
I expose ArgoCD through an ingress with TLS rather than port-forwarding. Here’s a minimal ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server
  namespace: argocd
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
  - host: argocd.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: argocd-server
            port:
              number: 443
  tls:
  - hosts:
    - argocd.example.com
    secretName: argocd-tls
Creating Your First Application
An ArgoCD Application is a custom resource that maps a source (a Git repo, revision, and path) to a destination (a cluster and namespace). Here’s the simplest possible example:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    targetRevision: main
    path: apps/my-app
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
The key bits: syncPolicy.automated means ArgoCD will automatically sync when it detects drift between Git and the cluster. prune: true means it’ll delete resources that are no longer in Git. selfHeal: true means if someone does a sneaky kubectl edit in production, ArgoCD will revert it. That last one is non-negotiable for me after the incident I described.
You can also create apps via CLI:
argocd app create my-app \
  --repo https://github.com/your-org/k8s-manifests.git \
  --path apps/my-app \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace my-app \
  --sync-policy automated \
  --auto-prune \
  --self-heal
I prefer the declarative YAML approach though. It’s GitOps all the way down—even your ArgoCD config lives in Git.
Helm and Kustomize Integration
ArgoCD natively understands Helm charts and Kustomize overlays. You don’t need to render anything beforehand.
For Helm, point your Application source at a chart:
spec:
  source:
    repoURL: https://charts.bitnami.com/bitnami
    chart: postgresql
    targetRevision: 13.2.0
    helm:
      releaseName: postgres
      values: |
        auth:
          postgresPassword: changeme
        primary:
          persistence:
            size: 20Gi
Or reference a values file from your own repo:
spec:
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    path: charts/my-app
    targetRevision: main
    helm:
      valueFiles:
      - values-production.yaml
For Kustomize, just point at a directory with a kustomization.yaml:
spec:
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    path: overlays/production
    targetRevision: main
ArgoCD detects Kustomize automatically. If you need to pass patches or override images, you can do it inline:
spec:
  source:
    kustomize:
      images:
      - my-app=registry.example.com/my-app:v1.2.3
I use Kustomize for most things and Helm for third-party charts. That combination covers about 95% of what I need. If you’re managing multiple environments, check out my guide on GitOps multi-environment deployments for the overlay patterns I use.
The App-of-Apps Pattern
This is where ArgoCD gets really powerful. Instead of creating each Application manually, you create one “root” Application that points to a directory of Application manifests. ArgoCD then manages all of them.
Your repo structure looks like this:
argocd-apps/
├── root-app.yaml
└── apps/
    ├── my-app.yaml
    ├── postgres.yaml
    ├── redis.yaml
    ├── monitoring.yaml
    └── cert-manager.yaml
The root app:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/argocd-apps.git
    targetRevision: main
    path: apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      selfHeal: true
      prune: true
Each file in apps/ is a full Application manifest like the ones we created earlier. When you merge a new Application YAML into that directory, ArgoCD picks it up and deploys it. Delete the file, and ArgoCD removes the application and its resources.
This is how I manage everything. New service? Add an Application YAML, open a PR, get it reviewed, merge. Done. The entire state of what’s running across all clusters is visible in one Git repo. That’s the promise of GitOps actually delivered.
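For reference, a child manifest in apps/ is just an ordinary Application like the ones earlier in this guide. Here’s a sketch of a hypothetical apps/redis.yaml (the name, repo URL, and path are illustrative):

```yaml
# apps/redis.yaml -- a child Application managed by the root app
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: redis
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    targetRevision: main
    path: apps/redis
  destination:
    server: https://kubernetes.default.svc
    namespace: redis
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
```

Merge this file into apps/ and the root app deploys Redis; delete it and Redis goes away with the next sync.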
I’ve written more about managing infrastructure this way in GitOps managing infrastructure as code.
RBAC Configuration
The default ArgoCD install gives the admin account full access to everything. That’s fine for getting started, terrible for production.
ArgoCD’s RBAC uses a Casbin-based policy model. You configure it through the argocd-rbac-cm ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    p, role:dev, applications, get, */*, allow
    p, role:dev, applications, sync, */*, allow
    p, role:dev, logs, get, */*, allow
    p, role:ops, applications, *, */*, allow
    p, role:ops, clusters, get, *, allow
    p, role:ops, repositories, *, *, allow
    p, role:ops, projects, *, *, allow
    g, dev-team, role:dev
    g, ops-team, role:ops
The format is p, subject, resource, action, object, effect, where the subject is a role or user. The g lines map groups to roles—these tie into your SSO groups.
My standard setup has three roles: readonly for everyone (the default), dev for developers who can view and sync but not delete or modify app configs, and ops for the platform team who can do everything. Keep it simple. You can get more granular with project-scoped policies, but I’ve found these three cover most teams.
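If you do go granular, policies can be scoped to a project by qualifying the object field as project/app. A sketch, with a hypothetical staging project and team name:

```yaml
# Fragment of policy.csv: dev-team may sync only apps in the
# "staging" project; everything else falls back to role:readonly
policy.csv: |
  p, role:staging-dev, applications, sync, staging/*, allow
  g, dev-team, role:staging-dev
```

I’d still start with the three broad roles and only reach for this when a team genuinely needs different access per environment.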
SSO Integration
Running ArgoCD with local accounts doesn’t scale. You want SSO. ArgoCD supports OIDC, SAML, and Dex (which is bundled and acts as a federation hub).
Here’s a Dex config for Azure AD in the argocd-cm ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  url: https://argocd.example.com
  dex.config: |
    connectors:
    - type: microsoft
      id: azure-ad
      name: Azure AD
      config:
        clientID: $dex.azure.clientID
        clientSecret: $dex.azure.clientSecret
        tenant: your-tenant-id
        redirectURI: https://argocd.example.com/api/dex/callback
        groups:
        - dev-team
        - ops-team
The $dex.azure.clientID syntax references secrets from argocd-secret. Don’t put credentials in the ConfigMap directly.
Store the secrets:
kubectl -n argocd patch secret argocd-secret -p \
'{"stringData": {"dex.azure.clientID": "your-client-id", "dex.azure.clientSecret": "your-client-secret"}}'
Once SSO is working, disable the admin account:
data:
  admin.enabled: "false"
I’ve seen too many setups where the admin account stays enabled with the default password months after SSO is configured. Don’t be that team.
Sync Policies and Waves
Sync policies control how and when ArgoCD applies changes. I already showed automated with prune and selfHeal, but there’s more nuance in production.
Sync waves let you control ordering. This matters when you’ve got dependencies—you need the namespace and secrets before the deployment, the CRDs before the custom resources:
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
Lower numbers sync first. I typically use:
- `-2` for namespaces and CRDs
- `-1` for secrets and configmaps
- `0` for the main application resources (default)
- `1` for post-deploy jobs, monitoring configs
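Put together, those waves look something like this on actual manifests (resource names here are illustrative):

```yaml
# Wave -2: the namespace lands before anything inside it
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "-2"
---
# Wave -1: config the workload depends on
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
  namespace: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
data:
  LOG_LEVEL: info
# The Deployment itself carries no annotation and syncs at wave 0,
# after both of the above are healthy.
```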
Retry policies are also worth configuring:
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  retry:
    limit: 3
    backoff:
      duration: 10s
      factor: 2
      maxDuration: 3m
This handles transient failures—API server blips, webhook timeouts, that sort of thing. Without retries, you’ll get paged for things that would’ve resolved themselves.
For sync options, these are the ones I set on almost every app:
syncOptions:
- CreateNamespace=true
- PrunePropagationPolicy=foreground
- PruneLast=true
- ServerSideApply=true
ServerSideApply is particularly important if you’ve got large CRDs or resources with many fields. It avoids the annotation size limits that bite you with client-side apply. If you’re working with Kubernetes deployment strategies, server-side apply handles the more complex resource definitions cleanly.
Multi-Cluster Management
This is where ArgoCD really shines compared to alternatives. Adding a cluster is one command:
argocd cluster add my-production-cluster --name production
This creates a ServiceAccount in the target cluster and stores the credentials in ArgoCD. Then you reference it in your Application:
spec:
  destination:
    server: https://production-cluster-api.example.com
    namespace: my-app
For the app-of-apps pattern across clusters, I structure it like this:
argocd-apps/
├── clusters/
│   ├── staging/
│   │   ├── my-app.yaml
│   │   └── monitoring.yaml
│   └── production/
│       ├── my-app.yaml
│       └── monitoring.yaml
└── root-apps/
    ├── staging-root.yaml
    └── production-root.yaml
Each root app points to its cluster’s directory. The Application manifests in each cluster directory specify the correct destination.server. Clean separation, easy to audit, and PRs for production changes only touch the production directory.
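As a sketch, root-apps/staging-root.yaml would look something like this (the repo URL is a placeholder):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging-root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/argocd-apps.git
    targetRevision: main
    path: clusters/staging
  destination:
    # Root apps only create Application CRs, which live in the
    # argocd namespace on the cluster where ArgoCD itself runs
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

The child Applications in clusters/staging/ are what point at the staging cluster’s API server.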
I use ArgoCD Projects to enforce boundaries:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production
  namespace: argocd
spec:
  description: Production cluster applications
  sourceRepos:
  - https://github.com/your-org/k8s-manifests.git
  - https://github.com/your-org/argocd-apps.git
  destinations:
  - server: https://production-cluster-api.example.com
    namespace: '*'
  clusterResourceWhitelist:
  - group: '*'
    kind: '*'
  roles:
  - name: ops-only
    policies:
    - p, proj:production:ops-only, applications, *, production/*, allow
    groups:
    - ops-team
Projects restrict which repos can deploy to which clusters. The production project only allows deployments from specific repos to the production cluster. Developers can’t accidentally (or intentionally) point a staging app at production.
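To opt an app into the project, set spec.project on the Application instead of default. ArgoCD then refuses to sync if the source repo or destination falls outside what the project allows:

```yaml
spec:
  project: production  # validated against the project's sourceRepos and destinations
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    targetRevision: main
    path: apps/my-app
  destination:
    server: https://production-cluster-api.example.com
    namespace: my-app
```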
Rollback
Things go wrong. ArgoCD keeps a history of every sync, and rolling back is straightforward:
argocd app history my-app
argocd app rollback my-app <revision-id>
In the UI, it’s even simpler—click the history tab, pick a revision, hit rollback.
But here’s my take: if you’re doing GitOps properly, you shouldn’t use ArgoCD’s rollback feature. You should revert the Git commit. The rollback command puts the cluster in a state that doesn’t match Git, which is exactly the drift problem GitOps is supposed to solve. Use git revert, push, and let ArgoCD sync the reverted state. That way your Git history is the complete audit trail.
The exception is emergencies. If production is on fire and you need to buy time while someone prepares a proper revert, ArgoCD rollback is your break-glass procedure. Just make sure you follow up with the Git revert before the next sync overwrites your rollback.
If you’re building out your full pipeline, my CI/CD with Jenkins guide covers the build side that feeds into ArgoCD’s deployment side.
What I’ve Learned Running ArgoCD in Production
A few things that aren’t in the docs but matter:
Resource tracking gets expensive at scale. If you’ve got thousands of resources, bump the controller’s --status-processors and --operation-processors flags. The defaults are conservative.
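Rather than editing the controller Deployment directly, these flags can be set through the argocd-cmd-params-cm ConfigMap. The exact numbers below are illustrative—tune them to your resource counts:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  # Equivalent to --status-processors / --operation-processors;
  # raise these from the conservative defaults as app counts grow
  controller.status.processors: "50"
  controller.operation.processors: "25"
```

The controller picks the new values up on restart.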
Notifications are essential. ArgoCD has a built-in notification engine—use it. Pipe sync failures and health degradations to Slack or Teams. Don’t rely on someone watching the UI.
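A minimal sketch of a Slack integration via the argocd-notifications-cm ConfigMap, assuming a slack-token entry exists in argocd-notifications-secret (trigger and template names here are my own):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token  # resolved from argocd-notifications-secret
  trigger.on-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]
  template.app-sync-failed: |
    message: "Sync failed for {{.app.metadata.name}}: {{.app.status.operationState.message}}"
```

Apps then subscribe with an annotation like notifications.argoproj.io/subscribe.on-sync-failed.slack: deploy-alerts.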
Git webhook integration speeds everything up. Without it, ArgoCD polls every 3 minutes by default. With webhooks, syncs start within seconds of a merge.
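Wiring this up means pointing your Git provider’s webhook at ArgoCD’s /api/webhook endpoint (https://argocd.example.com/api/webhook in this setup) and sharing a secret so ArgoCD can verify the payload. For GitHub, the secret goes into argocd-secret:

```yaml
# Fragment patched into the argocd-secret Secret; the value is
# whatever shared secret you configure on the GitHub webhook
stringData:
  webhook.github.secret: replace-with-your-webhook-secret
```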
Secrets management needs a separate solution. ArgoCD syncs what’s in Git, and you shouldn’t put secrets in Git. I use Sealed Secrets or External Secrets Operator alongside ArgoCD. The sealed/external secret manifests live in Git, and the actual secret values come from Vault or AWS Secrets Manager at runtime.
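With Sealed Secrets, for example, what lands in Git is the encrypted SealedSecret rather than the Secret itself. A sketch, with a hypothetical app and an obviously truncated ciphertext placeholder:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: my-app-db
  namespace: my-app
spec:
  encryptedData:
    # Generated with: kubeseal < secret.yaml > sealed-secret.yaml
    DB_PASSWORD: AgB4...  # ciphertext placeholder; safe to commit
  template:
    metadata:
      name: my-app-db
      namespace: my-app
```

The controller in-cluster decrypts this into a regular Secret, so ArgoCD syncs the manifest like anything else without the plaintext ever touching Git.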
The team that was doing kubectl apply from laptops? They’re now one of the strongest advocates for GitOps I’ve worked with. Once you’ve experienced the confidence of knowing that Git is the single source of truth—that every change is reviewed, auditable, and reversible—going back to manual deployments feels reckless. ArgoCD makes that experience accessible without requiring everyone to become a platform engineer. The UI meets people where they are, and the declarative model keeps everything honest.
That’s the real win. Not the technology itself, but the workflow it enforces.