ArgoCD won the GitOps war. I’ll say it. Flux is fine—it works, it’s CNCF graduated, it has its fans—but ArgoCD’s UI alone makes it worth choosing. When something’s out of sync at 2am, I don’t want to be parsing CLI output. I want to click on a resource tree and see exactly what drifted.

I’ve been running ArgoCD in production across multiple clusters for a couple of years now, and this is the guide I wish I’d had when I started. We’ll go from a fresh install to a production-grade setup with app-of-apps, RBAC, SSO, multi-cluster management, and sane sync policies.

But first, a war story.

The kubectl apply Incident

I joined a team that was deploying to Kubernetes by running kubectl apply from developer laptops. No, really. The “deployment process” was a Confluence page with a list of commands. Step 4 was literally “make sure you’re pointed at the right cluster.” You can guess what happened.

One Thursday afternoon, a developer applied a staging config to production. The service mesh configuration got overwritten, traffic routing broke, and three microservices started talking to staging databases. We didn’t notice for forty minutes because our monitoring was also partially broken from the config change. The blast radius was enormous—data inconsistency across services, customers seeing other customers’ draft orders, the whole nightmare.

The postmortem was brutal. Not because anyone got blamed—the team was good about that—but because the root cause was so obviously preventable. There was no source of truth for what should be running in production. No audit trail. No way to roll back except “does anyone remember what the YAML looked like before?” Someone had a local copy. Maybe.

That’s when I pushed hard for GitOps, and specifically ArgoCD. Within two weeks we had it running. Within a month, nobody was allowed to kubectl apply anything to production. The difference was night and day.


Installing ArgoCD

Let’s get ArgoCD running. I’m assuming you’ve got a Kubernetes cluster and kubectl access. If you’re following along with my CI/CD with GitHub Actions setup, this slots right in as the deployment layer.

Create the namespace and install:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

That gives you the full install with the UI. For production, I always use the HA manifests instead:

kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml

Grab the CLI too—you’ll need it:

brew install argocd

Get the initial admin password:

argocd admin initial-password -n argocd

Change it immediately:

argocd login <your-argocd-server>
argocd account update-password

I expose ArgoCD through an ingress with TLS rather than port-forwarding. With SSL passthrough, nginx hands the encrypted traffic straight to argocd-server, which terminates TLS itself. Here’s a minimal ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server
  namespace: argocd
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
    - host: argocd.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 443
  tls:
    - hosts:
        - argocd.example.com
      secretName: argocd-tls

Creating Your First Application

An ArgoCD Application is a CRD that maps a Git source (repo, revision, and path) to a destination cluster and namespace. Here’s the simplest possible example:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    targetRevision: main
    path: apps/my-app
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

The key bits: syncPolicy.automated means ArgoCD syncs on its own whenever Git changes, instead of waiting for someone to click a button. prune: true means it’ll delete resources that are no longer in Git. selfHeal: true means if someone does a sneaky kubectl edit in production, ArgoCD will revert it to match Git. That last one is non-negotiable for me after the incident I described.

You can also create apps via CLI:

argocd app create my-app \
  --repo https://github.com/your-org/k8s-manifests.git \
  --path apps/my-app \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace my-app \
  --sync-policy automated \
  --auto-prune \
  --self-heal

I prefer the declarative YAML approach though. It’s GitOps all the way down—even your ArgoCD config lives in Git.


Helm and Kustomize Integration

ArgoCD natively understands Helm charts and Kustomize overlays. You don’t need to render anything beforehand.

For Helm, point your Application source at a chart:

spec:
  source:
    repoURL: https://charts.bitnami.com/bitnami
    chart: postgresql
    targetRevision: 13.2.0
    helm:
      releaseName: postgres
      values: |
        auth:
          postgresPassword: changeme
        primary:
          persistence:
            size: 20Gi

Or reference a values file from your own repo:

spec:
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    path: charts/my-app
    targetRevision: main
    helm:
      valueFiles:
        - values-production.yaml

For Kustomize, just point at a directory with a kustomization.yaml:

spec:
  source:
    repoURL: https://github.com/your-org/k8s-manifests.git
    path: overlays/production
    targetRevision: main

ArgoCD detects Kustomize automatically. If you need to pass patches or override images, you can do it inline:

spec:
  source:
    kustomize:
      images:
        - my-app=registry.example.com/my-app:v1.2.3

I use Kustomize for most things and Helm for third-party charts. That combination covers about 95% of what I need. If you’re managing multiple environments, check out my guide on GitOps multi-environment deployments for the overlay patterns I use.


The App-of-Apps Pattern

This is where ArgoCD gets really powerful. Instead of creating each Application manually, you create one “root” Application that points to a directory of Application manifests. ArgoCD then manages all of them.

Your repo structure looks like this:

argocd-apps/
├── root-app.yaml
└── apps/
    ├── my-app.yaml
    ├── postgres.yaml
    ├── redis.yaml
    ├── monitoring.yaml
    └── cert-manager.yaml

The root app:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/argocd-apps.git
    targetRevision: main
    path: apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      selfHeal: true
      prune: true

Each file in apps/ is a full Application manifest like the ones we created earlier. When you merge a new Application YAML into that directory, ArgoCD picks it up and deploys it. Delete the file, and ArgoCD removes the Application—add the resources-finalizer.argocd.argoproj.io finalizer to child apps if you want their cluster resources cleaned up too.

This is how I manage everything. New service? Add an Application YAML, open a PR, get it reviewed, merge. Done. The entire state of what’s running across all clusters is visible in one Git repo. That’s the promise of GitOps actually delivered.

I’ve written more about managing infrastructure this way in GitOps managing infrastructure as code.


RBAC Configuration

The default ArgoCD install gives the admin account full access to everything. That’s fine for getting started, terrible for production.

ArgoCD’s RBAC uses a Casbin-based policy model. You configure it through the argocd-rbac-cm ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    p, role:dev, applications, get, */*, allow
    p, role:dev, applications, sync, */*, allow
    p, role:dev, logs, get, */*, allow

    p, role:ops, applications, *, */*, allow
    p, role:ops, clusters, get, *, allow
    p, role:ops, repositories, *, *, allow
    p, role:ops, projects, *, *, allow

    g, dev-team, role:dev
    g, ops-team, role:ops

The format is p, subject, resource, action, object, effect, where the subject is a role or user. The g lines map groups to roles—these tie into your SSO groups.

My standard setup has three roles: readonly for everyone (the default), dev for developers who can view and sync but not delete or modify app configs, and ops for the platform team who can do everything. Keep it simple. You can get more granular with project-scoped policies, but I’ve found these three cover most teams.


SSO Integration

Running ArgoCD with local accounts doesn’t scale. You want SSO. ArgoCD supports OIDC, SAML, and Dex (which is bundled and acts as a federation hub).

Here’s a Dex config for Azure AD in the argocd-cm ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  url: https://argocd.example.com
  dex.config: |
    connectors:
      - type: microsoft
        id: azure-ad
        name: Azure AD
        config:
          clientID: $dex.azure.clientID
          clientSecret: $dex.azure.clientSecret
          tenant: your-tenant-id
          redirectURI: https://argocd.example.com/api/dex/callback
          groups:
            - dev-team
            - ops-team

The $dex.azure.clientID syntax references secrets from argocd-secret. Don’t put credentials in the ConfigMap directly.

Store the secrets:

kubectl -n argocd patch secret argocd-secret -p \
  '{"stringData": {"dex.azure.clientID": "your-client-id", "dex.azure.clientSecret": "your-client-secret"}}'

Once SSO is working, disable the built-in admin account in the argocd-cm ConfigMap:

data:
  admin.enabled: "false"

I’ve seen too many setups where the admin account stays enabled with the default password months after SSO is configured. Don’t be that team.


Sync Policies and Waves

Sync policies control how and when ArgoCD applies changes. I already showed automated with prune and selfHeal, but there’s more nuance in production.

Sync waves let you control ordering. This matters when you’ve got dependencies—you need the namespace and secrets before the deployment, the CRDs before the custom resources:

metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "-1"

Lower numbers sync first. I typically use:

  • -2 for namespaces and CRDs
  • -1 for secrets and configmaps
  • 0 for the main application resources (default)
  • 1 for post-deploy jobs, monitoring configs
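As a sketch of how the waves fit together (resource names and image are hypothetical), the annotation goes on the manifests themselves:

```yaml
# Wave -1: applied before the main application resources
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
data:
  LOG_LEVEL: info
---
# Wave 1: runs after the Deployment and Service are healthy
apiVersion: batch/v1
kind: Job
metadata:
  name: post-deploy-smoke-test
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: smoke-test
          image: curlimages/curl:8.7.1
          args: ["-fsS", "http://my-app/healthz"]
```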

Retry policies are also worth configuring:

syncPolicy:
  automated:
    prune: true
    selfHeal: true
  retry:
    limit: 3
    backoff:
      duration: 10s
      factor: 2
      maxDuration: 3m

This handles transient failures—API server blips, webhook timeouts, that sort of thing. Without retries, you’ll get paged for things that would’ve resolved themselves.

For sync options, these are the ones I set on almost every app:

syncOptions:
  - CreateNamespace=true
  - PrunePropagationPolicy=foreground
  - PruneLast=true
  - ServerSideApply=true

ServerSideApply is particularly important if you’ve got large CRDs or resources with many fields. It avoids the annotation size limits that bite you with client-side apply. If you’re working with Kubernetes deployment strategies, server-side apply handles the more complex resource definitions cleanly.


Multi-Cluster Management

This is where ArgoCD really shines compared to alternatives. Adding a cluster is one command:

argocd cluster add my-production-cluster --name production

This creates a ServiceAccount in the target cluster and stores the credentials in ArgoCD. Then you reference it in your Application:

spec:
  destination:
    server: https://production-cluster-api.example.com
    namespace: my-app

For the app-of-apps pattern across clusters, I structure it like this:

argocd-apps/
├── clusters/
│   ├── staging/
│   │   ├── my-app.yaml
│   │   └── monitoring.yaml
│   └── production/
│       ├── my-app.yaml
│       └── monitoring.yaml
└── root-apps/
    ├── staging-root.yaml
    └── production-root.yaml

Each root app points to its cluster’s directory. The Application manifests in each cluster directory specify the correct destination.server. Clean separation, easy to audit, and PRs for production changes only touch the production directory.
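A per-cluster root app is just another Application pointed at that cluster’s directory. Here’s a sketch of what staging-root.yaml might look like, following the repo names from the examples above:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging-root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/argocd-apps.git
    targetRevision: main
    path: clusters/staging
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```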

I use ArgoCD Projects to enforce boundaries:

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production
  namespace: argocd
spec:
  description: Production cluster applications
  sourceRepos:
    - https://github.com/your-org/k8s-manifests.git
    - https://github.com/your-org/argocd-apps.git
  destinations:
    - server: https://production-cluster-api.example.com
      namespace: '*'
  clusterResourceWhitelist:
    - group: '*'
      kind: '*'
  roles:
    - name: ops-only
      policies:
        - p, proj:production:ops-only, applications, *, production/*, allow
      groups:
        - ops-team

Projects restrict which repos can deploy to which clusters. The production project only allows deployments from specific repos to the production cluster. Developers can’t accidentally (or intentionally) point a staging app at production.


Rollback

Things go wrong. ArgoCD keeps a history of every sync, and rolling back is straightforward:

argocd app history my-app
argocd app rollback my-app <revision-id>

In the UI, it’s even simpler—click the history tab, pick a revision, hit rollback.

But here’s my take: if you’re doing GitOps properly, you shouldn’t use ArgoCD’s rollback feature. You should revert the Git commit. The rollback command puts the cluster in a state that doesn’t match Git, which is exactly the drift problem GitOps is supposed to solve. Use git revert, push, and let ArgoCD sync the reverted state. That way your Git history is the complete audit trail.
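The revert flow is plain Git. Here’s a self-contained demo in a throwaway repo (file name hypothetical) showing that a revert restores the previous manifest, which ArgoCD would then sync:

```shell
set -e
# Throwaway repo to demonstrate the revert flow
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email gitops@example.com
git config user.name gitops
# Good state, then a bad change
echo "replicas: 2" > deploy.yaml
git add deploy.yaml && git commit -qm "scale to 2"
echo "replicas: 50" > deploy.yaml
git commit -qam "bad change"
# Revert the bad commit; push the revert and ArgoCD syncs the old state back
git revert --no-edit HEAD >/dev/null
cat deploy.yaml
```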

The exception is emergencies. If production is on fire and you need to buy time while someone prepares a proper revert, ArgoCD rollback is your break-glass procedure. Just make sure you follow up with the Git revert before the next sync overwrites your rollback.

If you’re building out your full pipeline, my CI/CD with Jenkins guide covers the build side that feeds into ArgoCD’s deployment side.


What I’ve Learned Running ArgoCD in Production

A few things that aren’t in the docs but matter:

Resource tracking gets expensive at scale. If you’ve got thousands of resources, bump the controller’s --status-processors and --operation-processors flags. The defaults are conservative.
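The flags can be set through the argocd-cmd-params-cm ConfigMap rather than editing the controller workload directly—the values here are illustrative, so tune them to your cluster:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  # Defaults are 20 and 10; raise gradually and watch controller CPU/memory
  controller.status.processors: "50"
  controller.operation.processors: "25"
```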

Notifications are essential. ArgoCD has a built-in notification engine—use it. Pipe sync failures and health degradations to Slack or Teams. Don’t rely on someone watching the UI.

Git webhook integration speeds everything up. Without it, ArgoCD polls every 3 minutes by default. With webhooks, syncs start within seconds of a merge.
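If you do stick with polling somewhere, the interval is configurable in argocd-cm (180s is the default):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Polling interval; with webhooks this becomes a fallback, not the trigger
  timeout.reconciliation: 180s
```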

Secrets management needs a separate solution. ArgoCD syncs what’s in Git, and you shouldn’t put secrets in Git. I use Sealed Secrets or External Secrets Operator alongside ArgoCD. The sealed/external secret manifests live in Git, and the actual secret values come from Vault or AWS Secrets Manager at runtime.
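As a sketch of the External Secrets Operator flow (the store name and secret path are assumptions for illustration): this manifest lives in Git, while the actual value is pulled from AWS Secrets Manager at runtime:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: my-app-db
  namespace: my-app
spec:
  refreshInterval: 1h
  secretStoreRef:
    # Assumes a ClusterSecretStore named aws-secrets-manager is already configured
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: my-app-db   # the Kubernetes Secret that gets created
  data:
    - secretKey: password
      remoteRef:
        key: prod/my-app/db
        property: password
```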

The team that was doing kubectl apply from laptops? They’re now one of the strongest advocates for GitOps I’ve worked with. Once you’ve experienced the confidence of knowing that Git is the single source of truth—that every change is reviewed, auditable, and reversible—going back to manual deployments feels reckless. ArgoCD makes that experience accessible without requiring everyone to become a platform engineer. The UI meets people where they are, and the declarative model keeps everything honest.

That’s the real win. Not the technology itself, but the workflow it enforces.