Operator SDK vs kubebuilder — I pick kubebuilder every time. Operator SDK wraps kubebuilder anyway, adds a layer of abstraction that mostly just gets in the way, and the documentation lags behind. Kubebuilder gives you the scaffolding, the code generation, and then gets out of your face. That’s what I want from a framework.

I built my first operator about two years ago. The task: automate database provisioning for development teams. Every time a team needed a new PostgreSQL instance, they’d file a Jira ticket, wait for the platform team to provision it, get credentials back in a Slack DM (yes, really), and manually configure their app. The whole cycle took three to five days. Sometimes longer if someone was on leave.

The operator replaced all of that. A developer would apply a Database custom resource to the cluster, and within minutes they’d have a provisioned PostgreSQL instance, credentials stored in a Kubernetes Secret, and a connection string ready to inject. No tickets. No Slack DMs. No waiting.

It wasn’t smooth getting there. The first version had a reconciliation loop that would fight with itself — creating duplicate databases because I didn’t understand idempotency in the controller context. The second version leaked connections because I was opening database admin connections in every reconcile call without closing them. The third version finally worked, and it’s been running in production since, handling about 200 database instances across three clusters.

This article is what I wish I’d read before starting. If you’ve already gone through my earlier piece on developing a K8s controller in Go, think of this as the deep dive — less “hello world,” more “here’s what actually happens when you run this in production.”


Why Operators Exist

Kubernetes is good at managing stateless workloads. You declare a Deployment, it creates ReplicaSets, pods get scheduled, traffic gets routed. The built-in controllers handle all of that.

But what about stateful things? Databases, message queues, certificate authorities, DNS records — stuff that has lifecycle requirements beyond “run N copies of this container.” The built-in controllers don’t know how to provision a PostgreSQL database, run a schema migration, or rotate credentials. That’s domain-specific knowledge.

Operators encode that domain knowledge into a controller. You define a Custom Resource Definition (CRD) that describes what you want, and the operator’s reconciliation loop figures out how to make it happen. It’s the same declarative model Kubernetes uses for everything else, extended to your specific domain.

The pattern is simple: watch for changes to your custom resource, compare desired state with actual state, take action to close the gap. Repeat forever. That’s it. The complexity is in the details.
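The shape is easy to see outside Kubernetes entirely. Here's a toy version in plain Go (a hypothetical state struct, no client-go) that converges an in-memory replica count one step per call, which is exactly the contract a real reconciler should honor:

```go
package main

import "fmt"

// Toy desired/actual state for one "workload". Hypothetical types,
// standing in for a CR's spec and the cluster's real state.
type state struct {
	desired int // replicas the user asked for
	actual  int // replicas currently "running"
}

// reconcile takes ONE step toward convergence and reports whether it
// needs to run again: the same contract as a controller returning a
// requeue from Reconcile.
func reconcile(s *state) (requeue bool) {
	switch {
	case s.actual < s.desired:
		s.actual++ // "create" one replica
	case s.actual > s.desired:
		s.actual-- // "delete" one replica
	}
	return s.actual != s.desired
}

func main() {
	s := &state{desired: 3}
	for reconcile(s) {
		// each iteration is one reconcile pass
	}
	fmt.Println(s.actual) // 3
}
```

Everything a production operator adds (finalizers, phases, status updates) is elaboration on this loop.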


Setting Up Kubebuilder

You’ll need Go 1.21+, Docker, kubectl, and access to a Kubernetes cluster. Kind works fine for development.

Install kubebuilder:

curl -L -o kubebuilder "https://go.kubebuilder.io/dl/latest/$(go env GOOS)/$(go env GOARCH)"
chmod +x kubebuilder && sudo mv kubebuilder /usr/local/bin/

Scaffold the project:

mkdir database-operator && cd database-operator
kubebuilder init --domain example.com --repo github.com/example/database-operator
kubebuilder create api --group infra --version v1alpha1 --kind Database --resource --controller

That gives you a project structure with the API types, controller skeleton, and all the boilerplate for webhooks, RBAC, and manager setup. Kubebuilder generates a lot of files. Most of them you won’t touch. The two that matter are api/v1alpha1/database_types.go and internal/controller/database_controller.go.
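For orientation, the layout looks roughly like this in recent kubebuilder versions (exact files vary by version):

```
database-operator/
├── api/v1alpha1/
│   ├── database_types.go          # CRD schema (you edit this)
│   ├── groupversion_info.go       # scheme registration
│   └── zz_generated.deepcopy.go   # generated, don't touch
├── cmd/main.go                    # manager entrypoint
├── config/                        # kustomize manifests: CRDs, RBAC, manager
├── internal/controller/
│   └── database_controller.go     # reconcile logic (you edit this)
├── Dockerfile
└── Makefile
```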


Defining the Custom Resource

The types file is where you define what your CRD looks like. Here’s what I ended up with for the database operator:

// api/v1alpha1/database_types.go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type DatabaseSpec struct {
	// Engine is the database engine (postgres, mysql)
	Engine string `json:"engine"`
	// Version is the engine version
	Version string `json:"version"`
	// StorageGB is the storage size in gigabytes
	StorageGB int `json:"storageGB"`
	// Team is the owning team name
	Team string `json:"team"`
}

type DatabasePhase string

const (
	DatabasePending      DatabasePhase = "Pending"
	DatabaseProvisioning DatabasePhase = "Provisioning"
	DatabaseReady        DatabasePhase = "Ready"
	DatabaseFailed       DatabasePhase = "Failed"
	DatabaseDeleting     DatabasePhase = "Deleting"
)

type DatabaseStatus struct {
	Phase              DatabasePhase      `json:"phase,omitempty"`
	ConnectionSecret   string             `json:"connectionSecret,omitempty"`
	Host               string             `json:"host,omitempty"`
	Message            string             `json:"message,omitempty"`
	ObservedGeneration int64              `json:"observedGeneration,omitempty"`
	Conditions         []metav1.Condition `json:"conditions,omitempty"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Engine",type=string,JSONPath=`.spec.engine`
// +kubebuilder:printcolumn:name="Phase",type=string,JSONPath=`.status.phase`
// +kubebuilder:printcolumn:name="Age",type=date,JSONPath=`.metadata.creationTimestamp`
type Database struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              DatabaseSpec   `json:"spec,omitempty"`
	Status            DatabaseStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true
type DatabaseList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Database `json:"items"`
}

func init() {
	SchemeBuilder.Register(&Database{}, &DatabaseList{})
}

A few things I learned the hard way. The +kubebuilder:printcolumn markers are worth adding early — they make kubectl get databases actually useful instead of showing just name and age. The ObservedGeneration field in status is critical for detecting whether the controller has processed the latest spec change. And use metav1.Condition for conditions instead of rolling your own — it’s the standard now and tools like ArgoCD understand it.

Run make manifests to generate the CRD YAML from those markers. Kubebuilder’s code generation is one of its best features — you annotate Go structs and it produces OpenAPI schemas, RBAC rules, and webhook configurations.
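To give a feel for what the markers produce, here's a heavily trimmed sketch of the generated CRD (the real file includes the full OpenAPI schema derived from the Go structs):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.infra.example.com
spec:
  group: infra.example.com
  names:
    kind: Database
    listKind: DatabaseList
    plural: databases
    singular: database
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      subresources:
        status: {}    # from +kubebuilder:subresource:status
      additionalPrinterColumns:
        - name: Engine
          type: string
          jsonPath: .spec.engine
        - name: Phase
          type: string
          jsonPath: .status.phase
      schema:
        openAPIV3Schema:
          # generated from the struct fields, json tags, and comments
          type: object
```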


The Reconciliation Loop

This is where the actual work happens. The reconciler gets called whenever something changes — a create, update, delete, or even a periodic resync. Your job is to look at the current state, compare it to the desired state, and take one step toward convergence.

One step. Not all the steps. This was my biggest mistake early on. I tried to do everything in a single reconcile call: provision the database, create the secret, update the status. If any step failed, the whole thing failed, and I’d end up with half-provisioned resources and no way to recover cleanly.

The pattern that works: do one thing, update status, requeue. Let the next reconcile call handle the next step. It’s slower but dramatically more reliable.

// internal/controller/database_controller.go
package controller

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	infrav1alpha1 "github.com/example/database-operator/api/v1alpha1"
)

type DatabaseReconciler struct {
	client.Client
	Scheme      *runtime.Scheme
	Provisioner DatabaseProvisioner
}

type DatabaseProvisioner interface {
	Provision(ctx context.Context, db *infrav1alpha1.Database) (host string, err error)
	GetStatus(ctx context.Context, db *infrav1alpha1.Database) (ready bool, err error)
	Delete(ctx context.Context, db *infrav1alpha1.Database) error
}

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	var db infrav1alpha1.Database
	if err := r.Get(ctx, req.NamespacedName, &db); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Handle deletion
	if !db.DeletionTimestamp.IsZero() {
		return r.reconcileDelete(ctx, &db)
	}

	// Add finalizer if missing
	if !containsFinalizer(db.Finalizers, "infra.example.com/database") {
		db.Finalizers = append(db.Finalizers, "infra.example.com/database")
		if err := r.Update(ctx, &db); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{Requeue: true}, nil
	}

	// Reconcile based on current phase
	switch db.Status.Phase {
	case "", infrav1alpha1.DatabasePending:
		return r.reconcilePending(ctx, &db)
	case infrav1alpha1.DatabaseProvisioning:
		return r.reconcileProvisioning(ctx, &db)
	case infrav1alpha1.DatabaseReady:
		return r.reconcileReady(ctx, &db)
	case infrav1alpha1.DatabaseFailed:
		return r.reconcileFailed(ctx, &db)
	default:
		log.Info("unknown phase", "phase", db.Status.Phase)
		return ctrl.Result{}, nil
	}
}

func (r *DatabaseReconciler) reconcilePending(ctx context.Context, db *infrav1alpha1.Database) (ctrl.Result, error) {
	log := log.FromContext(ctx)
	log.Info("starting database provisioning")

	host, err := r.Provisioner.Provision(ctx, db)
	if err != nil {
		return r.setPhase(ctx, db, infrav1alpha1.DatabaseFailed, fmt.Sprintf("provision failed: %v", err))
	}

	db.Status.Host = host
	return r.setPhase(ctx, db, infrav1alpha1.DatabaseProvisioning, "provisioning started")
}

func (r *DatabaseReconciler) reconcileProvisioning(ctx context.Context, db *infrav1alpha1.Database) (ctrl.Result, error) {
	ready, err := r.Provisioner.GetStatus(ctx, db)
	if err != nil {
		return r.setPhase(ctx, db, infrav1alpha1.DatabaseFailed, fmt.Sprintf("status check failed: %v", err))
	}

	if !ready {
		// Not ready yet — check again in 15 seconds
		return ctrl.Result{RequeueAfter: 15 * time.Second}, nil
	}

	// Create connection secret
	if err := r.ensureConnectionSecret(ctx, db); err != nil {
		return r.setPhase(ctx, db, infrav1alpha1.DatabaseFailed, fmt.Sprintf("secret creation failed: %v", err))
	}

	return r.setPhase(ctx, db, infrav1alpha1.DatabaseReady, "database is ready")
}

func (r *DatabaseReconciler) reconcileReady(ctx context.Context, db *infrav1alpha1.Database) (ctrl.Result, error) {
	// Check if spec changed (generation mismatch)
	if db.Generation != db.Status.ObservedGeneration {
		return r.setPhase(ctx, db, infrav1alpha1.DatabasePending, "spec changed, reprovisioning")
	}
	// Periodic health check
	return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
}

func (r *DatabaseReconciler) reconcileFailed(ctx context.Context, db *infrav1alpha1.Database) (ctrl.Result, error) {
	// Retry after backoff
	return ctrl.Result{RequeueAfter: 1 * time.Minute}, nil
}

func (r *DatabaseReconciler) reconcileDelete(ctx context.Context, db *infrav1alpha1.Database) (ctrl.Result, error) {
	log := log.FromContext(ctx)
	log.Info("deleting database")

	if err := r.Provisioner.Delete(ctx, db); err != nil {
		log.Error(err, "failed to delete database, will retry")
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	// Remove finalizer
	db.Finalizers = removeFinalizer(db.Finalizers, "infra.example.com/database")
	if err := r.Update(ctx, db); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}

func (r *DatabaseReconciler) setPhase(ctx context.Context, db *infrav1alpha1.Database, phase infrav1alpha1.DatabasePhase, msg string) (ctrl.Result, error) {
	db.Status.Phase = phase
	db.Status.Message = msg
	db.Status.ObservedGeneration = db.Generation

	condition := metav1.Condition{
		Type:               string(phase),
		Status:             metav1.ConditionTrue,
		ObservedGeneration: db.Generation,
		LastTransitionTime: metav1.Now(),
		Reason:             string(phase),
		Message:            msg,
	}
	meta.SetStatusCondition(&db.Status.Conditions, condition)

	if err := r.Status().Update(ctx, db); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{Requeue: true}, nil
}

func (r *DatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&infrav1alpha1.Database{}).
		Complete(r)
}

func containsFinalizer(finalizers []string, f string) bool {
	for _, fin := range finalizers {
		if fin == f {
			return true
		}
	}
	return false
}

func removeFinalizer(finalizers []string, f string) []string {
	var out []string
	for _, fin := range finalizers {
		if fin != f {
			out = append(out, fin)
		}
	}
	return out
}

There’s a lot going on here, so let me call out the important bits.

Finalizers are non-negotiable. Without a finalizer, Kubernetes will delete your custom resource immediately, and your controller never gets a chance to clean up the actual database. The finalizer says “don’t delete this until I’ve had a chance to run my cleanup logic.” I forgot finalizers in my first version and ended up with orphaned databases that nobody knew about until the cloud bill arrived. (One note on the code above: controller-runtime ships ContainsFinalizer, AddFinalizer, and RemoveFinalizer in its controllerutil package, so you can drop the hand-rolled slice helpers if you prefer.)

Phase-based reconciliation keeps each reconcile call focused. Instead of a giant function with nested if-else chains, each phase handler does one thing. Pending provisions the database. Provisioning checks if it’s ready. Ready watches for drift. Failed retries. This maps cleanly to a state machine, and it’s easy to reason about what the controller is doing at any point.

Status updates use the status subresource. That r.Status().Update() call is important — it only updates the status, not the spec. If you use r.Update() for status changes, you’ll trigger another reconcile (because the resource changed), which triggers another status update, which triggers another reconcile. I’ve seen operators burn through their rate limits this way. Always use the status subresource.

ObservedGeneration tells you whether the controller has seen the latest spec. Kubernetes increments .metadata.generation on spec changes. By storing it in status, you can detect “the user changed the spec but the controller hasn’t processed it yet.” This is how the Ready phase knows to re-provision when someone changes the storage size.


The Connection Secret

When the database is ready, the operator creates a Kubernetes Secret with connection details. This is what application pods actually consume:

func (r *DatabaseReconciler) ensureConnectionSecret(ctx context.Context, db *infrav1alpha1.Database) error {
	secretName := fmt.Sprintf("%s-conn", db.Name)
	secret := &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Name:      secretName,
			Namespace: db.Namespace,
			Labels: map[string]string{
				"app.kubernetes.io/managed-by": "database-operator",
				"infra.example.com/team":       db.Spec.Team,
			},
		},
		StringData: map[string]string{
			"host":     db.Status.Host,
			"port":     "5432",
			"database": db.Name,
			"uri":      fmt.Sprintf("postgresql://app:$(PASSWORD)@%s:5432/%s?sslmode=require", db.Status.Host, db.Name),
		},
	}

	// Set owner reference so the secret gets garbage collected with the Database
	if err := ctrl.SetControllerReference(db, secret, r.Scheme); err != nil {
		return err
	}

	// Create or update
	existing := &corev1.Secret{}
	err := r.Get(ctx, client.ObjectKeyFromObject(secret), existing)
	if errors.IsNotFound(err) {
		return r.Create(ctx, secret)
	}
	if err != nil {
		return err
	}
	existing.StringData = secret.StringData
	return r.Update(ctx, existing)
}

The owner reference is key. When someone deletes the Database resource, Kubernetes automatically garbage-collects the Secret. No orphaned secrets cluttering up the namespace.
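Concretely, SetControllerReference stamps an ownerReferences entry onto the Secret's metadata. The result looks something like this (the UID is made up; it's copied from the owning Database):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: orders-db-conn
  namespace: backend
  ownerReferences:
    - apiVersion: infra.example.com/v1alpha1
      kind: Database
      name: orders-db
      uid: 3f8a...              # the Database's UID
      controller: true
      blockOwnerDeletion: true
```

When the Database with that UID disappears, the garbage collector deletes the Secret.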


Testing the Operator

Kubebuilder sets up envtest for integration testing — it runs a real API server and etcd in-process, so you can test your controller against actual Kubernetes APIs without a full cluster:

// internal/controller/database_controller_test.go
var _ = Describe("Database Controller", func() {
	ctx := context.Background()

	It("should provision a database and create a connection secret", func() {
		db := &infrav1alpha1.Database{
			ObjectMeta: metav1.ObjectMeta{
				Name:      "test-db",
				Namespace: "default",
			},
			Spec: infrav1alpha1.DatabaseSpec{
				Engine:    "postgres",
				Version:   "15",
				StorageGB: 20,
				Team:      "backend",
			},
		}

		Expect(k8sClient.Create(ctx, db)).To(Succeed())

		// Wait for Ready phase
		Eventually(func() infrav1alpha1.DatabasePhase {
			var fetched infrav1alpha1.Database
			_ = k8sClient.Get(ctx, client.ObjectKeyFromObject(db), &fetched)
			return fetched.Status.Phase
		}, 30*time.Second, time.Second).Should(Equal(infrav1alpha1.DatabaseReady))

		// Verify connection secret exists
		var secret corev1.Secret
		Expect(k8sClient.Get(ctx, client.ObjectKey{
			Name: "test-db-conn", Namespace: "default",
		}, &secret)).To(Succeed())
		Expect(secret.Data).To(HaveKey("host"))
	})
})

Run the tests with make test. Envtest is slower than pure unit tests but catches issues that mocks never will — RBAC problems, status subresource behavior, finalizer ordering. I run envtest in CI and pure unit tests locally for fast feedback.


Deploying to Production

Here’s the Deployment manifest I use. A few things are non-obvious:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-operator
  namespace: database-operator-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: database-operator
  template:
    metadata:
      labels:
        app: database-operator
    spec:
      serviceAccountName: database-operator
      containers:
        - name: manager
          image: ghcr.io/example/database-operator:v0.3.0
          args:
            - --leader-elect
            - --health-probe-bind-address=:8081
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              memory: 256Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            initialDelaySeconds: 15
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
            initialDelaySeconds: 5

Two replicas with leader election. The --leader-elect flag means only one instance runs the reconciliation loop at a time. The other sits idle, ready to take over if the leader dies. This gives you zero-downtime upgrades and crash recovery. Without leader election, two replicas would both try to reconcile the same resources and you’d get race conditions.
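For reference, --leader-elect just flips an option on the manager in cmd/main.go. The wiring looks roughly like this in kubebuilder's generated main (a sketch; the fields are from controller-runtime's ctrl.Options, and the lease ID string is arbitrary):

```go
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
	Scheme:                 scheme,
	HealthProbeBindAddress: ":8081",
	LeaderElection:         true, // only the lease holder reconciles
	LeaderElectionID:       "database-operator.example.com",
})
```

Under the hood this is a Lease object in the coordination.k8s.io group, so the operator's RBAC needs lease permissions; kubebuilder's generated config/rbac includes them.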

Resource limits matter. Operators that watch many resources can consume surprising amounts of memory because of the informer cache. Start conservative and monitor. I’ve seen operators OOM-killed because someone added a watch on a high-cardinality resource type without thinking about cache size.

Build and deploy:

make docker-build docker-push IMG=ghcr.io/example/database-operator:v0.3.0
make deploy IMG=ghcr.io/example/database-operator:v0.3.0

Then apply a Database resource and watch it work:

apiVersion: infra.example.com/v1alpha1
kind: Database
metadata:
  name: orders-db
  namespace: backend
spec:
  engine: postgres
  version: "15"
  storageGB: 50
  team: orders

$ kubectl get databases -n backend
NAME        ENGINE     PHASE   AGE
orders-db   postgres   Ready   2m

Lessons from Production

After running this operator for over a year, here’s what I’d tell myself at the start.

Rate limiting saves you. The default controller-runtime rate limiter is fine for most cases, but if your operator talks to external APIs (cloud provider, database admin endpoints), add your own rate limiting. I hit AWS API throttling during a mass-reconcile event and every single database showed as Failed until the backoff cleared.

Metrics are not optional. controller-runtime exposes Prometheus metrics out of the box — reconcile duration, queue depth, error counts. Wire them up from day one. When reconcile latency starts creeping up, you want to know before users notice. I covered some of the concurrency patterns that help with this in my Go concurrency patterns article — worker pools and context cancellation are directly applicable.

Don’t reconcile the world on startup. When the operator restarts, it re-lists all resources and reconciles them. If you have 200 databases and each reconcile calls an external API, that’s 200 API calls in the first few seconds. Use RequeueAfter with jitter to spread the load.

Version your CRDs carefully. Once a CRD is in production, changing it is painful. Use v1alpha1 until you’re confident in the schema, then promote to v1beta1, then v1. Adding fields is easy. Removing or renaming fields requires a conversion webhook. I learned this when I tried to rename storageGB to storageSizeGB and broke every existing resource.
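When you do add a second version, the markers carry the versioning metadata too. A fragment only, to show the shape: exactly one version carries the storage marker, and etcd persists objects in that version while the conversion webhook translates the others.

```go
// In api/v1beta1/database_types.go: the storage version gets the marker.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:storageversion
type Database struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              DatabaseSpec   `json:"spec,omitempty"`
	Status            DatabaseStatus `json:"status,omitempty"`
}
```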

GitOps and operators work beautifully together. We deploy our Database resources through ArgoCD, which means database provisioning is part of the same Git-driven workflow as application deployments. A developer adds a Database manifest to their app’s Helm chart, opens a PR, gets it reviewed, and ArgoCD syncs it. The operator picks it up from there. No tickets, no manual steps, full audit trail.

The operator pattern isn’t right for everything. If you’re managing something simple — a ConfigMap that needs to be synced across namespaces, say — a controller is overkill. A CronJob or a simple script would do. But for anything with real lifecycle management — provisioning, health checking, credential rotation, cleanup — operators are the cleanest abstraction Kubernetes offers. They turn operational knowledge into code, and code is something you can test, review, version, and run at 2am without waking anyone up.