Build a Kubernetes operator that watches a custom resource and reconciles desired state. Applies Module 11.
Build a Kubernetes operator that watches a Custom Resource Definition and reconciles the desired state.
A K8s operator that:
SimpleApp that manages a Deployment + Service)Operators are the standard pattern for extending Kubernetes. Companies like Redis, Elastic, and MongoDB build operators to manage their products on K8s. If you can build an operator from scratch, you understand K8s at a level that most users never reach.
This project ties together everything: structs, interfaces, error handling, HTTP clients (the K8s API), concurrency (informers and work queues), and testing.
Your operator manages SimpleApp resources. A SimpleApp bundles a Deployment and a Service into a single resource:
apiVersion: apps.example.com/v1
kind: SimpleApp
metadata:
name: my-web-app
namespace: default
spec:
image: nginx:1.25
replicas: 3
port: 80
env:
- name: ENV
value: production
- name: LOG_LEVEL
value: info
When you create this resource, the operator should:
nginx:1.25# Install the CRD
kubectl apply -f config/crd.yaml
# Run the operator locally (outside the cluster)
simpleapp-operator --kubeconfig ~/.kube/config
# In another terminal, create a SimpleApp
kubectl apply -f config/samples/my-web-app.yaml
# Watch the operator create resources
kubectl get deployments
kubectl get services
kubectl get simpleapps
# Scale by editing the CR
kubectl patch simpleapp my-web-app --type merge -p '{"spec":{"replicas":5}}'
# Operator detects the change and updates the Deployment
# Delete the CR — operator cleans up
kubectl delete simpleapp my-web-app
# Deployment and Service are removed via finalizer
$ simpleapp-operator --kubeconfig ~/.kube/config
2024/01/15 10:30:01 starting SimpleApp operator
2024/01/15 10:30:01 waiting for cache sync...
2024/01/15 10:30:02 cache synced, starting workers
2024/01/15 10:30:15 reconciling SimpleApp default/my-web-app
2024/01/15 10:30:15 creating Deployment default/my-web-app (replicas=3, image=nginx:1.25)
2024/01/15 10:30:15 creating Service default/my-web-app (port=80)
2024/01/15 10:30:15 updating status: available=0/3
2024/01/15 10:30:20 updating status: available=3/3
2024/01/15 10:31:00 reconciling SimpleApp default/my-web-app
2024/01/15 10:31:00 updating Deployment default/my-web-app (replicas=3→5)
2024/01/15 10:31:00 updating status: available=3/5
2024/01/15 10:32:00 reconciling SimpleApp default/my-web-app (deleting)
2024/01/15 10:32:00 removing finalizer, cleaning up owned resources
2024/01/15 10:32:00 deleted Deployment default/my-web-app
2024/01/15 10:32:00 deleted Service default/my-web-app
CRD definition: Write a YAML file that registers the SimpleApp custom resource with the K8s API server. Include the spec and status subresources.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: simpleapps.apps.example.com
spec:
group: apps.example.com
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
image:
type: string
replicas:
type: integer
port:
type: integer
env:
type: array
items:
type: object
properties:
name:
type: string
value:
type: string
status:
type: object
properties:
availableReplicas:
type: integer
conditions:
type: array
items:
type: object
properties:
type:
type: string
status:
type: string
lastTransitionTime:
type: string
subresources:
status: {}
scope: Namespaced
names:
plural: simpleapps
singular: simpleapp
kind: SimpleApp
Go types: Define the SimpleApp as Go structs using the unstructured client or typed client approach:
type SimpleApp struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec SimpleAppSpec `json:"spec,omitempty"`
Status SimpleAppStatus `json:"status,omitempty"`
}
type SimpleAppSpec struct {
Image string `json:"image"`
Replicas int32 `json:"replicas"`
Port int32 `json:"port"`
Env []EnvVar `json:"env,omitempty"`
}
type EnvVar struct {
Name string `json:"name"`
Value string `json:"value"`
}
type SimpleAppStatus struct {
AvailableReplicas int32 `json:"availableReplicas"`
Conditions []Condition `json:"conditions,omitempty"`
}
Controller setup: Use controller-runtime (the library behind kubebuilder and operator-sdk) to set up the manager, controller, and reconciler:
mgr, err := ctrl.NewManager(cfg, ctrl.Options{})
ctrl.NewControllerManagedBy(mgr).
For(&SimpleApp{}).
Owns(&appsv1.Deployment{}).
Owns(&corev1.Service{}).
Complete(&SimpleAppReconciler{})
Reconcile loop: The Reconcile function should:
DeletionTimestamp), run cleanup and remove the finalizerOwner references: Set owner references on created Deployments and Services so K8s garbage collection cleans them up if the operator isn't running:
ctrl.SetControllerReference(simpleApp, deployment, r.Scheme)
Finalizers: Add a finalizer to the SimpleApp on creation. On deletion, clean up owned resources and remove the finalizer:
const finalizerName = "apps.example.com/finalizer"
// Add finalizer
if !controllerutil.ContainsFinalizer(app, finalizerName) {
controllerutil.AddFinalizer(app, finalizerName)
r.Update(ctx, app)
}
// Remove finalizer after cleanup
controllerutil.RemoveFinalizer(app, finalizerName)
Idempotency: The reconcile loop must be safe to run multiple times. Use CreateOrUpdate to avoid conflicts:
_, err := ctrl.CreateOrUpdate(ctx, r.Client, deployment, func() error {
// Set desired state on the deployment
deployment.Spec.Replicas = &app.Spec.Replicas
deployment.Spec.Template.Spec.Containers[0].Image = app.Spec.Image
return ctrl.SetControllerReference(app, deployment, r.Scheme)
})
For a SimpleApp with image: nginx:1.25, replicas: 3, port: 80:
Deployment:
app.kubernetes.io/name: <name>, app.kubernetes.io/managed-by: simpleapp-operatorService:
simpleapp-operator/
├── main.go ← Entry point, create manager, start controller
├── api/
│ └── v1/
│ └── types.go ← SimpleApp, SimpleAppSpec, SimpleAppStatus structs
├── controller/
│ ├── reconciler.go ← Reconcile loop
│ ├── reconciler_test.go ← Tests with fake client
│ ├── deployment.go ← Build desired Deployment from SimpleApp spec
│ └── service.go ← Build desired Service from SimpleApp spec
├── config/
│ ├── crd.yaml ← CustomResourceDefinition
│ └── samples/
│ └── my-web-app.yaml ← Example SimpleApp resource
└── go.mod
Suggested approach:
- Start by installing the CRD and creating a SimpleApp with
kubectl— verify the API server accepts it- Set up the controller-runtime manager and a basic reconciler that just logs when it's called
- Add Deployment creation — hardcode the spec first, then read from the SimpleApp
- Add Service creation
- Add status updates — read the Deployment's status and write it back to the SimpleApp
- Add the finalizer and deletion handling
- Add idempotent updates (CreateOrUpdate)
- Add tests with the fake client
func buildDeployment(app *SimpleApp) *appsv1.Deployment {
labels := map[string]string{
"app.kubernetes.io/name": app.Name,
"app.kubernetes.io/managed-by": "simpleapp-operator",
}
return &appsv1.Deployment{
ObjectMeta: metav1.ObjectMeta{
Name: app.Name,
Namespace: app.Namespace,
},
Spec: appsv1.DeploymentSpec{
Replicas: &app.Spec.Replicas,
Selector: &metav1.LabelSelector{
MatchLabels: labels,
},
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{Labels: labels},
Spec: corev1.PodSpec{
Containers: []corev1.Container{{
Name: app.Name,
Image: app.Spec.Image,
Ports: []corev1.ContainerPort{{
ContainerPort: app.Spec.Port,
}},
}},
},
},
},
}
}
func (r *SimpleAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
log := log.FromContext(ctx)
// 1. Fetch the SimpleApp
var app SimpleApp
if err := r.Get(ctx, req.NamespacedName, &app); err != nil {
return ctrl.Result{}, client.IgnoreNotFound(err)
}
// 2. Handle deletion
if !app.DeletionTimestamp.IsZero() {
// cleanup + remove finalizer
return ctrl.Result{}, nil
}
// 3. Ensure finalizer
// 4. Create/update Deployment
// 5. Create/update Service
// 6. Update status
return ctrl.Result{}, nil
}
You don't need a real cluster for most testing. Use these approaches:
The controller-runtime fake client lets you test reconciliation logic without a cluster:
func TestReconcile_CreatesDeployment(t *testing.T) {
scheme := runtime.NewScheme()
_ = appsv1.AddToScheme(scheme)
_ = corev1.AddToScheme(scheme)
// Register your custom types too
app := &SimpleApp{
ObjectMeta: metav1.ObjectMeta{
Name: "test-app",
Namespace: "default",
},
Spec: SimpleAppSpec{
Image: "nginx:1.25",
Replicas: 3,
Port: 80,
},
}
fakeClient := fake.NewClientBuilder().
WithScheme(scheme).
WithObjects(app).
Build()
reconciler := &SimpleAppReconciler{
Client: fakeClient,
Scheme: scheme,
}
_, err := reconciler.Reconcile(context.Background(), ctrl.Request{
NamespacedName: types.NamespacedName{Name: "test-app", Namespace: "default"},
})
if err != nil {
t.Fatal(err)
}
// Verify the Deployment was created
var dep appsv1.Deployment
err = fakeClient.Get(context.Background(), types.NamespacedName{
Name: "test-app", Namespace: "default",
}, &dep)
if err != nil {
t.Fatal("expected Deployment to be created")
}
if *dep.Spec.Replicas != 3 {
t.Errorf("replicas = %d, want 3", *dep.Spec.Replicas)
}
}
envtest runs a real API server and etcd locally — no cluster needed:
func TestMain(m *testing.M) {
testEnv = &envtest.Environment{
CRDDirectoryPaths: []string{"../config"},
}
cfg, err := testEnv.Start()
// ... set up manager with cfg ...
code := m.Run()
testEnv.Stop()
os.Exit(code)
}
The Deployment and Service builder functions are pure — test them directly:
func TestBuildDeployment(t *testing.T) {
app := &SimpleApp{
ObjectMeta: metav1.ObjectMeta{Name: "web", Namespace: "prod"},
Spec: SimpleAppSpec{
Image: "myapp:v2",
Replicas: 5,
Port: 8080,
},
}
dep := buildDeployment(app)
if dep.Name != "web" {
t.Errorf("name = %q, want %q", dep.Name, "web")
}
if *dep.Spec.Replicas != 5 {
t.Errorf("replicas = %d, want 5", *dep.Spec.Replicas)
}
if dep.Spec.Template.Spec.Containers[0].Image != "myapp:v2" {
t.Errorf("image = %q, want %q", dep.Spec.Template.Spec.Containers[0].Image, "myapp:v2")
}
}
func TestBuildService(t *testing.T) {
app := &SimpleApp{
ObjectMeta: metav1.ObjectMeta{Name: "web", Namespace: "prod"},
Spec: SimpleAppSpec{Port: 8080},
}
svc := buildService(app)
if svc.Spec.Ports[0].Port != 8080 {
t.Errorf("port = %d, want 8080", svc.Spec.Ports[0].Port)
}
}
SimpleJob CRD that manages a K8s Job instead of a Deployment--namespace flag, or all namespaces by defaultSkills Used: controller-runtime, client-go, Custom Resource Definitions, reconciliation pattern, owner references, finalizers, status subresources, fake client testing, envtest, struct embedding, JSON tags, context propagation, error handling.