kro: Kubernetes Operators Without the Code

kro (Kube Resource Orchestrator) is a Kubernetes SIG Cloud Provider subproject that landed at version 0.9 this year. The pitch is simple: define a ResourceGraphDefinition (RGD) in YAML, and kro generates a CRD plus a live controller for it, with no Go, no kubebuilder, and no code-gen. You get a real reconciliation loop watching real Kubernetes resources, derived from a YAML template and CEL expressions.

It deserves a closer look: where does it sit relative to Helm and proper operators, and — perhaps most interestingly — can you use it to build toy operators to learn the operator pattern without writing code?

How kro works

kro is itself an operator. It ships as a single controller Deployment that watches ResourceGraphDefinition objects. When you apply an RGD, kro's controller reads it, registers a new CRD in the cluster, and starts an instance controller for that CRD — all at runtime, without a restart, without a build. You are not generating a static CRD YAML and deploying a hand-written controller; kro is dynamically doing both on your behalf. The RGD is the source of truth; kro is the operator that turns it into a running API.

A ResourceGraphDefinition has two parts:

spec.schema — defines the CRD users interact with: field types, defaults, constraints, and what to surface in .status
spec.resources — the Kubernetes resources to create per instance, wired together with ${cel.expression} references
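To make the schema half concrete: fields are declared in a compact shorthand such as `replicas: integer | default=1`, which carries both a type and a default. The Python sketch below shows one way such a shorthand could be parsed and applied to a user's spec. It is an illustration only, not kro's parser; `FieldSpec`, `parse_field`, and `apply_defaults` are hypothetical names, and only the `default=` marker is handled.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FieldSpec:
    type_name: str
    default: Optional[Any] = None


def _coerce(type_name: str, raw: str) -> Any:
    """Convert the text after `default=` into a value of the declared type."""
    if type_name == "integer":
        return int(raw)
    if type_name == "boolean":
        return raw == "true"
    return raw.strip('"')  # strings may be written with or without quotes


def parse_field(shorthand: str) -> FieldSpec:
    """Parse e.g. 'integer | default=1' into a type name and a typed default."""
    type_name, *markers = [part.strip() for part in shorthand.split("|")]
    spec = FieldSpec(type_name)
    for marker in markers:
        key, _, raw = marker.partition("=")
        if key.strip() == "default":
            spec.default = _coerce(type_name, raw.strip())
    return spec


def apply_defaults(schema: Dict[str, str], user_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in every field the user omitted with its declared default."""
    return {
        name: user_spec.get(name, parse_field(shorthand).default)
        for name, shorthand in schema.items()
    }


# Hypothetical stand-in for the spec section of a Redis schema.
REDIS_SCHEMA = {
    "replicas": "integer | default=1",
    "maxMemory": 'string | default="256mb"',
    "maxMemoryPolicy": 'string | default="allkeys-lru"',
}

if __name__ == "__main__":
    print(apply_defaults(REDIS_SCHEMA, {"maxMemory": "512mb"}))
```

A user who sets only `maxMemory` ends up with `replicas` and `maxMemoryPolicy` filled in from the schema, which is the behaviour a typed schema with defaults is meant to give.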

Consider a Redis RGD that creates a ConfigMap (for runtime params), a StatefulSet, a headless Service for peer discovery, and a regular Service for client access:

yaml

apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: redis
spec:
  schema:
    apiVersion: v1alpha1
    kind: Redis
    spec:
      replicas: integer | default=1
      maxMemory: string | default="256mb"
      maxMemoryPolicy: string | default="allkeys-lru"
    status:
      ready: ${statefulset.status.readyReplicas >= 1}
      readyReplicas: ${statefulset.status.readyReplicas}
      endpoint: ${schema.metadata.name + "." + schema.metadata.namespace + ".svc.cluster.local:6379"}

  resources:
    - id: config # no upstream CEL refs → first in DAG
      template:
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: ${schema.metadata.name}-config
        data:
          MAXMEMORY: ${schema.spec.maxMemory}
          MAXMEMORY_POLICY: ${schema.spec.maxMemoryPolicy}

    - id: statefulset # refs config.metadata.name → depends on config
      readyWhen:
        - ${statefulset.status.readyReplicas >= 1}
      template:
        apiVersion: apps/v1
        kind: StatefulSet
        metadata:
          name: ${schema.metadata.name}
        spec:
          serviceName: ${schema.metadata.name + "-headless"}
          replicas: ${schema.spec.replicas}
          selector:
            matchLabels:
              app: ${schema.metadata.name}
          template:
            metadata:
              labels:
                app: ${schema.metadata.name}
            spec:
              containers:
                - name: redis
                  image: redis:7-alpine
                  command:
                    [
                      "redis-server",
                      "--maxmemory",
                      "$(MAXMEMORY)",
                      "--maxmemory-policy",
                      "$(MAXMEMORY_POLICY)",
                    ]
                  ports:
                    - containerPort: 6379
                  envFrom:
                    - configMapRef:
                        name: ${config.metadata.name}

    - id: headless # refs statefulset selector → depends on statefulset
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.metadata.name + "-headless"}
        spec:
          clusterIP: None
          selector: ${statefulset.spec.selector.matchLabels}
          ports:
            - port: 6379

    - id: client # refs statefulset selector → depends on statefulset
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.metadata.name}
        spec:
          selector: ${statefulset.spec.selector.matchLabels}
          ports:
            - port: 6379

kro never sees an explicit ordering declaration. It parses every ${...} expression, identifies which resource ID each one references, and builds the dependency graph from that. config has no upstream references — it goes first. statefulset references config.metadata.name — it goes second, and kro waits for the readyWhen gate (readyReplicas >= 1) before proceeding. headless and client both reference statefulset.spec.selector.matchLabels — they depend on statefulset but not each other, so kro creates them in parallel.
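The inference described above can be modelled in a few lines of Python. The sketch below scans every `${...}` expression in a set of templates, records which resource IDs each one mentions, and groups the resources into creation layers with the standard library's `graphlib`. It is a simplified model under stated assumptions, not kro's implementation: the regular expressions stand in for real CEL parsing, the templates are trimmed-down versions of the Redis resources, and `references` and `creation_layers` are hypothetical names.

```python
import re
from graphlib import TopologicalSorter

EXPR = re.compile(r"\$\{([^}]*)\}")      # body of each ${...} expression
STRING_LITERAL = re.compile(r'"[^"]*"')  # ignore text inside CEL string literals


def iter_strings(node):
    """Yield every string value found anywhere in a nested template."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def references(template, resource_ids):
    """Return the resource IDs that appear as root identifiers in expressions."""
    found = set()
    for text in iter_strings(template):
        for expr in EXPR.findall(text):
            bare = STRING_LITERAL.sub('""', expr)
            for rid in resource_ids:
                # Match `rid.` only when it is not part of a longer path.
                if re.search(rf"(?<![\w.]){re.escape(rid)}\.", bare):
                    found.add(rid)
    return found


def creation_layers(resources):
    """Group resource IDs into layers; each layer only depends on earlier ones."""
    ids = set(resources)
    graph = {rid: references(tpl, ids) - {rid} for rid, tpl in resources.items()}
    sorter = TopologicalSorter(graph)  # maps node -> predecessors
    sorter.prepare()                   # raises CycleError on circular references
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


# Trimmed-down templates: only the fields that carry expressions are kept.
REDIS = {
    "config": {"metadata": {"name": "${schema.metadata.name}-config"}},
    "statefulset": {
        "metadata": {"name": "${schema.metadata.name}"},
        "spec": {"envFrom": [{"configMapRef": {"name": "${config.metadata.name}"}}]},
    },
    "headless": {
        "metadata": {"name": '${schema.metadata.name + "-headless"}'},
        "spec": {"selector": "${statefulset.spec.selector.matchLabels}"},
    },
    "client": {"spec": {"selector": "${statefulset.spec.selector.matchLabels}"}},
}

if __name__ == "__main__":
    print(creation_layers(REDIS))
```

Resources that land in the same layer have no dependency on each other, which is what allows the two Services to be created side by side.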

A user creates a Redis instance with:

yaml

apiVersion: kro.run/v1alpha1
kind: Redis
metadata:
  name: session-cache
  namespace: default
spec:
  replicas: 1
  maxMemory: "512mb"
  maxMemoryPolicy: "volatile-lru"

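kro creates four real Kubernetes resources in topological order and writes the client endpoint and readyReplicas count back to Redis.status. Patch maxMemory on the instance and kro reconciles the ConfigMap. Note that the running pod reads those values through envFrom at startup, and a ConfigMap change alone does not alter the StatefulSet's pod template, so the new setting takes effect only once the pod is restarted. Delete session-cache-headless manually and kro recreates it within one reconciliation cycle.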
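The self-healing behaviour comes down to repeatedly comparing desired state with observed state. The Python sketch below is a toy model of that comparison, with a plain dictionary standing in for the cluster. It is not how kro talks to the API server (a real controller also relies on watches, ownership metadata, and apply semantics), and the `reconcile` function and resource contents are assumptions made for illustration.

```python
import copy
from typing import Dict, List


def reconcile(desired: Dict[str, dict], cluster: Dict[str, dict]) -> List[str]:
    """Make `cluster` match `desired` and return the actions that were taken."""
    actions = []
    for name, spec in desired.items():
        if name not in cluster:
            cluster[name] = copy.deepcopy(spec)
            actions.append(f"create {name}")
        elif cluster[name] != spec:
            cluster[name] = copy.deepcopy(spec)
            actions.append(f"update {name}")
    return actions


# Desired state for the session-cache instance (contents heavily abbreviated).
desired = {
    "session-cache-config": {"data": {"MAXMEMORY": "512mb"}},
    "session-cache": {"kind": "StatefulSet", "replicas": 1},
    "session-cache-headless": {"kind": "Service", "clusterIP": "None"},
    "session-cache-client": {"kind": "Service"},  # hypothetical key for the client Service
}

if __name__ == "__main__":
    cluster: Dict[str, dict] = {}
    print(reconcile(desired, cluster))    # first pass creates everything
    del cluster["session-cache-headless"]  # simulate manual deletion
    print(reconcile(desired, cluster))    # next pass recreates the Service
```

A pass that finds nothing to change returns an empty list, which is the steady state a controller spends most of its time in.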

kro vs. Helm

Helm and kro both sit in the "abstraction over a set of K8s resources" space, but their execution models are fundamentally different.

| | Helm | kro |
|---|---|---|
| Model | Imperative install/upgrade/rollback | Declarative continuous reconciliation |
| Render time | At `helm install` / `helm upgrade` | Continuously, on every change |
| Self-healing | No: drift is not detected | Yes: reconciliation loop corrects drift |
| Ordering | Basic hooks, no readiness-aware DAG | Full dependency graph with readiness gates |
| Status | No native aggregation | Status values surfaced from child resources |
| Schema validation | values.yaml is untyped; JSON Schema possible | Typed schema with defaults and constraints |
| Templating | Go templates + Sprig | CEL expressions |
| Distribution | Chart registries (OCI / Artifact Hub) | RGDs live as custom resources in the cluster |
| Lifecycle | User-triggered; release history in `Secret`s | Controller-driven; no release history |

The biggest difference: Helm renders once and walks away. If someone deletes a Deployment that Helm created, it stays deleted until the next helm upgrade. kro detects drift and reconciles it back. That is the difference between a package manager and an operator.

Helm charts are also easier to share — push to an OCI registry, anyone can install. kro RGDs are cluster-scoped objects; distributing them means distributing YAML files, not a proper release artifact. On the other hand, Helm's Go-template syntax becomes genuinely painful at scale. CEL is typed, terminates by definition, and can be statically analysed before anything runs.

kro vs. Kubernetes operators

A proper Kubernetes operator (kubebuilder, controller-runtime, Kopf, etc.) is a controller written in code. kro automates what an operator does for a specific class of problems — "create and manage a fixed set of K8s resources per instance" — and makes it declarative.

| | Custom operator | kro |
|---|---|---|
| Implementation | Go / Python / any language | YAML + CEL |
| Reconciliation loop | You write it | kro runs it |
| Logic | Arbitrary: state machines, external API calls, mutations | CEL only: no side effects, non-Turing-complete |
| External calls | Yes: call any API, write to DBs, etc. | No: CEL has no I/O |
| Complex state machines | Yes | No |
| Watches | Configurable: any resource type | Child resources only |
| Webhook support | Yes (admission, conversion) | Not directly |
| Build + deploy | Build binary, containerize, deploy controller pod | Apply an RGD YAML |
| CRD versioning | Full multi-version support | Single version (multi-version planned) |

The key constraint: CEL expressions in kro have no side effects and always terminate. That is a feature for auditability — you can prove what a definition does — but it rules out anything that goes beyond wiring Kubernetes resources together. If your operator needs to call the AWS SDK directly, update a database, or implement a non-trivial state machine, you need real code.

For the large class of operators that mostly create, update, and delete standard Kubernetes (or CRD-backed) resources in response to a custom resource, kro covers the use case entirely.

Can kro build toy operators?

Yes, and it is surprisingly good at it.

The standard advice for learning the operator pattern is "use kubebuilder and implement a simple controller." That workflow requires Go, a proper dev environment, kubebuilder scaffolding, controller-gen for CRD generation, building a container, pushing it to a registry, and deploying a controller Deployment. Getting through all that before you can reconcile your first resource takes a day.

With kro, a toy operator is:

bash

kubectl apply -f my-rgd.yaml      # register CRD + start controller
kubectl apply -f my-instance.yaml # create an instance
kubectl get myresource             # observe reconciliation

That is the operator pattern — CRD, controller, reconciliation loop — in three commands and a YAML file.

One feature not shown in the Redis RGD is conditional resources via includeWhen. A resource tagged with includeWhen is only created when the CEL expression evaluates to true — and if it evaluates to false, every node that depends on it is also dropped from the graph. A MicroService RGD that optionally creates an Ingress:

yaml

    - id: ingress
      includeWhen:
        - ${schema.spec.enableIngress}
      template:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        ...

With enableIngress: false, the Ingress is never created, never tracked, never reconciled — it does not exist in kro's graph for that instance. Flip it to true and kro creates it on the next reconciliation cycle.
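The pruning rule can be sketched in Python: start from the resources whose includeWhen condition holds, then keep removing any resource whose dependencies are no longer all present. This is a model of the rule as described here, not kro's code. The `active_resources` function is hypothetical, and apart from the Ingress the resource IDs (including a `certificate` that depends on it) are invented for the example.

```python
from typing import Dict, Set


def active_resources(depends_on: Dict[str, Set[str]],
                     included: Dict[str, bool]) -> Set[str]:
    """Return the IDs left after dropping excluded nodes and their dependents."""
    # Resources without an includeWhen entry are treated as always included.
    active = {rid for rid in depends_on if included.get(rid, True)}
    changed = True
    while changed:
        changed = False
        for rid in list(active):
            # A resource survives only if every dependency is still active.
            if not depends_on[rid] <= active:
                active.remove(rid)
                changed = True
    return active


# Hypothetical MicroService graph: certificate -> ingress -> service -> deployment.
MICROSERVICE = {
    "deployment": set(),
    "service": {"deployment"},
    "ingress": {"service"},
    "certificate": {"ingress"},
}

if __name__ == "__main__":
    print(sorted(active_resources(MICROSERVICE, {"ingress": False})))
    print(sorted(active_resources(MICROSERVICE, {"ingress": True})))
```

With the Ingress excluded, the certificate disappears along with it even though it has no condition of its own, while the Deployment and Service are unaffected.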

What the Redis and MicroService examples together teach, without a line of Go:

Dependency ordering inferred from CEL expression analysis — no explicit dependsOn
Readiness gating — upstream resources must satisfy readyWhen before dependents are created
Parallel creation for nodes with no mutual dependency (the headless and client Services above)
Conditional subgraphs — entire branches included or excluded at runtime
Status aggregation — values from child resources projected back to the parent CR
Self-healing — drift in any managed resource triggers reconciliation
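Readiness gating in particular is easy to simulate. In the Python sketch below, each pass creates only the resources whose dependencies have all been reported ready, so nothing downstream of the StatefulSet appears until its readiness condition holds. The function name `reconcile_pass` and the way readiness is supplied by the caller are assumptions made for the example; in kro the readiness signal would come from evaluating readyWhen against the live object.

```python
from typing import Dict, List, Set


def reconcile_pass(depends_on: Dict[str, Set[str]],
                   created: Set[str],
                   ready: Set[str]) -> List[str]:
    """Create every not-yet-created resource whose dependencies are all ready."""
    new = sorted(
        rid for rid, deps in depends_on.items()
        if rid not in created and deps <= ready
    )
    created.update(new)
    return new


# Dependency graph of the Redis example: ID -> IDs it references.
REDIS_DEPS = {
    "config": set(),
    "statefulset": {"config"},
    "headless": {"statefulset"},
    "client": {"statefulset"},
}

if __name__ == "__main__":
    created: Set[str] = set()
    ready: Set[str] = set()
    print(reconcile_pass(REDIS_DEPS, created, ready))  # only config can be created
    ready.add("config")
    print(reconcile_pass(REDIS_DEPS, created, ready))  # statefulset is unblocked
    print(reconcile_pass(REDIS_DEPS, created, ready))  # Services wait for readiness
    ready.add("statefulset")
    print(reconcile_pass(REDIS_DEPS, created, ready))  # both Services are created
```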

Where you hit the wall:

External API calls (CEL has no I/O — no AWS SDK, no HTTP requests)
Non-trivial state machines (kro has one reconciliation loop; no branching on historical state)
Watching resources outside the managed set
Admission or conversion webhooks

The ceiling is real, but it covers most of what a first or second operator needs to do.

Beyond basics: collections and RGD chaining

Collections (forEach) expand one resource template into N resources from a list or range. A Redis RGD with shards: 3 that expands into three StatefulSet instances, each with a different index, is a single resource definition with forEach: ${lists.range(schema.spec.shards)}. No loops in code, no Helm range template — the expansion is part of the graph.
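Conceptually, the expansion amounts to stamping out one copy of a template per element and substituting the element into each copy. The Python sketch below models that for a numeric range. It does not reproduce kro's forEach syntax or evaluation: the `${index}` placeholder, the `expand` function, and the shard names are all hypothetical.

```python
from typing import Any, List


def substitute(node: Any, var: str, value: Any) -> Any:
    """Return a copy of `node` with every ${var} placeholder replaced by value."""
    if isinstance(node, str):
        return node.replace("${" + var + "}", str(value))
    if isinstance(node, dict):
        return {key: substitute(val, var, value) for key, val in node.items()}
    if isinstance(node, list):
        return [substitute(item, var, value) for item in node]
    return node  # numbers, booleans, None are left untouched


def expand(template: dict, count: int, var: str = "index") -> List[dict]:
    """Produce `count` resources from one template, one per index in the range."""
    return [substitute(template, var, i) for i in range(count)]


# Hypothetical shard template; only the fields that vary per shard are shown.
SHARD_TEMPLATE = {
    "kind": "StatefulSet",
    "metadata": {"name": "session-cache-shard-${index}"},
    "spec": {"replicas": 1},
}

if __name__ == "__main__":
    for resource in expand(SHARD_TEMPLATE, 3):
        print(resource["metadata"]["name"])
```

Because the copies are produced from data rather than from a templating loop, changing the shard count changes the set of desired resources, and the reconciliation loop handles the rest.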

RGD chaining lets one instance reference outputs from another. A Database RGD exposes status.endpoint; an Application RGD consumes it via an external reference. This is GitOps-friendly composition: each RGD is a unit of abstraction, and you compose them by referencing their status fields. The dependency management across instances is still graph-based — kro waits for the upstream instance to be ready before reconciling the downstream one.

Where kro fits in the ecosystem

Helm gives you templated installs with no continuous reconciliation. kubebuilder and operator-sdk give you a full operator with arbitrary logic. kro sits between the two: real continuous reconciliation and a real CRD, constrained to wiring Kubernetes resources via CEL.

The sweet spot is platform engineering — the Application CR that wraps things like Deployment + HorizontalPodAutoscaler + PodDisruptionBudget + ServiceAccount + NetworkPolicy behind five fields. That is the use case kro was designed for, and it covers it completely without requiring a single line of controller code.

For simple operators, kro removes every barrier that usually stops people from experimenting with the operator pattern. No Go toolchain, no container registry, no deployment manifests for the controller itself — just an RGD YAML and a running kro installation.
