kro: Kubernetes Operators Without the Code
kro (Kube Resource Orchestrator) is a CNCF project from Kubernetes SIG Cloud Provider that landed at version 0.9 this year. The pitch is simple: define a ResourceGraphDefinition (RGD) in YAML, and kro generates a CRD plus a live controller for it — no Go, no kubebuilder, no code-gen. You get a real reconciliation loop watching real Kubernetes resources, derived from a YAML template and CEL expressions.
It deserves a closer look: where does it sit relative to Helm and proper operators, and — perhaps most interestingly — can you use it to build toy operators to learn the operator pattern without writing code?
How kro works
kro is itself an operator. It ships as a single controller Deployment that watches ResourceGraphDefinition objects. When you apply an RGD, kro's controller reads it, registers a new CRD in the cluster, and starts an instance controller for that CRD — all at runtime, without a restart, without a build. You are not generating a static CRD YAML and deploying a hand-written controller; kro is dynamically doing both on your behalf. The RGD is the source of truth; kro is the operator that turns it into a running API.
A ResourceGraphDefinition has two parts:
- spec.schema: defines the CRD users interact with (field types, defaults, constraints) and what to surface in .status
- spec.resources: the Kubernetes resources to create per instance, wired together with ${cel.expression} references
Consider a Redis RGD that creates a ConfigMap (for runtime params), a StatefulSet, a headless Service for peer discovery, and a regular Service for client access:
```yaml
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: redis
spec:
  schema:
    apiVersion: v1alpha1
    kind: Redis
    spec:
      replicas: integer | default=1
      maxMemory: string | default="256mb"
      maxMemoryPolicy: string | default="allkeys-lru"
    status:
      ready: ${statefulset.status.readyReplicas >= 1}
      readyReplicas: ${statefulset.status.readyReplicas}
      endpoint: ${schema.metadata.name + "." + schema.metadata.namespace + ".svc.cluster.local:6379"}
  resources:
    - id: config # no upstream CEL refs → first in DAG
      template:
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: ${schema.metadata.name}-config
        data:
          MAXMEMORY: ${schema.spec.maxMemory}
          MAXMEMORY_POLICY: ${schema.spec.maxMemoryPolicy}
    - id: statefulset # refs config.metadata.name → depends on config
      readyWhen:
        - ${statefulset.status.readyReplicas >= 1}
      template:
        apiVersion: apps/v1
        kind: StatefulSet
        metadata:
          name: ${schema.metadata.name}
        spec:
          serviceName: ${schema.metadata.name + "-headless"}
          replicas: ${schema.spec.replicas}
          selector:
            matchLabels:
              app: ${schema.metadata.name}
          template:
            metadata:
              labels:
                app: ${schema.metadata.name}
            spec:
              containers:
                - name: redis
                  image: redis:7-alpine
                  command:
                    [
                      "redis-server",
                      "--maxmemory",
                      "$(MAXMEMORY)",
                      "--maxmemory-policy",
                      "$(MAXMEMORY_POLICY)",
                    ]
                  ports:
                    - containerPort: 6379
                  envFrom:
                    - configMapRef:
                        name: ${config.metadata.name}
    - id: headless # refs statefulset selector → depends on statefulset
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.metadata.name + "-headless"}
        spec:
          clusterIP: None
          selector: ${statefulset.spec.selector.matchLabels}
          ports:
            - port: 6379
    - id: client # refs statefulset selector → depends on statefulset
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.metadata.name}
        spec:
          selector: ${statefulset.spec.selector.matchLabels}
          ports:
            - port: 6379
```

kro never sees an explicit ordering declaration. It parses every ${...} expression, identifies which resource ID each one references, and builds the dependency graph from that. config has no upstream references, so it goes first. statefulset references config.metadata.name, so it goes second, and kro waits for the readyWhen gate (readyReplicas >= 1) before proceeding. headless and client both reference statefulset.spec.selector.matchLabels; they depend on statefulset but not on each other, so kro creates them in parallel.
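The inference step described above can be modelled in a few lines of Python. This is a conceptual sketch, not kro's implementation: it assumes that scanning for the leading identifier of each ${...} expression is enough to find references (real CEL parsing is more involved), and the names `infer_dependencies` and `creation_waves` are made up for illustration. The template text is abbreviated from the Redis RGD.

```python
import re
from graphlib import TopologicalSorter

# Matches the leading identifier of each ${...} expression. Simplification:
# only that first identifier is treated as a reference.
REF = re.compile(r"\$\{\s*([A-Za-z_]\w*)")

def infer_dependencies(resources):
    """Map each resource id to the set of other resource ids its template references."""
    ids = set(resources)
    deps = {}
    for rid, template_text in resources.items():
        found = set(REF.findall(template_text))
        # "schema" is the instance itself, and self-references add no edge.
        deps[rid] = {name for name in found if name in ids and name != rid}
    return deps

def creation_waves(deps):
    """Group resources into waves; members of one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Template text abbreviated from the Redis RGD (assumption: one string per resource).
redis_resources = {
    "config": "name: ${schema.metadata.name}-config",
    "statefulset": "name: ${schema.metadata.name}\nconfigMapRef: ${config.metadata.name}",
    "headless": "selector: ${statefulset.spec.selector.matchLabels}",
    "client": "selector: ${statefulset.spec.selector.matchLabels}",
}

deps = infer_dependencies(redis_resources)
waves = creation_waves(deps)
print(waves)  # [['config'], ['statefulset'], ['client', 'headless']]
```

The last wave contains both Services, which is the "no mutual dependency" case: nothing in the graph forces one to be created before the other.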
A user creates a Redis instance with:
```yaml
apiVersion: kro.run/v1alpha1
kind: Redis
metadata:
  name: session-cache
  namespace: default
spec:
  replicas: 1
  maxMemory: "512mb"
  maxMemoryPolicy: "volatile-lru"
```

kro creates four real Kubernetes resources in topological order and writes the client endpoint and readyReplicas count back to Redis.status. Patch maxMemory on the instance and kro reconciles the ConfigMap. Note that the pod reads these values through envFrom, so it only picks up the change after it is restarted; an updated ConfigMap does not by itself trigger a rollout of the StatefulSet. Delete session-cache-headless manually and kro recreates it within one reconciliation cycle.
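The self-healing behaviour comes from level-triggered reconciliation: on each cycle the controller compares desired state with what actually exists and closes the gap. A minimal sketch of that idea, assuming a plain dict stands in for the API server and that resources can be compared by equality (both are simplifications, and `reconcile` is a hypothetical name, not a kro function):

```python
def reconcile(desired, cluster):
    """Bring `cluster` in line with `desired` and return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in cluster:
            cluster[name] = dict(spec)      # missing: create it
            actions.append(("create", name))
        elif cluster[name] != spec:
            cluster[name] = dict(spec)      # drifted: overwrite with desired state
            actions.append(("update", name))
    return actions

# Two of the four resources from the session-cache instance, reduced to toy specs.
desired = {
    "session-cache-config": {"MAXMEMORY": "512mb", "MAXMEMORY_POLICY": "volatile-lru"},
    "session-cache-headless": {"clusterIP": "None", "port": 6379},
}

cluster = {}
first = reconcile(desired, cluster)      # initial creation
del cluster["session-cache-headless"]    # someone deletes the headless Service
second = reconcile(desired, cluster)     # drift is corrected
third = reconcile(desired, cluster)      # steady state: nothing to do
```

The function never records what it did last time; it only looks at the current difference, which is why a manual deletion and a first-time creation are handled by the same code path.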
kro vs. Helm
Helm and kro both sit in the "abstraction over a set of K8s resources" space, but their execution models are fundamentally different.
| | Helm | kro |
|---|---|---|
| Model | Imperative install/upgrade/rollback | Declarative continuous reconciliation |
| Render time | At helm install / helm upgrade | Continuously, on every change |
| Self-healing | No — drift is not detected | Yes — reconciliation loop corrects drift |
| Ordering | Basic hooks, no readiness-aware DAG | Full dependency graph with readiness gates |
| Status | No native aggregation | Status values surfaced from child resources |
| Schema validation | values.yaml is untyped; JSON Schema possible | Typed schema with defaults and constraints |
| Templating | Go templates + Sprig | CEL expressions |
| Distribution | Chart registries (OCI / Artifact Hub) | RGDs live as custom resources in the cluster |
| Lifecycle | User-triggered; release history in Secrets | Controller-driven; no release history |
The biggest difference: Helm renders once and walks away. If someone deletes a Deployment that Helm created, it stays deleted until the next helm upgrade. kro detects drift and reconciles it back. That is the difference between a package manager and an operator.
Helm charts are also easier to share — push to an OCI registry, anyone can install. kro RGDs are cluster-scoped objects; distributing them means distributing YAML files, not a proper release artifact. On the other hand, Helm's Go-template syntax becomes genuinely painful at scale. CEL is typed, terminates by definition, and can be statically analysed before anything runs.
kro vs. Kubernetes operators
A proper Kubernetes operator (kubebuilder, controller-runtime, Kopf, etc.) is a controller written in code. kro automates what an operator does for a specific class of problems — "create and manage a fixed set of K8s resources per instance" — and makes it declarative.
| | Custom operator | kro |
|---|---|---|
| Implementation | Go / Python / any language | YAML + CEL |
| Reconciliation loop | You write it | kro runs it |
| Logic | Arbitrary: state machines, external API calls, mutations | CEL only — no side effects, non-Turing-complete |
| External calls | Yes — call any API, write to DBs, etc. | No — CEL has no I/O |
| Complex state machines | Yes | No |
| Watches | Configurable — any resource type | Child resources only |
| Webhook support | Yes (admission, conversion) | Not directly |
| Build + deploy | Build binary, containerize, deploy controller pod | Apply an RGD YAML |
| CRD versioning | Full multi-version support | Single version (multi-version planned) |
The key constraint: CEL expressions in kro have no side effects and always terminate. That is a feature for auditability — you can prove what a definition does — but it rules out anything that goes beyond wiring Kubernetes resources together. If your operator needs to call the AWS SDK directly, update a database, or implement a non-trivial state machine, you need real code.
For the large class of operators that mostly create, update, and delete standard Kubernetes (or CRD-backed) resources in response to a custom resource, kro covers the use case entirely.
Can kro build toy operators?
Yes, and it is surprisingly good at it.
The standard advice for learning the operator pattern is "use kubebuilder and implement a simple controller." That workflow requires Go, a proper dev environment, kubebuilder scaffolding, controller-gen for CRD generation, building a container, pushing it to a registry, and deploying a controller Deployment. Getting through all that before you can reconcile your first resource can easily take a day.
With kro, a toy operator is:
```bash
kubectl apply -f my-rgd.yaml       # register CRD + start controller
kubectl apply -f my-instance.yaml  # create an instance
kubectl get myresource             # observe reconciliation
```

That is the operator pattern (CRD, controller, reconciliation loop) in three commands and a YAML file.
One feature not shown in the Redis RGD is conditional resources via includeWhen. A resource tagged with includeWhen is only created when the CEL expression evaluates to true — and if it evaluates to false, every node that depends on it is also dropped from the graph. A MicroService RGD that optionally creates an Ingress:
```yaml
- id: ingress
  includeWhen:
    - ${schema.spec.enableIngress}
  template:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    ...
```

With enableIngress: false, the Ingress is never created, never tracked, never reconciled: it does not exist in kro's graph for that instance. Flip it to true and kro creates it on the next reconciliation cycle.
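The cascading part of includeWhen (an excluded resource takes its dependents with it) is a small graph operation. The sketch below models it under stated assumptions: the graph is a dict of resource id to upstream ids, the includeWhen results are already evaluated to booleans, and the "certificate" resource depending on the Ingress is hypothetical, added only to show the cascade. It is not kro's code.

```python
def prune_excluded(deps, include):
    """Return the ids that stay in the graph.

    deps:    resource id -> set of ids it depends on
    include: resource id -> result of its includeWhen condition
             (ids missing from this dict count as included)
    """
    dropped = {rid for rid in deps if not include.get(rid, True)}
    changed = True
    while changed:                      # repeat until no further node is dropped
        changed = False
        for rid, upstream in deps.items():
            if rid not in dropped and upstream & dropped:
                dropped.add(rid)        # depends on something that was dropped
                changed = True
    return set(deps) - dropped

# Hypothetical MicroService graph; "certificate" exists only for this example.
microservice_deps = {
    "deployment": set(),
    "service": {"deployment"},
    "ingress": {"service"},
    "certificate": {"ingress"},
}

with_ingress = prune_excluded(microservice_deps, {"ingress": True})
without_ingress = prune_excluded(microservice_deps, {"ingress": False})
```

With the flag off, both the Ingress and the certificate that hangs off it disappear, while the Deployment and Service are untouched.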
What the Redis and MicroService examples together teach, without a line of Go:
- Dependency ordering inferred from CEL expression analysis, with no explicit dependsOn
- Readiness gating: upstream resources must satisfy readyWhen before dependents are created
- Parallel creation for nodes with no mutual dependency (the headless and client Services above)
- Conditional subgraphs: entire branches included or excluded at runtime
- Status aggregation — values from child resources projected back to the parent CR
- Self-healing — drift in any managed resource triggers reconciliation
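Two items from the list above, readiness gating and status aggregation, can be modelled together. The sketch assumes a dict as the cluster, a fixed topological order, and a per-resource readiness predicate that mirrors readyWhen; `reconcile_pass` and the toy `readyReplicas` field on every object are illustrative choices, not kro internals.

```python
def reconcile_pass(order, deps, is_ready, cluster):
    """Create every resource whose upstream dependencies exist and report ready."""
    created = []
    for rid in order:
        if rid in cluster:
            continue  # already created on an earlier pass
        upstream_ok = all(
            dep in cluster and is_ready.get(dep, lambda obj: True)(cluster[dep])
            for dep in deps[rid]
        )
        if upstream_ok:
            cluster[rid] = {"readyReplicas": 0}  # freshly created, nothing ready yet
            created.append(rid)
    return created

order = ["config", "statefulset", "headless", "client"]
redis_deps = {
    "config": set(),
    "statefulset": {"config"},
    "headless": {"statefulset"},
    "client": {"statefulset"},
}
# Mirrors readyWhen on the StatefulSet; resources without a gate count as ready.
is_ready = {"statefulset": lambda obj: obj["readyReplicas"] >= 1}

cluster = {}
first_pass = reconcile_pass(order, redis_deps, is_ready, cluster)
cluster["statefulset"]["readyReplicas"] = 1  # the Redis pod becomes ready
second_pass = reconcile_pass(order, redis_deps, is_ready, cluster)

# Status aggregation: project child state back onto the parent's status.
status = {
    "ready": cluster["statefulset"]["readyReplicas"] >= 1,
    "readyReplicas": cluster["statefulset"]["readyReplicas"],
}
```

The first pass stops at the StatefulSet because its gate is not yet satisfied; the two Services only appear on the pass after the pod reports ready.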
Where you hit the wall:
- External API calls (CEL has no I/O — no AWS SDK, no HTTP requests)
- Non-trivial state machines (kro has one reconciliation loop; no branching on historical state)
- Watching resources outside the managed set
- Admission or conversion webhooks
The ceiling is real, but it covers most of what a first or second operator needs to do.
Beyond basics: collections and RGD chaining
Collections (forEach) expand one resource template into N resources from a list or range. A Redis RGD with shards: 3 that expands into three StatefulSet instances, each with a different index, is a single resource definition with forEach: ${lists.range(schema.spec.shards)}. No loops in code, no Helm range template — the expansion is part of the graph.
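Conceptually, a collection is one template stamped out once per element. The sketch below shows that expansion for the three-shard case; the `{index}` placeholder, the template shape, and the resource names are this example's own inventions and say nothing about kro's actual forEach syntax or naming.

```python
def expand_collection(template, count):
    """Expand one resource template into `count` resources, one per index."""
    return [
        {key: value.format(index=i) for key, value in template.items()}
        for i in range(count)
    ]

# Hypothetical template; "{index}" is this sketch's placeholder, not kro syntax.
statefulset_template = {"kind": "StatefulSet", "name": "session-cache-shard-{index}"}

shard_resources = expand_collection(statefulset_template, 3)
shard_names = [resource["name"] for resource in shard_resources]
print(shard_names)
# ['session-cache-shard-0', 'session-cache-shard-1', 'session-cache-shard-2']
```

Changing the count from 3 to 4 yields one more entry in the list, which a reconciling controller would then see as one more resource to create.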
RGD chaining lets one instance reference outputs from another. A Database RGD exposes status.endpoint; an Application RGD consumes it via an external reference. This is GitOps-friendly composition: each RGD is a unit of abstraction, and you compose them by referencing their status fields. The dependency management across instances is still graph-based — kro waits for the upstream instance to be ready before reconciling the downstream one.
Where kro fits in the ecosystem
Helm gives you templated installs with no continuous reconciliation. kubebuilder / operator-sdk gives you a full operator with arbitrary logic. kro sits between: real continuous reconciliation and a real CRD, constrained to wiring Kubernetes resources via CEL.
The sweet spot is platform engineering — the Application CR that wraps things like Deployment + HorizontalPodAutoscaler + PodDisruptionBudget + ServiceAccount + NetworkPolicy behind five fields. That is the use case kro was designed for, and it covers it completely without requiring a single line of controller code.
For simple operators, kro removes every barrier that usually stops people from experimenting with the operator pattern. No Go toolchain, no container registry, no deployment manifests for the controller itself — just an RGD YAML and a running kro installation.