Configure Dynamic KRaft Quorum for Confluent Platform Using Confluent for Kubernetes
Use dynamic KRaft quorum to add and remove controller nodes from the metadata quorum without recreating the entire cluster. This feature is essential for recovering from failures and supporting both single-region and multi-region deployments.
Dynamic quorum is available in Confluent Platform 7.9 and later, and provides flexible quorum management for production environments. CFK supports both greenfield deployments and migrations from existing ZooKeeper or static KRaft clusters.
Supported deployment scenarios
CFK 3.2 and later supports the following deployment scenarios for dynamic KRaft quorum:
Greenfield deployments: Deploy new single-region or multi-region clusters with dynamic quorum from initial setup.
ZooKeeper to KRaft migration: Migrate existing ZooKeeper-based clusters to KRaft with dynamic quorum. This requires Confluent Platform 7.9.6 or later.
Static to dynamic migration: Upgrade existing static KRaft clusters to dynamic quorum. This requires Confluent Platform 8.0 or later.
Multi-region greenfield deployments require Confluent Platform 7.9.6 or later in the 7.9.x series, or 8.1.2 or later in the 8.1.x series. This is due to a known issue (KMETA-2870) in earlier versions where controller registration can time out when advertised listeners are present from initial startup. Static to dynamic migration is not affected by this issue. For details, see Deploy a multi-region cluster with dynamic quorum.
Prerequisites and requirements
Review the following prerequisites and requirements before configuring dynamic KRaft quorum.
Version requirements
| Feature | Version | Notes |
|---|---|---|
| KRaft (Static Quorum) | Confluent Platform 7.4 or later | The original KRaft implementation; it continues to be supported. |
| KRaft (Dynamic Quorum) | Confluent Platform 7.9 or later | Required for dynamic quorum features. Confluent Platform 7.9.6 or later is strongly recommended; earlier 7.9.x patch versions (7.9.0 through 7.9.5) have known issues. |
| Auto-Join Quorum | Confluent Platform 8.2 or later | Promotes observers to voters automatically. Recommended if you want automatic observer promotion using the auto-join feature. |
| ZooKeeper to KRaft Migration (KRaftMigrationJob) | Confluent Platform 7.9.6 or later | Required for ZooKeeper to KRaft migration with dynamic quorum. ZooKeeper is removed in Confluent Platform 8.0 and later. |
| Static to Dynamic KRaft Migration | Confluent Platform 8.0 or later | Required for upgrading from static quorum (kraft.version=0) to dynamic quorum (kraft.version=1). |
Infrastructure requirements
CFK 3.2 or later installed.
Confluent Platform 7.9.6 or later Docker images required.
Deploy a single-region cluster with dynamic quorum
This section describes how to deploy a new KRaft cluster in a single region with dynamic quorum enabled. This is the recommended approach for greenfield deployments.
Tip
For complete end-to-end examples, see Dynamic KRaft Quorum: Single-Region on GitHub. The repository includes plaintext and secured examples with complete YAML files and commands.
Prerequisites for single-region deployment
Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, single-region deployments require:
Single Kubernetes cluster with kubectl access.
CFK 3.2 or later installed and running in the cluster.
Confluent Platform 7.9.6 or later Docker images (confluentinc/cp-server:7.9.6 and confluentinc/confluent-init-container:3.2.0).
Namespace created for the deployment.
Sufficient cluster resources, such as CPU, memory, and storage, for the planned number of controllers and brokers.
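For example, if you have not already created the namespace used in the examples throughout this topic (adjust the name if you deploy to a different namespace):
kubectl create namespace confluent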
Step 1: Create bootstrap coordination ConfigMap
Create a ConfigMap to coordinate which pod becomes the bootstrap controller for dynamic quorum. This ConfigMap ensures that only one controller formats storage with --standalone mode.
Create the following ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: kraftcontroller-dynamic-quorum
namespace: confluent
data:
bootstrap-status: '{"bootstrap_formatted": false}'
Apply the ConfigMap:
kubectl apply -f kraftcontroller-bootstrap-configmap.yaml
The ConfigMap is required only on the cluster that hosts the bootstrap pod.
Important
After the cluster is running and bootstrapped, do not update this ConfigMap to mark bootstrap_formatted as false.
The absence of this ConfigMap is treated the same as bootstrap_formatted: true.
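If you need to confirm what the controllers recorded, you can read the bootstrap status back from the ConfigMap; the data key below matches the ConfigMap defined above. After the bootstrap controller has formatted storage, the status should show bootstrap_formatted as true.
kubectl get configmap kraftcontroller-dynamic-quorum -n confluent \
  -o jsonpath='{.data.bootstrap-status}'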
Step 2: Create RBAC resources
The bootstrap controller needs permissions to update the ConfigMap to signal that bootstrap formatting is complete.
Create the following RBAC resources:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kraftcontroller-sa
namespace: confluent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: kraftcontroller-bootstrap-role
namespace: confluent
rules:
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kraftcontroller-dynamic-quorum"] # Must match ConfigMap name from Step 1
verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kraftcontroller-bootstrap-rolebinding
namespace: confluent
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
name: kraftcontroller-sa
namespace: confluent
Apply the RBAC resources:
kubectl apply -f kraftcontroller-rbac.yaml
Step 3: Deploy KRaft controllers
Create and configure a KRaftController CR with dynamic quorum enabled.
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: confluent
annotations:
platform.confluent.io/broker-id-offset: "100" # Avoids ID conflicts with broker node IDs
spec:
replicas: 3
dataVolumeCapacity: 10Gi
image:
application: confluentinc/cp-server:7.9.6 # Use 7.9.6+; earlier 7.9.x versions have known issues
init: confluentinc/confluent-init-container:3.2.0
dynamicQuorumConfig:
enabled: true # Uses controller.quorum.bootstrap.servers instead of .voters
bootstrapPod: 0 # Only this pod formats with --standalone mode; others join as observers
podTemplate:
serviceAccountName: kraftcontroller-sa # Service account from Step 2
Apply the KRaftController CR:
kubectl apply -f kraftcontroller.yaml
Note
The bootstrap controller (kraftcontroller-0) formats its storage and creates a single-voter quorum. Additional controllers (kraftcontroller-1 and kraftcontroller-2) join as observers and wait to be promoted.
Step 4: Verify quorum formation
After the controllers are running, verify that the bootstrap controller is the only voter and other controllers are observers.
Check that all pods are running:
kubectl get pods -n confluent -l app=kraftcontroller
Check the quorum status:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
Expected output:
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
100 <offset> 0 <timestamp> <timestamp> Leader
101 <offset> 0 <timestamp> <timestamp> Observer
102 <offset> 0 <timestamp> <timestamp> Observer
This confirms:
Node 100 (kraftcontroller-0) is the leader and only voter.
Nodes 101 and 102 (kraftcontroller-1 and kraftcontroller-2) are observers.
All observers show Lag: 0, which means they are caught up and ready for promotion. If observers show significant lag, wait for them to catch up before promoting in the next step.
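Optionally, you can also view the voter and observer sets directly with the --status option; the output format matches the one shown in the migration sections later in this topic. At this point it should report CurrentVoters: [100] and CurrentObservers: [101,102].
kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status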
Step 5: Promote observers to voters
To create a full three-controller quorum, promote the observers to voters. On Confluent Platform 8.2 or later with auto-join enabled, observers automatically promote themselves after they are caught up with the leader. You can skip manual promotion and proceed to verification.
For Confluent Platform 7.9.x or 8.x without auto-join, manually promote observers:
# Promote kraftcontroller-1
kubectl exec kraftcontroller-1 -n confluent -- \
kafka-metadata-quorum \
--bootstrap-controller kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
--command-config <admin-client-properties-file> \
add-controller
# Promote kraftcontroller-2
kubectl exec kraftcontroller-2 -n confluent -- \
kafka-metadata-quorum \
--bootstrap-controller kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
--command-config <admin-client-properties-file> \
add-controller
You must ensure the following when promoting observers:
Wait for each promotion to complete before promoting the next controller.
The add-controller command must be run from the controller being promoted. This is required because the admin-client properties file contains advertised.listeners, which is specific to each controller pod. Running this command from a different pod results in incorrect listener information being registered.
The admin-client properties file must include global SSL/SASL security settings and advertised.listeners for the controller pod being promoted.
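The exact contents of the admin-client properties file depend on your listener security configuration. The following is a minimal sketch for a SASL_SSL-secured cluster; the file paths, SASL mechanism, listener name, and credentials are placeholders, and the advertised.listeners value must point at the address of the pod you are promoting.
# Hypothetical admin-client.properties for kraftcontroller-1; adjust to your environment.
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<admin-user>" password="<admin-password>";
ssl.truststore.location=/mnt/sslcerts/truststore.jks
ssl.truststore.password=<truststore-password>
# advertised.listeners for the controller pod being promoted; the listener name must match
# the controller listener name in the generated kafka.properties.
advertised.listeners=CONTROLLER://kraftcontroller-1.kraftcontroller.confluent.svc.cluster.local:9074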
Step 6: Verify all controllers are voters
Verify that all controllers are now voters with zero lag:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
Expected output:
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
100 <offset> 0 <timestamp> <timestamp> Leader
101 <offset> 0 <timestamp> <timestamp> Follower
102 <offset> 0 <timestamp> <timestamp> Follower
All controllers should show:
Lag of 0, which means they are fully caught up.
Recent fetch and caught-up timestamps.
Status as Leader or Follower, indicating all are voters.
Step 7: Deploy Kafka brokers
After the KRaft controller quorum is healthy, deploy Kafka brokers.
For KRaft-enabled Kafka with dynamic quorum, add a kRaftController cluster reference in the dependencies section:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: confluent
spec:
replicas: 3
dataVolumeCapacity: 100Gi
image:
application: confluentinc/cp-server:7.9.6
init: confluentinc/confluent-init-container:3.2.0
dependencies:
kRaftController:
clusterRef:
name: kraftcontroller
namespace: confluent
When dynamicQuorumConfig.enabled: true is set on the controllers, CFK automatically configures brokers to use controller.quorum.bootstrap.servers to discover the current quorum membership dynamically.
Apply the Kafka CR:
kubectl apply -f kafka.yaml
Verify the brokers are running:
kubectl get pods -n confluent -l app=kafka
Your single-region KRaft cluster with dynamic quorum is now deployed and operational.
Deploy a multi-region cluster with dynamic quorum
This section describes how to deploy a new KRaft cluster across multiple regions with dynamic quorum enabled. This is the recommended approach for multi-region (MRC) greenfield deployments requiring high availability across geographically distributed Kubernetes clusters.
Tip
For complete end-to-end examples, see Dynamic KRaft Quorum: Multi-Region Greenfield on GitHub. The repository includes secured examples with TLS, SASL/PLAIN, OAuth, and RBAC configuration.
Multi-region deployments require external access (LoadBalancer) for cross-cluster communication and advertised listeners so that controllers and brokers in different regions can communicate.
Prerequisites for multi-region deployment
In addition to the prerequisites listed in Prerequisites and requirements, multi-region deployments require:
LoadBalancer support in each Kubernetes cluster
DNS provider integration - for example, external-dns with GCP Cloud DNS, AWS Route 53, or Azure DNS
Confluent Platform 7.9.6 or later in the 7.9.x series, or 8.1.2 or later in the 8.1.x series. Other Confluent Platform versions, including 8.0.x and 8.2.x, have known issues with multi-region greenfield deployments.
Shared cluster ID across all regions
Step 1: Deploy infrastructure in both regions
Deploy the necessary infrastructure components in both Kubernetes clusters before deploying KRaft controllers.
Create namespaces in both regions:
kubectl --context <region1-context> create namespace central
kubectl --context <region2-context> create namespace east
Generate TLS certificates with wildcard SANs covering both regions. The certificates must include Subject Alternative Names (SANs) for:
Wildcard DNS for external LoadBalancer endpoints (*.yourdomain.com)
Internal Kubernetes service DNS for both namespaces
Headless service DNS for StatefulSet pods
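If you want to confirm that an existing certificate covers these names before creating the secrets, you can inspect its SANs with openssl, for example:
openssl x509 -in fullchain.pem -noout -text | grep -A1 "Subject Alternative Name"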
Create TLS secrets in both regions:
kubectl --context <region1-context> create secret generic tls-kraftcontroller \
--from-file=fullchain.pem=<path-to-fullchain> \
--from-file=privkey.pem=<path-to-private-key> \
--from-file=cacerts.pem=<path-to-ca-cert> \
-n central
kubectl --context <region2-context> create secret generic tls-kraftcontroller \
--from-file=fullchain.pem=<path-to-fullchain> \
--from-file=privkey.pem=<path-to-private-key> \
--from-file=cacerts.pem=<path-to-ca-cert> \
-n east
Create SASL/PLAIN credential secrets in both regions:
kubectl --context <region1-context> create secret generic credential \
--from-file=plain.txt=<path-to-plain-credentials> \
--from-file=plain-users.json=<path-to-user-list> \
--from-file=plain-interbroker.txt=<path-to-interbroker-credentials> \
-n central
kubectl --context <region2-context> create secret generic credential \
--from-file=plain.txt=<path-to-plain-credentials> \
--from-file=plain-users.json=<path-to-user-list> \
--from-file=plain-interbroker.txt=<path-to-interbroker-credentials> \
-n east
Deploy external-dns in both regions to synchronize LoadBalancer IPs with your DNS provider:
kubectl --context <region1-context> apply -f external-dns-region1.yaml
kubectl --context <region2-context> apply -f external-dns-region2.yaml
Deploy Confluent for Kubernetes in both regions:
helm --kube-context <region1-context> upgrade --install \
confluent-operator confluentinc/confluent-for-kubernetes \
--namespace central
helm --kube-context <region2-context> upgrade --install \
confluent-operator confluentinc/confluent-for-kubernetes \
--namespace east
Step 2: Create bootstrap ConfigMap and RBAC in region 1 only
The bootstrap ConfigMap and RBAC resources are only required in the bootstrap region (Region 1). Region 2 controllers join the existing quorum without needing their own bootstrap coordination.
Create the bootstrap ConfigMap in region 1:
apiVersion: v1
kind: ConfigMap
metadata:
name: kraftcontroller-dynamic-quorum
namespace: central
data:
bootstrap-status: '{"bootstrap_formatted": false}'
Apply the ConfigMap:
kubectl --context <region1-context> apply -f kraftcontroller-bootstrap-configmap.yaml
Create RBAC resources in region 1:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kraftcontroller-sa
namespace: central
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: kraftcontroller-bootstrap-role
namespace: central
rules:
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kraftcontroller-dynamic-quorum"]
verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kraftcontroller-bootstrap-rolebinding
namespace: central
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
name: kraftcontroller-sa
namespace: central
Apply the RBAC resources:
kubectl --context <region1-context> apply -f kraftcontroller-rbac.yaml
Step 3: Deploy KRaft controllers in region 1 (bootstrap region)
Deploy the bootstrap region controllers first. Region 1 is the bootstrap region where the initial quorum is created.
Create and configure a KRaftController CR with dynamic quorum and advertised listeners enabled:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: central
annotations:
platform.confluent.io/broker-id-offset: "100" # Region 1 controllers use IDs 100, 101, 102
spec:
replicas: 3
image:
application: confluentinc/cp-server:7.9.6 # Use 7.9.6+ or 8.1.2+; see version requirements
init: confluentinc/confluent-init-container:3.2.0
dataVolumeCapacity: 10Gi
dynamicQuorumConfig:
enabled: true
bootstrapPod: 0 # Only kraftcontroller-0 formats with --standalone mode
configOverrides:
server:
- "controller.quorum.voters=100@kraft-central0.yourdomain.com:9074,101@kraft-central1.yourdomain.com:9074,102@kraft-central2.yourdomain.com:9074" # Region 1 controller external addresses; required for cross-cluster communication
listeners:
controller:
advertisedListenersEnabled: true # Required for cross-region communication via external DNS
tls:
enabled: true
secretRef: tls-kraftcontroller
authentication:
type: plain
jaasConfig:
secretRef: credential
externalAccess:
type: loadBalancer
loadBalancer:
domain: yourdomain.com # Your DNS domain (for example, example.com)
prefix: kraft-central # Results in kraft-central0/1/2.yourdomain.com
podTemplate:
serviceAccountName: kraftcontroller-sa # Service account from Step 2
Apply the KRaftController CR in region 1:
kubectl --context <region1-context> apply -f kraftcontroller-region1.yaml
Wait for region 1 controllers to be ready:
kubectl --context <region1-context> wait \
--for=condition=platform.confluent.io/cluster-ready \
kraftcontroller/kraftcontroller -n central --timeout=10m
Verify that region 1 controllers are running:
kubectl --context <region1-context> get pods -n central -l app=kraftcontroller
Step 4: Retrieve cluster ID from region 1
The cluster ID must be shared with Region 2 so that Region 2 controllers join the same cluster instead of creating a separate cluster.
Retrieve the cluster ID from region 1:
kubectl --context <region1-context> get kraftcontroller kraftcontroller \
-n central -o jsonpath='{.status.clusterID}'
Note this cluster ID for use in the next step.
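For convenience, you can capture the value in a shell variable before editing the Region 2 manifest:
CLUSTER_ID=$(kubectl --context <region1-context> get kraftcontroller kraftcontroller \
  -n central -o jsonpath='{.status.clusterID}')
echo "Cluster ID: ${CLUSTER_ID}"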
Step 5: Deploy KRaft controllers in region 2 (observer region)
Deploy Region 2 controllers with the same cluster ID retrieved from Region 1. Region 2 controllers join the existing quorum as observers.
Create and configure a KRaftController CR for region 2:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: east
annotations:
platform.confluent.io/broker-id-offset: "200" # Region 2 controllers use IDs 200, 201, 202
spec:
replicas: 3
clusterID: <cluster-id-from-region1> # Must match the cluster ID from Region 1
image:
application: confluentinc/cp-server:7.9.6
init: confluentinc/confluent-init-container:3.2.0
dataVolumeCapacity: 10Gi
dynamicQuorumConfig:
enabled: true # No bootstrapPod; all Region 2 controllers join as observers
configOverrides:
server:
- "controller.quorum.voters=100@kraft-central0.yourdomain.com:9074,101@kraft-central1.yourdomain.com:9074,102@kraft-central2.yourdomain.com:9074" # Region 1 voter addresses; needed to join the existing quorum
listeners:
controller:
advertisedListenersEnabled: true
tls:
enabled: true
secretRef: tls-kraftcontroller
authentication:
type: plain
jaasConfig:
secretRef: credential
externalAccess:
type: loadBalancer
loadBalancer:
domain: yourdomain.com
prefix: kraft-east # Results in kraft-east0/1/2.yourdomain.com
Apply the KRaftController CR in region 2:
kubectl --context <region2-context> apply -f kraftcontroller-region2.yaml
Wait for Region 2 controllers to be ready:
kubectl --context <region2-context> wait \
--for=condition=platform.confluent.io/cluster-ready \
kraftcontroller/kraftcontroller -n east --timeout=10m
Step 6: Verify quorum formation across regions
Verify that all controllers have joined the quorum. Region 1 controller-0 should be the leader and only voter. All other controllers should be observers.
Check quorum status from region 1:
kubectl --context <region1-context> exec kraftcontroller-0 -n central -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
--command-config <admin-properties-file> \
describe --replication
Expected output:
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
100 <offset> 0 <timestamp> <timestamp> Leader
101 <offset> 0 <timestamp> <timestamp> Observer
102 <offset> 0 <timestamp> <timestamp> Observer
200 <offset> 0 <timestamp> <timestamp> Observer
201 <offset> 0 <timestamp> <timestamp> Observer
202 <offset> 0 <timestamp> <timestamp> Observer
This confirms:
Node 100 (kraftcontroller-0 in Region 1) is the leader and only voter.
Nodes 101-102 (Region 1) are observers.
Nodes 200-202 (Region 2) are observers.
All observers show Lag: 0, which means they are caught up and ready for promotion.
Note
Use a separate admin properties file (not the built-in kafka.properties) for the --command-config flag. The built-in kafka.properties contains listener-level security properties that admin CLI tools cannot use. Your admin properties file must include global client security properties: ssl.truststore.location, sasl.jaas.config, and security.protocol.
Step 7: Promote observers to voters
To create a full multi-region quorum, promote all observer controllers to voters. On Confluent Platform 8.2 or later with auto-join enabled, observers automatically promote themselves after they are caught up with the leader. You can skip manual promotion and proceed to verification.
For Confluent Platform 7.9.x or 8.x without auto-join, manually promote observers using the add-controller command.
Promote region 1 observers (controllers 1 and 2):
# Promote kraftcontroller-1 in Region 1
kubectl --context <region1-context> exec kraftcontroller-1 -n central -- \
kafka-metadata-quorum \
--bootstrap-controller kraft-central0.yourdomain.com:9074 \
--command-config <admin-properties-file> \
add-controller
# Promote kraftcontroller-2 in Region 1
kubectl --context <region1-context> exec kraftcontroller-2 -n central -- \
kafka-metadata-quorum \
--bootstrap-controller kraft-central0.yourdomain.com:9074 \
--command-config <admin-properties-file> \
add-controller
Promote region 2 observers (all three controllers):
# Promote kraftcontroller-0 in Region 2
kubectl --context <region2-context> exec kraftcontroller-0 -n east -- \
kafka-metadata-quorum \
--bootstrap-controller kraft-central0.yourdomain.com:9074 \
--command-config <admin-properties-file> \
add-controller
# Promote kraftcontroller-1 in Region 2
kubectl --context <region2-context> exec kraftcontroller-1 -n east -- \
kafka-metadata-quorum \
--bootstrap-controller kraft-central0.yourdomain.com:9074 \
--command-config <admin-properties-file> \
add-controller
# Promote kraftcontroller-2 in Region 2
kubectl --context <region2-context> exec kraftcontroller-2 -n east -- \
kafka-metadata-quorum \
--bootstrap-controller kraft-central0.yourdomain.com:9074 \
--command-config <admin-properties-file> \
add-controller
Important considerations when promoting observers:
Wait for each promotion to complete before promoting the next controller.
The add-controller command must be run from the controller being promoted because the properties file contains the node.id and log.dirs specific to that pod.
For secured clusters, the admin properties file must include global security properties (ssl.truststore.location, sasl.jaas.config, security.protocol), not listener-level properties from kafka.properties.
Step 8: Verify all controllers are voters
Verify that all controllers across both regions are now voters with zero lag:
kubectl --context <region1-context> exec kraftcontroller-0 -n central -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
--command-config <admin-properties-file> \
describe --replication
Expected output:
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
100 <offset> 0 <timestamp> <timestamp> Leader
101 <offset> 0 <timestamp> <timestamp> Follower
102 <offset> 0 <timestamp> <timestamp> Follower
200 <offset> 0 <timestamp> <timestamp> Follower
201 <offset> 0 <timestamp> <timestamp> Follower
202 <offset> 0 <timestamp> <timestamp> Follower
All controllers should show:
Lag that is not consistently rising, indicating controllers are actively replicating
Recent fetch and caught-up timestamps
Status as Leader or Follower, indicating all are voters
Step 9: Deploy Kafka brokers in both regions
After the KRaft controller quorum is healthy across both regions, deploy Kafka brokers in each region.
Deploy brokers in region 1:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: central
spec:
replicas: 2
dataVolumeCapacity: 100Gi
image:
application: confluentinc/cp-server:7.9.6
init: confluentinc/confluent-init-container:3.2.0
dependencies:
kRaftController:
clusterRef:
name: kraftcontroller
namespace: central
listeners:
external:
externalAccess:
type: loadBalancer
loadBalancer:
domain: yourdomain.com
prefix: kafka-central-ext
tls:
enabled: true
secretRef: tls-kafka
authentication:
type: plain
jaasConfig:
secretRef: credential
Apply the Kafka CR in region 1:
kubectl --context <region1-context> apply -f kafka-region1.yaml
Deploy brokers in region 2:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: east
spec:
replicas: 2
dataVolumeCapacity: 100Gi
image:
application: confluentinc/cp-server:7.9.6
init: confluentinc/confluent-init-container:3.2.0
dependencies:
kRaftController:
clusterRef:
name: kraftcontroller
namespace: east
listeners:
external:
externalAccess:
type: loadBalancer
loadBalancer:
domain: yourdomain.com
prefix: kafka-east-ext
tls:
enabled: true
secretRef: tls-kafka
authentication:
type: plain
jaasConfig:
secretRef: credential
Apply the Kafka CR in region 2:
kubectl --context <region2-context> apply -f kafka-region2.yaml
Wait for brokers to be ready in both regions:
kubectl --context <region1-context> wait \
--for=condition=platform.confluent.io/cluster-ready \
kafka/kafka -n central --timeout=10m
kubectl --context <region2-context> wait \
--for=condition=platform.confluent.io/cluster-ready \
kafka/kafka -n east --timeout=10m
Verify brokers are running:
kubectl --context <region1-context> get pods -n central -l app=kafka
kubectl --context <region2-context> get pods -n east -l app=kafka
Your multi-region KRaft cluster with dynamic quorum is now deployed and operational.
For migration steps, see ZooKeeper to KRaft Migration.
Migrate from ZooKeeper to KRaft with dynamic quorum
This section describes how to migrate an existing ZooKeeper-based Kafka cluster to KRaft with dynamic quorum enabled. This migration path is for clusters running on Confluent Platform 7.9.x that want to adopt KRaft with the flexibility of dynamic quorum membership. This migration is only supported on Confluent Platform 7.9.x, as ZooKeeper is removed in Confluent Platform 8.0 and later.
Tip
For complete end-to-end examples, see Dynamic KRaft Quorum: ZooKeeper to KRaft Migration on GitHub. The repository includes quick start and secured (MRC) examples with complete YAML files and commands.
Prerequisites for ZooKeeper to KRaft migration
Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, ZooKeeper to KRaft migration with dynamic quorum requires:
Existing ZooKeeper-based Kafka cluster running on Confluent Platform 7.9.6 or later.
CFK 3.2 or later with KRaftMigrationJob support.
Confluent Platform 7.9.6 or later Docker images (earlier 7.9.x versions have critical bugs).
The Kafka CR must have the IBP 3.9 annotation before starting migration.
Bootstrap ConfigMap and RBAC resources for dynamic quorum coordination.
Namespace with sufficient resources for both ZooKeeper and KRaft controllers during dual-write phase.
Important
Critical version requirement: Confluent Platform 7.9.6 or later is required for ZooKeeper to KRaft migration with dynamic quorum. Earlier versions (7.9.0 through 7.9.5) have known issues.
IBP version annotation requirement
The Kafka CR must be annotated with platform.confluent.io/kraft-migration-ibp-version: "3.9" before starting the migration. The default IBP version 3.6 is incompatible with kraft.version=1 (dynamic quorum), and without this annotation:
The kraft.version cannot be finalized to 1.
Direct-to-controller APIs are blocked.
Observer promotion fails.
Add the annotation to your Kafka CR before deploying the KRaftMigrationJob:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: confluent
annotations:
platform.confluent.io/kraft-migration-ibp-version: "3.9" # Required for kraft.version=1
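If the Kafka CR is already deployed, you can also apply the annotation in place; this mirrors the troubleshooting command shown later in this topic:
kubectl annotate kafka kafka -n confluent \
  platform.confluent.io/kraft-migration-ibp-version="3.9" --overwrite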
Step 1: Deploy bootstrap ConfigMap and RBAC resources
Create the bootstrap ConfigMap to coordinate which controller pod becomes the bootstrap controller:
apiVersion: v1
kind: ConfigMap
metadata:
name: kraftcontroller-dynamic-quorum
namespace: confluent
data:
bootstrap-status: '{"bootstrap_formatted": false}'
Apply the ConfigMap:
kubectl apply -f kraftcontroller-bootstrap-configmap.yaml
Create RBAC resources to allow the bootstrap pod to update the ConfigMap:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kraftcontroller-sa
namespace: confluent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: kraftcontroller-bootstrap-role
namespace: confluent
rules:
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kraftcontroller-dynamic-quorum"]
verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: kraftcontroller-bootstrap-rolebinding
namespace: confluent
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
name: kraftcontroller-sa
namespace: confluent
Apply the RBAC resources:
kubectl apply -f kraftcontroller-rbac.yaml
Step 2: Deploy KRaftController
Deploy the KRaftController CR with dynamic quorum enabled. The controllers remain on hold until the KRaftMigrationJob starts the migration process.
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: confluent
annotations:
platform.confluent.io/broker-id-offset: "100" # Controllers use IDs 100, 101, 102
platform.confluent.io/kraft-migration-hold-krc-creation: "true" # Delays pod creation until the migration job starts
platform.confluent.io/use-log4j1: "true" # Required for CFK 3.0 or later with Confluent Platform 7.9.x
spec:
replicas: 3
image:
application: confluentinc/cp-server:7.9.6 # 7.9.6+ required for ZK-to-KRaft with dynamic quorum
init: confluentinc/confluent-init-container:3.2.0
dataVolumeCapacity: 10Gi
dynamicQuorumConfig:
enabled: true # Uses controller.quorum.bootstrap.servers instead of .voters
bootstrapPod: 0 # kraftcontroller-0 formats with --standalone; others join as observers
podTemplate:
serviceAccountName: kraftcontroller-sa # Service account from Step 1
Apply the KRaftController CR:
kubectl apply -f kraftcontroller.yaml
The controllers start but remain in a holding state until the migration job begins.
Step 3: Start the migration
Create and apply the KRaftMigrationJob to begin the ZooKeeper to KRaft migration:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
name: kraftmigrationjob
namespace: confluent
spec:
kafkaClusterRef:
name: kafka
namespace: confluent
kRaftControllerClusterRef:
name: kraftcontroller
namespace: confluent
Apply the KRaftMigrationJob:
kubectl apply -f kraftmigrationjob.yaml
Step 4: Monitor migration progress
Monitor the KRaftMigrationJob status until it reaches the DUAL_WRITE phase:
kubectl get kraftmigrationjob kraftmigrationjob -n confluent -w
The migration progresses through several phases:
SETUP: Preparing the migration
MIGRATE: Starting the migration process
DUAL_WRITE: Both ZooKeeper and KRaft controllers are active and receiving metadata updates
Wait for the migration to reach the DUAL_WRITE phase before proceeding to the next step.
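To check the current phase without watching the full resource, you can use a jsonpath query such as the following; the status fields match the ones used during finalization in Step 7:
kubectl get kraftmigrationjob kraftmigrationjob -n confluent \
  -o jsonpath='{.status.phase}{"\n"}{.status.subPhase}{"\n"}'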
Step 5: Verify kraft.version=1 in DUAL_WRITE phase
After the migration reaches DUAL_WRITE, verify that kraft.version is finalized to 1 (dynamic quorum mode):
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version
Expected output:
Feature: kraft.version SupportedMinVersion: 0 SupportedMaxVersion: 1 FinalizedVersionLevel: 1 Epoch: <epoch>
The FinalizedVersionLevel: 1 confirms that dynamic quorum is active.
Check the initial quorum state:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --status
Expected output shows one voter and multiple observers:
ClusterId: <cluster-id>
LeaderId: 100
LeaderEpoch: <epoch>
HighWatermark: <offset>
MaxFollowerLag: <lag>
MaxFollowerLagTimeMs: <lag-ms>
CurrentVoters: [100]
CurrentObservers: [101,102]
Check replication status to verify all controllers have low lag:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
Expected output:
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
100 <offset> 0 <timestamp> <timestamp> Leader
101 <offset> 0 <timestamp> <timestamp> Observer
102 <offset> 0 <timestamp> <timestamp> Observer
All observers should show low lag that is not rising, indicating they are ready for promotion.
Step 6: Promote observers to voters
Since ZooKeeper to KRaft migration requires Confluent Platform 7.9.x (ZooKeeper is removed in 8.0+), the auto-join feature is not available. You must manually promote each observer to a voter.
Important
Observer promotion must be completed during the DUAL_WRITE phase, before finalizing the migration. Do not proceed to finalization until all controllers are voters.
Run the add-controller command from each observer pod, pointing to an existing voter:
# Promote kraftcontroller-1
kubectl exec kraftcontroller-1 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller \
kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
--command-config <admin-client-properties-file> \
add-controller
# Promote kraftcontroller-2
kubectl exec kraftcontroller-2 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller \
kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
--command-config <admin-client-properties-file> \
add-controller
Important considerations:
The add-controller command must be run from the controller being promoted.
The admin-client properties file must include global SSL/SASL security settings and advertised.listeners for the controller pod being promoted.
Wait for each promotion to complete before promoting the next controller.
Verify all three controllers are now voters:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --status
Expected output:
CurrentVoters: [100,101,102]
CurrentObservers: []
Verify replication status:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
All controllers should show status as Leader or Follower (not Observer) with zero lag.
Step 7: Finalize the migration
Once all controllers are voters and the quorum is healthy, finalize the migration to complete the transition to pure KRaft mode:
kubectl annotate kraftmigrationjob kraftmigrationjob -n confluent \
platform.confluent.io/kraft-migration-trigger-finalize-to-kraft='true'
Monitor the finalization process:
kubectl get kraftmigrationjob kraftmigrationjob -n confluent \
-o jsonpath='{.status.phase}{"\n"}{.status.subPhase}{"\n"}'
Wait for the migration phase to show COMPLETE.
Step 8: Switch Kafka dependency to KRaftController
After the migration is finalized, update the Kafka CR to reference the KRaftController instead of ZooKeeper:
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
name: kafka
namespace: confluent
annotations:
platform.confluent.io/kraft-migration-ibp-version: "3.9"
spec:
replicas: 3
dataVolumeCapacity: 100Gi
image:
application: confluentinc/cp-server:7.9.6
init: confluentinc/confluent-init-container:3.2.0
dependencies:
kRaftController:
clusterRef:
name: kraftcontroller
namespace: confluent
Apply the updated Kafka CR:
kubectl apply -f kafka-kraft-dependency.yaml
Wait for the Kafka cluster to be ready:
kubectl wait --for=condition=platform.confluent.io/cluster-ready \
kafka/kafka -n confluent --timeout=10m
Step 9: Decommission ZooKeeper
After verifying that the KRaft quorum is healthy and the Kafka cluster is operational, decommission the ZooKeeper cluster:
kubectl delete zookeeper zookeeper -n confluent
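To confirm that the ZooKeeper pods are gone, you can list pods by label; this assumes the default app=zookeeper label that CFK applies to ZooKeeper pods:
kubectl get pods -n confluent -l app=zookeeper
# Expected: No resources found in confluent namespace.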
Your cluster is now running on pure KRaft with a three-voter dynamic quorum.
Verify the migration
Perform the following verification steps to ensure the migration completed successfully:
Verify all controllers are voters:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --status
Expected: CurrentVoters: [100,101,102] and CurrentObservers: []
Verify replication status:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
All controllers should show status as Leader or Follower with low lag.
Migrate from static to dynamic KRaft quorum
This section describes how to migrate an existing KRaft cluster from static quorum (kraft.version=0) to dynamic quorum (kraft.version=1). This migration path is for clusters already running on KRaft with static quorum that want to adopt the flexibility of dynamic quorum membership for online controller addition and removal. Unlike greenfield deployments, this migration requires no bootstrap ConfigMap or RBAC resources since the cluster is already formatted.
Tip
For complete end-to-end examples, see Dynamic KRaft Quorum: Static to Dynamic Migration on GitHub. The repository includes quick start and secured MRC examples with complete YAML files and commands.
Prerequisites for static to dynamic migration
Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, static to dynamic KRaft migration requires:
Existing KRaft cluster running with static quorum (kraft.version=0)
Confluent Platform 8.0 or later (the kraft.version 0→1 upgrade is not supported on Confluent Platform 7.9.x)
CFK 3.2 or later with dynamicQuorumConfig support
All controllers and brokers healthy and running
kubectl access to the cluster
Important
Critical version requirement: Confluent Platform 8.0 or later is required for static to dynamic KRaft migration. The kraft.version upgrade from 0 to 1 is not supported on Confluent Platform 7.9.x.
Pre-migration: Verify starting state
Before starting the migration, verify that the cluster is running with static quorum (kraft.version=0):
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version
Expected output:
Feature: kraft.version SupportedMinVersion: 0 SupportedMaxVersion: 1 FinalizedVersionLevel: 0 Epoch: <epoch>
The FinalizedVersionLevel: 0 confirms the cluster is using static quorum.
Verify quorum health before proceeding:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
All controllers should show as voters (Leader or Follower status) with low lag.
Phase 1: Add advertised listeners
Add advertisedListenersEnabled: true to the KRaftController CR before upgrading kraft.version:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: confluent
spec:
replicas: 3
listeners:
controller:
advertisedListenersEnabled: true # Required for cross-cluster communication via external DNS
externalAccess:
type: loadBalancer
loadBalancer:
domain: yourdomain.com
podTemplate:
annotations:
kafkacluster-manual-roll: "1" # Required to trigger a rolling restart (config change does not auto-roll)
Apply the updated KRaftController CR:
kubectl apply -f kraftcontroller-advertised-listeners.yaml
Wait for the rolling restart to complete:
kubectl wait --for=condition=platform.confluent.io/cluster-ready \
kraftcontroller/kraftcontroller -n confluent --timeout=10m
After this phase completes, cross-cluster admin commands (kafka-metadata-quorum, kafka-features) work from both regions.
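To confirm the new listener configuration was rendered, you can inspect the generated properties file on a controller pod; the path below matches the one used in the troubleshooting section of this topic:
kubectl exec kraftcontroller-0 -n confluent -- \
  grep 'advertised.listeners' /opt/confluentinc/etc/kafka/kafka.properties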
Phase 2: Upgrade kraft.version
Upgrade kraft.version from 0 to 1 using the kafka-features CLI. This is a metadata-level operation that requires no YAML changes and no pod restarts.
Run the upgrade command from any controller pod:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-features --bootstrap-controller localhost:9074 \
upgrade --feature kraft.version=1
Verify the upgrade:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version
Expected output:
Feature: kraft.version SupportedMinVersion: 0 SupportedMaxVersion: 1 FinalizedVersionLevel: 1 Epoch: <epoch>
The FinalizedVersionLevel: 1 confirms that dynamic quorum is now active at the metadata level.
Verify that directory IDs changed from placeholder values to unique UUIDs:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
Each controller's DirectoryId should now show a unique UUID instead of the placeholder AAAAAAAAAAAAAAAAAAAAAA used by static quorum. This is expected behavior: static quorum does not use directory IDs, and the upgrade to dynamic quorum assigns real directory IDs.
Important
After upgrading kraft.version to 1, proceed promptly to Phase 3. A cluster at kraft.version=1 with controller.quorum.voters still in the configuration is in a transitional state. Although this state is functional, switch the controllers to controller.quorum.bootstrap.servers to fully benefit from dynamic quorum capabilities.
Phase 3: Switch to dynamicQuorumConfig
Enable dynamicQuorumConfig in the KRaftController CR. CFK generates controller.quorum.bootstrap.servers and removes controller.quorum.voters from the controller configuration. This triggers a rolling restart of the KRaft controllers.
Update the KRaftController CR:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: confluent
spec:
replicas: 3
dynamicQuorumConfig:
enabled: true # Generates bootstrap.servers instead of voters; no bootstrapPod needed
Apply the updated KRaftController CR:
kubectl apply -f kraftcontroller-dynamic-quorum.yaml
Wait for the rolling restart to complete:
kubectl wait --for=condition=platform.confluent.io/cluster-ready \
kraftcontroller/kraftcontroller -n confluent --timeout=10m
Verify the quorum is healthy after the rolling restart:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --status
All controllers should still be voters with the leader elected.
Verify replication status:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
All controllers should show status as Leader or Follower with zero lag.
Phase 4: Roll Kafka brokers
Force a rolling restart of Kafka brokers so they pick up the new controller.quorum.bootstrap.servers configuration from the updated KRaft controllers.
Patch the Kafka CR to add a pod template annotation:
kubectl patch kafka kafka -n confluent --type merge \
-p '{"spec":{"podTemplate":{"annotations":{"kafkacluster-manual-roll":"phase4"}}}}'
Wait for the rolling restart to complete:
kubectl wait --for=condition=platform.confluent.io/cluster-ready \
kafka/kafka -n confluent --timeout=10m
After this phase, both KRaft controllers and Kafka brokers are using controller.quorum.bootstrap.servers for quorum discovery.
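To spot-check that a broker picked up the new discovery configuration after the roll, you can grep its generated properties file; the path follows the convention used elsewhere in this topic, and the exact rendering may vary by version:
kubectl exec kafka-0 -n confluent -- \
  grep 'controller.quorum' /opt/confluentinc/etc/kafka/kafka.properties
# Expect controller.quorum.bootstrap.servers to be present and controller.quorum.voters to be absent.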
Verify the migration
Perform the following verification steps to ensure the migration completed successfully:
Verify quorum status:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --status
All controllers should be listed as CurrentVoters.
Troubleshoot
This section describes common issues when deploying and operating dynamic KRaft quorum.
Issue: Bootstrap pod stuck in init container
Symptom: kraftcontroller-0 is in Init:0/1 state indefinitely. The pod does not progress to the running state.
Cause: The bootstrap controller cannot update the ConfigMap due to insufficient RBAC permissions.
Solution: Verify RBAC permissions are correctly configured:
# Check if the service account has permission to update the ConfigMap
kubectl auth can-i update configmaps \
--as=system:serviceaccount:confluent:kraftcontroller-sa \
-n confluent
# Should output: yes
# Verify Role and RoleBinding exist
kubectl get role kraftcontroller-bootstrap-role -n confluent
kubectl get rolebinding kraftcontroller-bootstrap-rolebinding -n confluent
If permissions are missing, re-apply the RBAC resources from Step 2. For instructions, see Deploy a single-region cluster with dynamic quorum.
Check the init container logs for more details:
kubectl logs kraftcontroller-0 -n confluent -c config-init-container
Issue: Observers fail to auto-promote (Confluent Platform 8.2 or later)
Symptom: Controllers remain as observers indefinitely, even though auto-join should be enabled in Confluent Platform 8.2 or later.
Cause: The auto-join feature is not enabled. Check the properties file to confirm that controller.quorum.auto.join.enable is defined and set to true. If it is not, add it using configOverrides, as shown in the example after the note below.
Solution: Check the replication status to see if observers are caught up:
kubectl exec kraftcontroller-0 -n confluent -- \
kafka-metadata-quorum --bootstrap-controller localhost:9074 \
describe --replication
Check the Lag column for observers:
If Lag > 1000, the lag is too high. Investigate replication issues or wait for the lag to decrease before promoting.
If Lag < 1000 and is not increasing, replication is healthy. If observers still do not promote, auto-join might not be enabled. Manually promote observers using the add-controller command. For instructions, see Deploy a single-region cluster with dynamic quorum.
Note
CFK does not currently expose a configuration toggle to disable/enable auto-join. CFK attempts to detect the Confluent Platform version from the image to automatically enable auto-join for 8.2 or later. Version detection can fail if you use custom Docker images or images without standard version tags.
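If you need to enable auto-join explicitly, a configOverrides entry on the KRaftController CR along the following lines should work; this is a sketch, and you should confirm the property name and behavior against your Confluent Platform version:
spec:
  configOverrides:
    server:
      - controller.quorum.auto.join.enable=true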
Issue: Observer promotion fails during migration
Symptom: The add-controller command fails with an error message about version compatibility or feature flags.
Cause: The inter-broker protocol (IBP) version is less than 3.9. Dynamic quorum (kraft.version=1) requires IBP version 3.9 or later.
Solution: Verify the IBP version is set correctly by checking the properties file:
kubectl exec kraftcontroller-0 -n confluent -- \
grep 'inter.broker.protocol.version' /opt/confluentinc/etc/kafka/kafka.properties
Expected output (3.9 or later):
inter.broker.protocol.version=3.9
If the IBP version is missing or less than 3.9, add or update the annotation:
kubectl annotate kafka kafka \
platform.confluent.io/kraft-migration-ibp-version=3.9 \
--overwrite \
-n confluent
After updating the annotation, restart the migration or retry observer promotion.
Issue: Advertised listeners with a LoadBalancer in a single cluster
Symptom: Controllers cannot connect to each other when advertisedListenersEnabled: true is set with a LoadBalancer within a single cluster or namespace.
Cause: The networking layer works correctly, but KRaft internal logic fails when using advertised external addresses for same-cluster communication. For details, see ControllerRegistrationManager registration issue.
Workaround:
Set advertisedListenersEnabled: false on KRaft controllers for single-region deployments.
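For example, on the KRaftController CR (only the relevant fields are shown):
spec:
  listeners:
    controller:
      advertisedListenersEnabled: false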