Configure Dynamic KRaft Quorum for Confluent Platform Using Confluent for Kubernetes

Use dynamic KRaft quorum to add and remove controller nodes from the metadata quorum without recreating the entire cluster. This feature is essential for recovering from failures and supporting both single-region and multi-region deployments.

Dynamic quorum is available in Confluent Platform 7.9 and later, and provides flexible quorum management for production environments. CFK supports both greenfield deployments and migrations from existing ZooKeeper or static KRaft clusters.

Supported deployment scenarios

CFK 3.2 and later supports the following deployment scenarios for dynamic KRaft quorum:

  • Greenfield deployments: Deploy new single-region or multi-region clusters with dynamic quorum from initial setup.

  • ZooKeeper to KRaft migration: Migrate existing ZooKeeper-based clusters to KRaft with dynamic quorum. This requires Confluent Platform 7.9.6 or later.

  • Static to dynamic migration: Upgrade existing static KRaft clusters to dynamic quorum. This requires Confluent Platform 8.0 or later.

Multi-region greenfield deployments require Confluent Platform 7.9.6 or later in the 7.9.x series, or 8.1.2 or later in the 8.1.x series. This is due to a known issue (KMETA-2870) in earlier versions where controller registration can time out when advertised listeners are present from initial startup. Static to dynamic migration is not affected by this issue. For details, see Deploy a multi-region cluster with dynamic quorum.

Prerequisites and requirements

Review the following prerequisites and requirements before configuring dynamic KRaft quorum.

Version requirements

  • KRaft (Static Quorum): Confluent Platform 7.4 or later. This is the original KRaft implementation, and it continues to be supported.

  • KRaft (Dynamic Quorum): Confluent Platform 7.9 or later. Required for dynamic quorum features. Confluent Platform 7.9.6 or later is strongly recommended, because earlier 7.9.x patch versions (7.9.0 through 7.9.5) have known issues.

  • Auto-Join Quorum: Confluent Platform 8.2 or later. Recommended if you want observers promoted to voters automatically instead of running add-controller manually.

  • ZooKeeper to KRaft Migration (KRaft Migration Job): Confluent Platform 7.9.6 or later. Required for ZooKeeper to KRaft migration with dynamic quorum. ZooKeeper is removed in Confluent Platform 8.0 and later.

  • Static to Dynamic KRaft Migration: Confluent Platform 8.0 or later. Required for upgrading from static quorum (kraft.version=0) to dynamic quorum (kraft.version=1).

Infrastructure requirements

  • CFK 3.2 or later installed.

  • Confluent Platform 7.9.6 or later Docker images required.

Deploy a single-region cluster with dynamic quorum

This section describes how to deploy a new KRaft cluster in a single region with dynamic quorum enabled. This is the recommended approach for greenfield deployments.

Tip

For complete end-to-end examples, see Dynamic KRaft Quorum: Single-Region on GitHub. The repository includes plaintext and secured examples with complete YAML files and commands.

Prerequisites for single-region deployment

Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, single-region deployments require:

  • Single Kubernetes cluster with kubectl access.

  • CFK 3.2 or later installed and running in the cluster.

  • Confluent Platform 7.9.6 or later Docker images (confluentinc/cp-server:7.9.6 and confluentinc/confluent-init-container:3.2.0).

  • Namespace created for the deployment.

  • Sufficient cluster resources, such as CPU, memory, and storage, for the planned number of controllers and brokers.

Step 1: Create bootstrap coordination ConfigMap

Create a ConfigMap to coordinate which pod becomes the bootstrap controller for dynamic quorum. This ConfigMap ensures that only one controller formats storage with --standalone mode.

Create the following ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kraftcontroller-dynamic-quorum
  namespace: confluent
data:
  bootstrap-status: '{"bootstrap_formatted": false}'

Apply the ConfigMap:

kubectl apply -f kraftcontroller-bootstrap-configmap.yaml

The ConfigMap is required only on the cluster that hosts the bootstrap pod.

Important

  • After the cluster is running and bootstrapped, do not update this ConfigMap to mark bootstrap_formatted as false.

  • The absence of this ConfigMap is treated the same as bootstrap_formatted: true.

Step 2: Create RBAC resources

The bootstrap controller needs permissions to update the ConfigMap to signal that bootstrap formatting is complete.

Create the following RBAC resources:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kraftcontroller-sa
  namespace: confluent

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kraftcontroller-bootstrap-role
  namespace: confluent
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kraftcontroller-dynamic-quorum"]  # Must match ConfigMap name from Step 1
  verbs: ["get", "update", "patch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kraftcontroller-bootstrap-rolebinding
  namespace: confluent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
  name: kraftcontroller-sa
  namespace: confluent

Apply the RBAC resources:

kubectl apply -f kraftcontroller-rbac.yaml

Step 3: Deploy KRaft controllers

Create and configure a KRaftController CR with dynamic quorum enabled.

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: confluent
  annotations:
    platform.confluent.io/broker-id-offset: "100"  # Avoids ID conflicts with broker node IDs
spec:
  replicas: 3
  dataVolumeCapacity: 10Gi

  image:
    application: confluentinc/cp-server:7.9.6  # Use 7.9.6+; earlier 7.9.x versions have known issues
    init: confluentinc/confluent-init-container:3.2.0

  dynamicQuorumConfig:
    enabled: true   # Uses controller.quorum.bootstrap.servers instead of .voters
    bootstrapPod: 0  # Only this pod formats with --standalone mode; others join as observers

  podTemplate:
    serviceAccountName: kraftcontroller-sa  # Service account from Step 2

Apply the KRaftController CR:

kubectl apply -f kraftcontroller.yaml

Note

The bootstrap controller (kraftcontroller-0) formats its storage and creates a single-voter quorum. Additional controllers (kraftcontroller-1 and kraftcontroller-2) join as observers and wait to be promoted.

Step 4: Verify quorum formation

After the controllers are running, verify that the bootstrap controller is the only voter and other controllers are observers.

Check that all pods are running:

kubectl get pods -n confluent -l app=kraftcontroller

Check the quorum status:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

Expected output:

NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     <offset>      0     <timestamp>         <timestamp>            Leader
101     <offset>      0     <timestamp>         <timestamp>            Observer
102     <offset>      0     <timestamp>         <timestamp>            Observer

This confirms:

  • Node 100 (kraftcontroller-0) is the leader and only voter.

  • Nodes 101 and 102 (kraftcontroller-1 and kraftcontroller-2) are observers.

  • All observers show Lag: 0, which means they are caught up and ready for promotion. If observers show significant lag, wait for them to catch up before promoting in the next step.
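If you script this verification, the Lag column is straightforward to check with awk. The following sketch uses inline sample output in the format shown above; in a live cluster you would pipe the real kafka-metadata-quorum output into awk instead of the sample variable.

```shell
# Flag any replica whose Lag column (third field) is nonzero.
# The sample mimics the 'describe --replication' output format shown above.
sample='NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     4096          0     1700000000000       1700000000000          Leader
101     4096          0     1700000000000       1700000000000          Observer
102     4096          7     1700000000000       1700000000000          Observer'

# Skip the header row; print only replicas that are still catching up.
echo "$sample" | awk 'NR > 1 && $3 != 0 { print "node " $1 " is lagging by " $3 }'
```

An empty result means every replica is caught up and safe to promote.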

Step 5: Promote observers to voters

To create a full three-controller quorum, promote the observers to voters. On Confluent Platform 8.2 or later with auto-join enabled, observers automatically promote themselves after they are caught up with the leader. You can skip manual promotion and proceed to verification.

For Confluent Platform 7.9.x or 8.x without auto-join, manually promote observers:

# Promote kraftcontroller-1
kubectl exec kraftcontroller-1 -n confluent -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
  --command-config <admin-client-properties-file> \
  add-controller

# Promote kraftcontroller-2
kubectl exec kraftcontroller-2 -n confluent -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
  --command-config <admin-client-properties-file> \
  add-controller

You must ensure the following when promoting observers:

  • Wait for each promotion to complete before promoting the next controller.

  • The add-controller command must be run from the controller being promoted. This is required because the admin-client properties file contains advertised.listeners, which is specific to each controller pod. Running this command from a different pod results in incorrect listener information being registered.

  • The admin-client properties file must include global SSL/SASL security settings and advertised.listeners for the controller pod being promoted.
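As an illustration, a minimal admin-client properties file for a plaintext single-region deployment might look like the following. The file contents are a hedged sketch, not CFK-generated output: the listener name and values are placeholders, and the advertised.listeners entry must match the specific pod being promoted (kraftcontroller-1 in this example).

```properties
# Hypothetical admin-client properties for the add-controller command.
# Adjust security settings to match your controller listener configuration.
security.protocol=PLAINTEXT
# Must advertise the address of the pod being promoted, not the bootstrap pod.
advertised.listeners=CONTROLLER://kraftcontroller-1.kraftcontroller.confluent.svc.cluster.local:9074
```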

Step 6: Verify all controllers are voters

Verify that all controllers are now voters with zero lag:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

Expected output:

NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     <offset>      0     <timestamp>         <timestamp>            Leader
101     <offset>      0     <timestamp>         <timestamp>            Follower
102     <offset>      0     <timestamp>         <timestamp>            Follower

All controllers should show:

  • Lag of 0, which means they are fully caught up.

  • Recent fetch and caught-up timestamps.

  • Status as Leader or Follower, indicating all are voters.

Step 7: Deploy Kafka brokers

After the KRaft controller quorum is healthy, deploy Kafka brokers.

For KRaft-enabled Kafka with dynamic quorum, add a kRaftController cluster reference in the dependencies section:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
spec:
  replicas: 3
  dataVolumeCapacity: 100Gi

  image:
    application: confluentinc/cp-server:7.9.6
    init: confluentinc/confluent-init-container:3.2.0

  dependencies:
    kRaftController:
      clusterRef:
        name: kraftcontroller
        namespace: confluent

When dynamicQuorumConfig.enabled: true is set on the controllers, CFK automatically configures brokers to use controller.quorum.bootstrap.servers to discover the current quorum membership dynamically.
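To illustrate the difference, the following sketch shows roughly what this setting changes in the generated broker configuration. The exact property rendering by CFK may differ; controller.quorum.bootstrap.servers and controller.quorum.voters are the underlying Apache Kafka settings for dynamic and static quorum respectively.

```properties
# Dynamic quorum: brokers discover the current voter set at runtime from any
# listed controller, so membership can change without broker reconfiguration.
controller.quorum.bootstrap.servers=kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074,kraftcontroller-1.kraftcontroller.confluent.svc.cluster.local:9074,kraftcontroller-2.kraftcontroller.confluent.svc.cluster.local:9074

# Static quorum (for comparison): membership is fixed as id@host:port entries
# at deploy time and cannot change without restarting the cluster.
# controller.quorum.voters=100@kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074,101@kraftcontroller-1.kraftcontroller.confluent.svc.cluster.local:9074,102@kraftcontroller-2.kraftcontroller.confluent.svc.cluster.local:9074
```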

Apply the Kafka CR:

kubectl apply -f kafka.yaml

Verify the brokers are running:

kubectl get pods -n confluent -l app=kafka

Your single-region KRaft cluster with dynamic quorum is now deployed and operational.

Deploy a multi-region cluster with dynamic quorum

This section describes how to deploy a new KRaft cluster across multiple regions with dynamic quorum enabled. This is the recommended approach for multi-region (MRC) greenfield deployments requiring high availability across geographically distributed Kubernetes clusters.

Tip

For complete end-to-end examples, see Dynamic KRaft Quorum: Multi-Region Greenfield on GitHub. The repository includes secured examples with TLS, SASL/PLAIN, OAuth, and RBAC configuration.

Multi-region deployments require external access (LoadBalancer) for cross-cluster communication and advertised listeners so that controllers and brokers in different regions can communicate.

Prerequisites for multi-region deployment

In addition to the prerequisites listed in Prerequisites and requirements, multi-region deployments require:

  • LoadBalancer support in each Kubernetes cluster

  • DNS provider integration, for example, external-dns with GCP Cloud DNS, AWS Route 53, or Azure DNS

  • Confluent Platform 7.9.6 or later in the 7.9.x series, or 8.1.2 or later in the 8.1.x series. Other versions, including the 8.0.x and 8.2.x releases, have known issues with multi-region greenfield deployments.

  • Shared cluster ID across all regions

Step 1: Deploy infrastructure in both regions

Deploy the necessary infrastructure components in both Kubernetes clusters before deploying KRaft controllers.

Create namespaces in both regions:

kubectl --context <region1-context> create namespace central
kubectl --context <region2-context> create namespace east

Generate TLS certificates with wildcard SANs covering both regions. The certificates must include Subject Alternative Names (SANs) for:

  • Wildcard DNS for external LoadBalancer endpoints (*.yourdomain.com)

  • Internal Kubernetes service DNS for both namespaces

  • Headless service DNS for StatefulSet pods
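For a quick test environment, you can sketch such a certificate with openssl (OpenSSL 1.1.1 or later for -addext). The domain and namespace names below are placeholders following the examples in this section; production clusters should use certificates issued by your CA, and the SAN list must cover every name in the bullets above.

```shell
# Generate a self-signed certificate whose SANs cover the external wildcard
# domain and the headless service DNS for controller pods in both namespaces.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout privkey.pem -out fullchain.pem \
  -subj "/CN=kraftcontroller" \
  -addext "subjectAltName=DNS:*.yourdomain.com,DNS:*.kraftcontroller.central.svc.cluster.local,DNS:*.kraftcontroller.east.svc.cluster.local"

# Confirm the SANs were embedded before creating the Kubernetes secrets.
openssl x509 -in fullchain.pem -noout -ext subjectAltName
```

Note that a DNS wildcard matches only a single label, so each multi-label pod name pattern needs its own SAN entry.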

Create TLS secrets in both regions:

kubectl --context <region1-context> create secret generic tls-kraftcontroller \
  --from-file=fullchain.pem=<path-to-fullchain> \
  --from-file=privkey.pem=<path-to-private-key> \
  --from-file=cacerts.pem=<path-to-ca-cert> \
  -n central

kubectl --context <region2-context> create secret generic tls-kraftcontroller \
  --from-file=fullchain.pem=<path-to-fullchain> \
  --from-file=privkey.pem=<path-to-private-key> \
  --from-file=cacerts.pem=<path-to-ca-cert> \
  -n east

Create SASL/PLAIN credential secrets in both regions:

kubectl --context <region1-context> create secret generic credential \
  --from-file=plain.txt=<path-to-plain-credentials> \
  --from-file=plain-users.json=<path-to-user-list> \
  --from-file=plain-interbroker.txt=<path-to-interbroker-credentials> \
  -n central

kubectl --context <region2-context> create secret generic credential \
  --from-file=plain.txt=<path-to-plain-credentials> \
  --from-file=plain-users.json=<path-to-user-list> \
  --from-file=plain-interbroker.txt=<path-to-interbroker-credentials> \
  -n east

Deploy external-dns in both regions to synchronize LoadBalancer IPs with your DNS provider:

kubectl --context <region1-context> apply -f external-dns-region1.yaml
kubectl --context <region2-context> apply -f external-dns-region2.yaml

Deploy Confluent for Kubernetes in both regions:

helm --kube-context <region1-context> upgrade --install \
  confluent-operator confluentinc/confluent-for-kubernetes \
  --namespace central

helm --kube-context <region2-context> upgrade --install \
  confluent-operator confluentinc/confluent-for-kubernetes \
  --namespace east

Step 2: Create bootstrap ConfigMap and RBAC in region 1 only

The bootstrap ConfigMap and RBAC resources are only required in the bootstrap region (Region 1). Region 2 controllers join the existing quorum without needing their own bootstrap coordination.

Create the bootstrap ConfigMap in region 1:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kraftcontroller-dynamic-quorum
  namespace: central
data:
  bootstrap-status: '{"bootstrap_formatted": false}'

Apply the ConfigMap:

kubectl --context <region1-context> apply -f kraftcontroller-bootstrap-configmap.yaml

Create RBAC resources in region 1:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kraftcontroller-sa
  namespace: central

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kraftcontroller-bootstrap-role
  namespace: central
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kraftcontroller-dynamic-quorum"]
  verbs: ["get", "update", "patch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kraftcontroller-bootstrap-rolebinding
  namespace: central
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
  name: kraftcontroller-sa
  namespace: central

Apply the RBAC resources:

kubectl --context <region1-context> apply -f kraftcontroller-rbac.yaml

Step 3: Deploy KRaft controllers in region 1 (bootstrap region)

Deploy the bootstrap region controllers first. Region 1 is the bootstrap region where the initial quorum is created.

Create and configure a KRaftController CR with dynamic quorum and advertised listeners enabled:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: central
  annotations:
    platform.confluent.io/broker-id-offset: "100"  # Region 1 controllers use IDs 100, 101, 102
spec:
  replicas: 3

  image:
    application: confluentinc/cp-server:7.9.6  # Use 7.9.6+ or 8.1.2+; see version requirements
    init: confluentinc/confluent-init-container:3.2.0

  dataVolumeCapacity: 10Gi

  dynamicQuorumConfig:
    enabled: true
    bootstrapPod: 0  # Only kraftcontroller-0 formats with --standalone mode

  configOverrides:
    server:
      - "controller.quorum.voters=100@kraft-central0.yourdomain.com:9074,101@kraft-central1.yourdomain.com:9074,102@kraft-central2.yourdomain.com:9074"  # Region 1 controller external addresses; required for cross-cluster communication

  listeners:
    controller:
      advertisedListenersEnabled: true  # Required for cross-region communication via external DNS
      tls:
        enabled: true
        secretRef: tls-kraftcontroller
      authentication:
        type: plain
        jaasConfig:
          secretRef: credential
      externalAccess:
        type: loadBalancer
        loadBalancer:
          domain: yourdomain.com  # Your DNS domain (for example, example.com)
          prefix: kraft-central   # Results in kraft-central0/1/2.yourdomain.com

  podTemplate:
    serviceAccountName: kraftcontroller-sa  # Service account from Step 2

Apply the KRaftController CR in region 1:

kubectl --context <region1-context> apply -f kraftcontroller-region1.yaml

Wait for region 1 controllers to be ready:

kubectl --context <region1-context> wait \
  --for=condition=platform.confluent.io/cluster-ready \
  kraftcontroller/kraftcontroller -n central --timeout=10m

Verify that region 1 controllers are running:

kubectl --context <region1-context> get pods -n central -l app=kraftcontroller

Step 4: Retrieve cluster ID from region 1

The cluster ID must be shared with Region 2 so that Region 2 controllers join the same cluster instead of creating a separate cluster.

Retrieve the cluster ID from region 1:

kubectl --context <region1-context> get kraftcontroller kraftcontroller \
  -n central -o jsonpath='{.status.clusterID}'

Note this cluster ID for use in the next step.
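If you script the handoff between regions, you can substitute the retrieved ID into the Region 2 CR directly. The cluster ID below is a hypothetical stub standing in for the kubectl output, and the snippet file is illustrative; in practice you would run sed against your full kraftcontroller-region2.yaml.

```shell
# In practice, CLUSTER_ID comes from the kubectl jsonpath command above;
# a stub value is used here so the substitution can be shown end to end.
CLUSTER_ID="MkU3OEVBNTcwNTJENDM2Qk"   # hypothetical stub; use the real ID from Region 1

# Replace the placeholder in the Region 2 CR (shown on a one-line snippet).
printf 'clusterID: <cluster-id-from-region1>\n' > /tmp/kraftcontroller-region2-snippet.yaml
sed -i "s/<cluster-id-from-region1>/${CLUSTER_ID}/" /tmp/kraftcontroller-region2-snippet.yaml
cat /tmp/kraftcontroller-region2-snippet.yaml
```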

Step 5: Deploy KRaft controllers in region 2 (observer region)

Deploy Region 2 controllers with the same cluster ID retrieved from Region 1. Region 2 controllers join the existing quorum as observers.

Create and configure a KRaftController CR for region 2:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: east
  annotations:
    platform.confluent.io/broker-id-offset: "200"  # Region 2 controllers use IDs 200, 201, 202
spec:
  replicas: 3
  clusterID: <cluster-id-from-region1>  # Must match the cluster ID from Region 1

  image:
    application: confluentinc/cp-server:7.9.6
    init: confluentinc/confluent-init-container:3.2.0

  dataVolumeCapacity: 10Gi

  dynamicQuorumConfig:
    enabled: true  # No bootstrapPod; all Region 2 controllers join as observers

  configOverrides:
    server:
      - "controller.quorum.voters=100@kraft-central0.yourdomain.com:9074,101@kraft-central1.yourdomain.com:9074,102@kraft-central2.yourdomain.com:9074"  # Region 1 voter addresses; needed to join the existing quorum

  listeners:
    controller:
      advertisedListenersEnabled: true
      tls:
        enabled: true
        secretRef: tls-kraftcontroller
      authentication:
        type: plain
        jaasConfig:
          secretRef: credential
      externalAccess:
        type: loadBalancer
        loadBalancer:
          domain: yourdomain.com
          prefix: kraft-east  # Results in kraft-east0/1/2.yourdomain.com

Apply the KRaftController CR in region 2:

kubectl --context <region2-context> apply -f kraftcontroller-region2.yaml

Wait for Region 2 controllers to be ready:

kubectl --context <region2-context> wait \
  --for=condition=platform.confluent.io/cluster-ready \
  kraftcontroller/kraftcontroller -n east --timeout=10m

Step 6: Verify quorum formation across regions

Verify that all controllers have joined the quorum. Region 1 controller-0 should be the leader and only voter. All other controllers should be observers.

Check quorum status from region 1:

kubectl --context <region1-context> exec kraftcontroller-0 -n central -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  --command-config <admin-properties-file> \
  describe --replication

Expected output:

NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     <offset>      0     <timestamp>         <timestamp>            Leader
101     <offset>      0     <timestamp>         <timestamp>            Observer
102     <offset>      0     <timestamp>         <timestamp>            Observer
200     <offset>      0     <timestamp>         <timestamp>            Observer
201     <offset>      0     <timestamp>         <timestamp>            Observer
202     <offset>      0     <timestamp>         <timestamp>            Observer

This confirms:

  • Node 100 (kraftcontroller-0 in Region 1) is the leader and only voter.

  • Nodes 101-102 (Region 1) are observers.

  • Nodes 200-202 (Region 2) are observers.

  • All observers show Lag: 0, which means they are caught up and ready for promotion.
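When scripting this check across six controllers, counting the Status column is a quick sanity test. The sample below mimics the output format shown above; in a live cluster you would pipe the real describe output into awk instead.

```shell
# Count replicas by Status (last field) from sample quorum output.
# Before promotion you expect exactly 1 Leader and 5 Observers.
sample='NodeId  LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
100  4096  0  t1  t2  Leader
101  4096  0  t1  t2  Observer
102  4096  0  t1  t2  Observer
200  4096  0  t1  t2  Observer
201  4096  0  t1  t2  Observer
202  4096  0  t1  t2  Observer'

echo "$sample" | awk 'NR > 1 { count[$NF]++ } END { for (s in count) print s ": " count[s] }'
```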

Note

Use a separate admin properties file (not the built-in kafka.properties) for the --command-config flag. The built-in kafka.properties contains listener-level security properties that admin CLI tools cannot use. Your admin properties file must include global client security properties: ssl.truststore.location, sasl.jaas.config, and security.protocol.
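As a sketch, such an admin properties file for a SASL/PLAIN over TLS deployment might contain the following. All paths and credentials are placeholders you must replace; the point is that these are global client properties rather than the listener-prefixed properties found in the pod's kafka.properties.

```properties
# Hypothetical admin.properties for kafka-metadata-quorum on a secured cluster.
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<admin-user>" password="<admin-password>";
ssl.truststore.location=<path-to-truststore>
ssl.truststore.password=<truststore-password>
```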

Step 7: Promote observers to voters

To create a full multi-region quorum, promote all observer controllers to voters. On Confluent Platform 8.2 or later with auto-join enabled, observers automatically promote themselves after they are caught up with the leader. You can skip manual promotion and proceed to verification.

For Confluent Platform 7.9.x or 8.x without auto-join, manually promote observers using the add-controller command.

Promote region 1 observers (controllers 1 and 2):

# Promote kraftcontroller-1 in Region 1
kubectl --context <region1-context> exec kraftcontroller-1 -n central -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraft-central0.yourdomain.com:9074 \
  --command-config <admin-properties-file> \
  add-controller

# Promote kraftcontroller-2 in Region 1
kubectl --context <region1-context> exec kraftcontroller-2 -n central -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraft-central0.yourdomain.com:9074 \
  --command-config <admin-properties-file> \
  add-controller

Promote region 2 observers (all three controllers):

# Promote kraftcontroller-0 in Region 2
kubectl --context <region2-context> exec kraftcontroller-0 -n east -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraft-central0.yourdomain.com:9074 \
  --command-config <admin-properties-file> \
  add-controller

# Promote kraftcontroller-1 in Region 2
kubectl --context <region2-context> exec kraftcontroller-1 -n east -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraft-central0.yourdomain.com:9074 \
  --command-config <admin-properties-file> \
  add-controller

# Promote kraftcontroller-2 in Region 2
kubectl --context <region2-context> exec kraftcontroller-2 -n east -- \
  kafka-metadata-quorum \
  --bootstrap-controller kraft-central0.yourdomain.com:9074 \
  --command-config <admin-properties-file> \
  add-controller

Important considerations when promoting observers:

  • Wait for each promotion to complete before promoting the next controller.

  • The add-controller command must be run from the controller being promoted because the properties file contains the node.id and log.dirs specific to that pod.

  • For secured clusters, the admin properties file must include global security properties (ssl.truststore.location, sasl.jaas.config, security.protocol), not listener-level properties from kafka.properties.

Step 8: Verify all controllers are voters

Verify that all controllers across both regions are now voters with zero lag:

kubectl --context <region1-context> exec kraftcontroller-0 -n central -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  --command-config <admin-properties-file> \
  describe --replication

Expected output:

NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     <offset>      0     <timestamp>         <timestamp>            Leader
101     <offset>      0     <timestamp>         <timestamp>            Follower
102     <offset>      0     <timestamp>         <timestamp>            Follower
200     <offset>      0     <timestamp>         <timestamp>            Follower
201     <offset>      0     <timestamp>         <timestamp>            Follower
202     <offset>      0     <timestamp>         <timestamp>            Follower

All controllers should show:

  • Lag of 0, or lag that is not consistently rising across checks, which indicates the controllers are actively replicating.

  • Recent fetch and caught-up timestamps.

  • Status of Leader or Follower, indicating all are voters.

Step 9: Deploy Kafka brokers in both regions

After the KRaft controller quorum is healthy across both regions, deploy Kafka brokers in each region.

Deploy brokers in region 1:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: central
spec:
  replicas: 2
  dataVolumeCapacity: 100Gi

  image:
    application: confluentinc/cp-server:7.9.6
    init: confluentinc/confluent-init-container:3.2.0

  dependencies:
    kRaftController:
      clusterRef:
        name: kraftcontroller
        namespace: central

  listeners:
    external:
      externalAccess:
        type: loadBalancer
        loadBalancer:
          domain: yourdomain.com
          prefix: kafka-central-ext
      tls:
        enabled: true
        secretRef: tls-kafka
      authentication:
        type: plain
        jaasConfig:
          secretRef: credential

Apply the Kafka CR in region 1:

kubectl --context <region1-context> apply -f kafka-region1.yaml

Deploy brokers in region 2:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: east
spec:
  replicas: 2
  dataVolumeCapacity: 100Gi

  image:
    application: confluentinc/cp-server:7.9.6
    init: confluentinc/confluent-init-container:3.2.0

  dependencies:
    kRaftController:
      clusterRef:
        name: kraftcontroller
        namespace: east

  listeners:
    external:
      externalAccess:
        type: loadBalancer
        loadBalancer:
          domain: yourdomain.com
          prefix: kafka-east-ext
      tls:
        enabled: true
        secretRef: tls-kafka
      authentication:
        type: plain
        jaasConfig:
          secretRef: credential

Apply the Kafka CR in region 2:

kubectl --context <region2-context> apply -f kafka-region2.yaml

Wait for brokers to be ready in both regions:

kubectl --context <region1-context> wait \
  --for=condition=platform.confluent.io/cluster-ready \
  kafka/kafka -n central --timeout=10m

kubectl --context <region2-context> wait \
  --for=condition=platform.confluent.io/cluster-ready \
  kafka/kafka -n east --timeout=10m

Verify brokers are running:

kubectl --context <region1-context> get pods -n central -l app=kafka
kubectl --context <region2-context> get pods -n east -l app=kafka

Your multi-region KRaft cluster with dynamic quorum is now deployed and operational.

For migration steps, see ZooKeeper to KRaft Migration.

Migrate from ZooKeeper to KRaft with dynamic quorum

This section describes how to migrate an existing ZooKeeper-based Kafka cluster to KRaft with dynamic quorum enabled. This migration path applies only to clusters running Confluent Platform 7.9.x, because ZooKeeper is removed in Confluent Platform 8.0 and later. Use it when you want to adopt KRaft with the flexibility of dynamic quorum membership.

Tip

For complete end-to-end examples, see Dynamic KRaft Quorum: ZooKeeper to KRaft Migration on GitHub. The repository includes quick start and secured (MRC) examples with complete YAML files and commands.

Prerequisites for ZooKeeper to KRaft migration

Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, ZooKeeper to KRaft migration with dynamic quorum requires:

  • Existing ZooKeeper-based Kafka cluster running on Confluent Platform 7.9.6 or later.

  • CFK 3.2 or later with KRaftMigrationJob support.

  • Confluent Platform 7.9.6 or later Docker images (earlier 7.9.x versions have critical bugs).

  • The Kafka CR must have the IBP 3.9 annotation before starting migration.

  • Bootstrap ConfigMap and RBAC resources for dynamic quorum coordination.

  • Namespace with sufficient resources for both ZooKeeper and KRaft controllers during dual-write phase.

Important

Critical version requirement: Confluent Platform 7.9.6 or later is required for ZooKeeper to KRaft migration with dynamic quorum. Earlier versions (7.9.0 through 7.9.5) have known issues.

IBP version annotation requirement

The Kafka CR must be annotated with platform.confluent.io/kraft-migration-ibp-version: "3.9" before starting the migration. The default IBP version 3.6 is incompatible with kraft.version=1 (dynamic quorum), and without this annotation:

  • The kraft.version cannot be finalized to 1

  • Direct-to-controller APIs are blocked

  • Observer promotion fails

Add the annotation to your Kafka CR before deploying the KRaftMigrationJob:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
  annotations:
    platform.confluent.io/kraft-migration-ibp-version: "3.9"  # Required for kraft.version=1

Step 1: Deploy bootstrap ConfigMap and RBAC resources

Create the bootstrap ConfigMap to coordinate which controller pod becomes the bootstrap controller:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kraftcontroller-dynamic-quorum
  namespace: confluent
data:
  bootstrap-status: '{"bootstrap_formatted": false}'

Apply the ConfigMap:

kubectl apply -f kraftcontroller-bootstrap-configmap.yaml

Create RBAC resources to allow the bootstrap pod to update the ConfigMap:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kraftcontroller-sa
  namespace: confluent

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kraftcontroller-bootstrap-role
  namespace: confluent
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kraftcontroller-dynamic-quorum"]
  verbs: ["get", "update", "patch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kraftcontroller-bootstrap-rolebinding
  namespace: confluent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kraftcontroller-bootstrap-role
subjects:
- kind: ServiceAccount
  name: kraftcontroller-sa
  namespace: confluent

Apply the RBAC resources:

kubectl apply -f kraftcontroller-rbac.yaml

Step 2: Deploy KRaftController

Deploy the KRaftController CR with dynamic quorum enabled. The controllers remain on hold until the KRaftMigrationJob starts the migration process.

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: confluent
  annotations:
    platform.confluent.io/broker-id-offset: "100"  # Controllers use IDs 100, 101, 102
    platform.confluent.io/kraft-migration-hold-krc-creation: "true"  # Delays pod creation until the migration job starts
    platform.confluent.io/use-log4j1: "true"  # Required for CFK 3.0 or later with Confluent Platform 7.9.x
spec:
  replicas: 3

  image:
    application: confluentinc/cp-server:7.9.6  # 7.9.6+ required for ZK-to-KRaft with dynamic quorum
    init: confluentinc/confluent-init-container:3.2.0

  dataVolumeCapacity: 10Gi

  dynamicQuorumConfig:
    enabled: true   # Uses controller.quorum.bootstrap.servers instead of .voters
    bootstrapPod: 0  # kraftcontroller-0 formats with --standalone; others join as observers

  podTemplate:
    serviceAccountName: kraftcontroller-sa  # Service account from Step 1

Apply the KRaftController CR:

kubectl apply -f kraftcontroller.yaml

The controllers start but remain in a holding state until the migration job begins.

Step 3: Start the migration

Create and apply the KRaftMigrationJob to begin the ZooKeeper to KRaft migration:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
  name: kraftmigrationjob
  namespace: confluent
spec:
  kafkaClusterRef:
    name: kafka
    namespace: confluent
  kRaftControllerClusterRef:
    name: kraftcontroller
    namespace: confluent

Apply the KRaftMigrationJob:

kubectl apply -f kraftmigrationjob.yaml

Step 4: Monitor migration progress

Monitor the KRaftMigrationJob status until it reaches the DUAL_WRITE phase:

kubectl get kraftmigrationjob kraftmigrationjob -n confluent -w

The migration progresses through several phases:

  • SETUP: Preparing the migration

  • MIGRATE: Starting the migration process

  • DUAL_WRITE: Both ZooKeeper and KRaft controllers are active and receiving metadata updates

Wait for the migration to reach the DUAL_WRITE phase before proceeding to the next step.
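To print only the phase fields instead of watching the whole resource, query the status subresource directly:

```shell
# Print the current migration phase and sub-phase
kubectl get kraftmigrationjob kraftmigrationjob -n confluent \
  -o jsonpath='{.status.phase}{"\n"}{.status.subPhase}{"\n"}'
```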

Step 5: Verify kraft.version=1 in DUAL_WRITE phase

After the migration reaches DUAL_WRITE, verify that kraft.version is finalized to 1 (dynamic quorum mode):

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version

Expected output:

Feature: kraft.version  SupportedMinVersion: 0  SupportedMaxVersion: 1  FinalizedVersionLevel: 1  Epoch: <epoch>

The FinalizedVersionLevel: 1 confirms that dynamic quorum is active.
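If you script this check, for example as a gate before the next step, the finalized level can be parsed out with sed. The sample line below is illustrative, not real cluster output:

```shell
# Sketch: extract FinalizedVersionLevel from kafka-features output.
# The sample line mirrors the expected output shown above; the epoch
# value is a made-up placeholder.
sample='Feature: kraft.version  SupportedMinVersion: 0  SupportedMaxVersion: 1  FinalizedVersionLevel: 1  Epoch: 7'
level=$(printf '%s\n' "$sample" | sed -n 's/.*FinalizedVersionLevel: \([0-9]*\).*/\1/p')
echo "$level"   # prints 1
```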

Check the initial quorum state:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status

Expected output shows one voter and multiple observers:

ClusterId:        <cluster-id>
LeaderId:         100
LeaderEpoch:      <epoch>
HighWatermark:    <offset>
MaxFollowerLag:   <lag>
MaxFollowerLagTimeMs: <lag-ms>
CurrentVoters:    [100]
CurrentObservers: [101,102]

Check replication status to verify all controllers have low lag:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

Expected output:

NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
100     <offset>      0     <timestamp>         <timestamp>            Leader
101     <offset>      0     <timestamp>         <timestamp>            Observer
102     <offset>      0     <timestamp>         <timestamp>            Observer

All observers should show low lag that is not rising, indicating they are ready for promotion.
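The lag check can be scripted by filtering the Lag column of the replication output. This is a sketch with illustrative sample rows, not real cluster output; pipe the actual `describe --replication` output into the function instead:

```shell
# Sketch: flag any controller whose Lag exceeds a threshold in the
# `describe --replication` output. NR > 1 skips the header row; $3 is Lag.
check_lag() {
  awk -v max=1000 'NR > 1 && $3 > max { print $1 " lagging: " $3 }'
}

# Illustrative sample rows
printf '%s\n' \
  'NodeId  LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status' \
  '100     9000          0    1700000000000       1700000000000          Leader' \
  '101     9000          12   1700000000000       1700000000000          Observer' \
  '102     6000          2000 1700000000000       1700000000000          Observer' \
  | check_lag
# prints: 102 lagging: 2000
```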

Step 6: Promote observers to voters

Because ZooKeeper to KRaft migration must run on Confluent Platform 7.9.x (ZooKeeper support is removed in 8.0 and later), the auto-join feature is not available. You must manually promote each observer to a voter.

Important

Observer promotion must be completed during the DUAL_WRITE phase, before finalizing the migration. Do not proceed to finalization until all controllers are voters.

Run the add-controller command from each observer pod, pointing to an existing voter:

# Promote kraftcontroller-1
kubectl exec kraftcontroller-1 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller \
    kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
    --command-config <admin-client-properties-file> \
    add-controller

# Promote kraftcontroller-2
kubectl exec kraftcontroller-2 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller \
    kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local:9074 \
    --command-config <admin-client-properties-file> \
    add-controller

Important considerations:

  • The add-controller command must be run from the controller being promoted.

  • The admin-client properties file must include global SSL/SASL security settings and advertised.listeners for the controller pod being promoted.

  • Wait for each promotion to complete before promoting the next controller.
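A hypothetical sketch of the admin-client properties file referenced above. The listener name, port, paths, and placeholder passwords are all assumptions; match them to your cluster's security configuration. For a PLAINTEXT cluster, only the advertised.listeners line is needed:

```shell
# Hypothetical admin-client properties for promoting kraftcontroller-1.
# All values are assumptions to adapt to your deployment.
cat > /tmp/kraft-admin.properties <<'EOF'
security.protocol=SSL
ssl.truststore.location=/mnt/sslcerts/truststore.jks
ssl.truststore.password=<truststore-password>
# Advertised listener of the controller being promoted (kraftcontroller-1)
advertised.listeners=CONTROLLER://kraftcontroller-1.kraftcontroller.confluent.svc.cluster.local:9074
EOF
grep -c '=' /tmp/kraft-admin.properties   # prints 4
```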

Verify all three controllers are now voters:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status

Expected output:

CurrentVoters:    [100,101,102]
CurrentObservers: []

Verify replication status:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

All controllers should show status as Leader or Follower (not Observer) with zero lag.

Step 7: Finalize the migration

Once all controllers are voters and the quorum is healthy, finalize the migration to complete the transition to pure KRaft mode:

kubectl annotate kraftmigrationjob kraftmigrationjob -n confluent \
  platform.confluent.io/kraft-migration-trigger-finalize-to-kraft='true'

Monitor the finalization process:

kubectl get kraftmigrationjob kraftmigrationjob -n confluent \
  -o jsonpath='{.status.phase}{"\n"}{.status.subPhase}{"\n"}'

Wait for the migration phase to show COMPLETE.

Step 8: Switch Kafka dependency to KRaftController

After the migration is finalized, update the Kafka CR to reference the KRaftController instead of ZooKeeper:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
  annotations:
    platform.confluent.io/kraft-migration-ibp-version: "3.9"
spec:
  replicas: 3
  dataVolumeCapacity: 100Gi

  image:
    application: confluentinc/cp-server:7.9.6
    init: confluentinc/confluent-init-container:3.2.0

  dependencies:
    kRaftController:
      clusterRef:
        name: kraftcontroller
        namespace: confluent

Apply the updated Kafka CR:

kubectl apply -f kafka-kraft-dependency.yaml

Wait for the Kafka cluster to be ready:

kubectl wait --for=condition=platform.confluent.io/cluster-ready \
  kafka/kafka -n confluent --timeout=10m

Step 9: Decommission ZooKeeper

After verifying that the KRaft quorum is healthy and the Kafka cluster is operational, decommission the ZooKeeper cluster:

kubectl delete zookeeper zookeeper -n confluent

Your cluster is now running on pure KRaft with a three-voter dynamic quorum.

Verify the migration

Perform the following verification steps to ensure the migration completed successfully:

Verify all controllers are voters:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status

Expected: CurrentVoters: [100,101,102] and CurrentObservers: []

Verify replication status:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

All controllers should show status as Leader or Follower with low lag.

Migrate from static to dynamic KRaft quorum

This section describes how to migrate an existing KRaft cluster from static quorum (kraft.version=0) to dynamic quorum (kraft.version=1). Use this migration path when a cluster already runs KRaft with static quorum and you want dynamic quorum membership for online controller addition and removal. Unlike greenfield deployments, this migration requires no bootstrap ConfigMap or RBAC resources because the cluster is already formatted.

Tip

For complete end-to-end examples, see Dynamic KRaft Quorum: Static to Dynamic Migration on GitHub. The repository includes quick start and secured MRC examples with complete YAML files and commands.

Prerequisites for static to dynamic migration

Review the general prerequisites in Prerequisites and requirements for version requirements and feature compatibility. In addition to those general requirements, static to dynamic KRaft migration requires:

  • Existing KRaft cluster running with static quorum (kraft.version=0)

  • Confluent Platform 8.0 or later (kraft.version 0→1 upgrade is not supported on Confluent Platform 7.9.x)

  • CFK 3.2 or later with dynamicQuorumConfig support

  • All controllers and brokers healthy and running

  • kubectl access to the cluster

Important

Critical version requirement: Confluent Platform 8.0 or later is required for static to dynamic KRaft migration. The kraft.version upgrade from 0 to 1 is not supported on Confluent Platform 7.9.x.

Pre-migration: Verify starting state

Before starting the migration, verify that the cluster is running with static quorum (kraft.version=0):

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version

Expected output:

Feature: kraft.version  SupportedMinVersion: 0  SupportedMaxVersion: 1  FinalizedVersionLevel: 0  Epoch: <epoch>

The FinalizedVersionLevel: 0 confirms the cluster is using static quorum.

Verify quorum health before proceeding:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

All controllers should show as voters (Leader or Follower status) with low lag.

Phase 1: Add advertised listeners

Add advertisedListenersEnabled: true to the KRaftController CR before upgrading kraft.version:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: confluent
spec:
  replicas: 3

  listeners:
    controller:
      advertisedListenersEnabled: true  # Required for cross-cluster communication via external DNS
      externalAccess:
        type: loadBalancer
        loadBalancer:
          domain: yourdomain.com

  podTemplate:
    annotations:
      kafkacluster-manual-roll: "1"  # Required to trigger a rolling restart (config change does not auto-roll)

Apply the updated KRaftController CR:

kubectl apply -f kraftcontroller-advertised-listeners.yaml

Wait for the rolling restart to complete:

kubectl wait --for=condition=platform.confluent.io/cluster-ready \
  kraftcontroller/kraftcontroller -n confluent --timeout=10m

After this phase completes, cross-cluster admin commands (kafka-metadata-quorum, kafka-features) work from both regions.
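For example, you can verify reachability through the advertised external address. The hostname pattern and port below are assumptions based on the LoadBalancer domain configured above; substitute the DNS names your deployment actually creates:

```shell
# Hypothetical: query the quorum through the external DNS name
kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller \
    kraftcontroller-0.yourdomain.com:9074 \
  describe --status
```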

Phase 2: Upgrade kraft.version

Upgrade kraft.version from 0 to 1 using the kafka-features CLI. This is a metadata-level operation that requires no YAML changes and no pod restarts.

Run the upgrade command from any controller pod:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-features --bootstrap-controller localhost:9074 \
  upgrade --feature kraft.version=1

Verify the upgrade:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-features --bootstrap-controller localhost:9074 describe | grep kraft.version

Expected output:

Feature: kraft.version  SupportedMinVersion: 0  SupportedMaxVersion: 1  FinalizedVersionLevel: 1  Epoch: <epoch>

The FinalizedVersionLevel: 1 confirms that dynamic quorum is now active at the metadata level.

Verify that directory IDs changed from placeholder values to unique UUIDs:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

Each controller’s DirectoryId should now show a unique UUID instead of the placeholder AAAAAAAAAAAAAAAAAAAAAA used by static quorum. This is expected: static quorum does not use directory IDs, so the upgrade to dynamic quorum assigns real ones.
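A quick way to spot any controller still carrying the placeholder ID is to grep the replication output for the all-A string. The sample rows below are illustrative, not real output:

```shell
# Sketch: count controllers still reporting the static-quorum placeholder
# directory ID. Pipe real `describe --replication` output in practice.
printf '%s\n' \
  '100  Leader    2mBzeXl0Q9abcdefghijkl' \
  '101  Follower  AAAAAAAAAAAAAAAAAAAAAA' \
  | grep -c 'AAAAAAAAAAAAAAAAAAAAAA'
# prints 1 (one controller has not yet been assigned a real directory ID)
```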

Important

After upgrading kraft.version to 1, proceed promptly to Phase 3. A cluster at kraft.version=1 with controller.quorum.voters still in the configuration is in a transitional state. The cluster remains functional, but switch the controllers to controller.quorum.bootstrap.servers to fully benefit from dynamic quorum capabilities.

Phase 3: Switch to dynamicQuorumConfig

Enable dynamicQuorumConfig in the KRaftController CR. CFK generates controller.quorum.bootstrap.servers and removes controller.quorum.voters from the controller configuration. This triggers a rolling restart of the KRaft controllers.

Update the KRaftController CR:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: confluent
spec:
  replicas: 3

  dynamicQuorumConfig:
    enabled: true  # Generates bootstrap.servers instead of voters; no bootstrapPod needed

Apply the updated KRaftController CR:

kubectl apply -f kraftcontroller-dynamic-quorum.yaml

Wait for the rolling restart to complete:

kubectl wait --for=condition=platform.confluent.io/cluster-ready \
  kraftcontroller/kraftcontroller -n confluent --timeout=10m

Verify the quorum is healthy after the rolling restart:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status

All controllers should still be voters with the leader elected.

Verify replication status:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

All controllers should show status as Leader or Follower with zero lag.

Phase 4: Roll Kafka brokers

Force a rolling restart of Kafka brokers so they pick up the new controller.quorum.bootstrap.servers configuration from the updated KRaft controllers.

Patch the Kafka CR to add a pod template annotation:

kubectl patch kafka kafka -n confluent --type merge \
  -p '{"spec":{"podTemplate":{"annotations":{"kafkacluster-manual-roll":"phase4"}}}}'

Wait for the rolling restart to complete:

kubectl wait --for=condition=platform.confluent.io/cluster-ready \
  kafka/kafka -n confluent --timeout=10m

After this phase, both KRaft controllers and Kafka brokers are using controller.quorum.bootstrap.servers for quorum discovery.
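You can spot-check a broker's rendered configuration to confirm this. The properties path below assumes the brokers use the same layout as the controller pods; adjust it if your image differs:

```shell
# Expect controller.quorum.bootstrap.servers and no controller.quorum.voters
kubectl exec kafka-0 -n confluent -- \
  grep 'controller.quorum' /opt/confluentinc/etc/kafka/kafka.properties
```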

Verify the migration

Perform the following verification steps to ensure the migration completed successfully:

Verify quorum status:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --status

All controllers should be listed as CurrentVoters.

Troubleshoot

This section describes common issues when deploying and operating dynamic KRaft quorum.

Issue: Bootstrap pod stuck in init container

Symptom: kraftcontroller-0 is in Init:0/1 state indefinitely. The pod does not progress to the running state.

Cause: The bootstrap controller cannot update the ConfigMap due to insufficient RBAC permissions.

Solution: Verify RBAC permissions are correctly configured:

# Check if the service account has permission to update the ConfigMap
kubectl auth can-i update configmaps \
  --as=system:serviceaccount:confluent:kraftcontroller-sa \
  -n confluent

# Should output: yes

# Verify Role and RoleBinding exist
kubectl get role kraftcontroller-bootstrap-role -n confluent
kubectl get rolebinding kraftcontroller-bootstrap-rolebinding -n confluent

If permissions are missing, re-apply the RBAC resources from Step 2. For instructions, see Deploy a single-region cluster with dynamic quorum.

Check the init container logs for more details:

kubectl logs kraftcontroller-0 -n confluent -c config-init-container

Issue: Observers fail to auto-promote (Confluent Platform 8.2 or later)

Symptom: Controllers remain as observers indefinitely, even though auto-join should be enabled in Confluent Platform 8.2 or later.

Cause: The auto-join feature is not enabled. Check the properties file to confirm that controller.quorum.auto.join.enable is defined and set to true. If it is not, add it using configOverrides.

Solution: Check the replication status to see if observers are caught up:

kubectl exec kraftcontroller-0 -n confluent -- \
  kafka-metadata-quorum --bootstrap-controller localhost:9074 \
  describe --replication

Check the Lag column for observers:

  • If Lag > 1000, the lag is too high. Investigate replication issues or wait for the lag to decrease before promoting.

  • If Lag < 1000 and is not increasing, replication is healthy. If observers still do not promote, auto-join might not be enabled. Manually promote observers using the add-controller command. For instructions, see Deploy a single-region cluster with dynamic quorum.

Note

CFK does not currently expose a configuration toggle to enable or disable auto-join. Instead, CFK attempts to detect the Confluent Platform version from the image and automatically enables auto-join for 8.2 or later. Version detection can fail if you use custom Docker images or images without standard version tags.
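If version detection fails, the property named in the cause above can be forced through configOverrides. This is a hedged sketch, not a definitive configuration; verify the property name against your Confluent Platform documentation before applying:

```yaml
# Sketch: force-enable auto-join on the KRaftController via configOverrides
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: confluent
spec:
  replicas: 3
  configOverrides:
    server:
      - controller.quorum.auto.join.enable=true
```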

Issue: Observer promotion fails during migration

Symptom: The add-controller command fails with an error message about version compatibility or feature flags.

Cause: The inter-broker protocol (IBP) version is less than 3.9. Dynamic quorum (kraft.version=1) requires a minimum IBP version of 3.9 or later.

Solution: Verify the IBP version is set correctly by checking the properties file:

kubectl exec kraftcontroller-0 -n confluent -- \
  grep 'inter.broker.protocol.version' /opt/confluentinc/etc/kafka/kafka.properties

Expected output (3.9 or later):

inter.broker.protocol.version=3.9

If the IBP version is missing or less than 3.9, add or update the annotation:

kubectl annotate kafka kafka \
  platform.confluent.io/kraft-migration-ibp-version=3.9 \
  --overwrite \
  -n confluent

After updating the annotation, restart the migration or retry observer promotion.

Issue: Controller connections fail with LoadBalancer advertised listeners

Symptom: Controllers cannot connect to each other when advertisedListenersEnabled: true is set with a LoadBalancer within a single cluster or namespace.

Cause: The networking layer works correctly, but KRaft internal logic fails when using advertised external addresses for same-cluster communication. For details, see ControllerRegistrationManager registration issue.

Workaround:

  • Set advertisedListenersEnabled: false on KRaft controllers for single-region deployments.