Migrate a Single Cluster from ZooKeeper to KRaft

This topic explains how to migrate your Confluent Platform deployment from ZooKeeper to KRaft by using Confluent for Kubernetes.

Before you start, ensure you have met all the prerequisites in KRaft Migration Prerequisites.

Tip

This procedure covers a single-cluster migration. To migrate a multi-region cluster (MRC), you run these steps one region at a time. For the multi-region procedure, see Migrate a Multi-Region Cluster from ZooKeeper to KRaft.

Review the complete end-to-end examples on GitHub: CFK Examples for KRaft Migration.

Step 1: Configure IBP version

The migration requires the correct inter-broker protocol (IBP) version. Do not set IBP in configOverrides.

Check your image type:

kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.spec.image.application}'

Expected:

# The image path for your Confluent Platform version, for example:

confluentinc/cp-server:7.9.0

CFK derives the IBP version automatically from standard Confluent images published under confluentinc/, for example, confluentinc/cp-server:7.9.0. If you use a standard image, no action is needed. Skip to Step 2.

If you use a custom image, apply the IBP annotation that matches your Confluent Platform version.

Confluent Platform version    IBP version
7.9.x                         3.9
7.8.x                         3.8
7.7.x                         3.7
7.6.x                         3.6

Apply the annotation, replacing <ibp-version> with the value for your Confluent Platform version:

kubectl annotate kafka <kafka-name> \
  platform.confluent.io/kraft-migration-ibp-version="<ibp-version>" \
  -n <namespace>

Expected:

kafka.platform.confluent.io/<kafka-name> annotated
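The decision in Step 1 can also be scripted. The following is a minimal sketch, not part of CFK. The function names are hypothetical, the version mapping is copied from the table above, and two things are assumptions: that any image path starting with confluentinc/ counts as a standard image, and that a custom image keeps the Confluent Platform version as its tag.

```shell
# Hypothetical helper: map a Confluent Platform version (the image tag) to the
# IBP version listed in the table above. Fails for versions the table omits.
ibp_for_cp_version() {
  case "$1" in
    7.9.*) echo "3.9" ;;
    7.8.*) echo "3.8" ;;
    7.7.*) echo "3.7" ;;
    7.6.*) echo "3.6" ;;
    *) echo "no IBP mapping for version: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed when the image path starts with confluentinc/.
is_standard_image() {
  case "$1" in
    confluentinc/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the IBP value to annotate with, or print nothing for a standard image.
# Assumption: the tag of a custom image is the Confluent Platform version.
ibp_annotation_value() {
  local image="$1"
  if is_standard_image "$image"; then
    return 0
  fi
  ibp_for_cp_version "${image##*:}"
}
```

Pass the image path returned by the kubectl get kafka command in this step to ibp_annotation_value. An empty result means no annotation is needed. Otherwise, use the printed value as <ibp-version> in the annotate command.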

Step 2: Deploy KRaftController CR

The migration does not automatically copy configurations from Kafka to KRaftController. You must explicitly configure KRaftController to match your existing Kafka setup.

Tip

For MRC deployments, create and apply the KRaftController CR in each region.

Step 2.1: Export your Kafka CR

Export your existing Kafka CR. You use the exported file as the reference for the settings you copy into the KRaftController CR in Step 2.3:

kubectl get kafka <kafka-name> -n <namespace> -o yaml > current-kafka-config.yaml

Expected:

No terminal output. The command writes the Kafka CR to current-kafka-config.yaml in your working directory.

Step 2.2: Create KRaftController CR

  1. Find the image tags of your Kafka brokers. Use the same application and init image tags as your Kafka brokers to keep versions aligned:

    kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='application: {.spec.image.application}{"\n"}init: {.spec.image.init}{"\n"}'
    

    Expected:

    application: confluentinc/cp-server:<cp-version>
    init: confluentinc/confluent-init-container:<init-container-version>
    
  2. Create the kraftcontroller.yaml file:

    apiVersion: platform.confluent.io/v1beta1
    kind: KRaftController
    metadata:
      name: kraftcontroller
      namespace: <namespace>
      annotations:
        platform.confluent.io/kraft-migration-hold-krc-creation: "true"
        platform.confluent.io/use-log4j1: "true"  # Required for CFK 3.0 or later
    spec:
      replicas: 3                  # Example value. Quorum size should be odd; 3 or 5 is typical.
      image:
        application: confluentinc/cp-server:<cp-version>  # Match your Kafka version
        init: confluentinc/confluent-init-container:<init-container-version>
      dataVolumeCapacity: 10Gi     # Example value. Size for your metadata workload.
    

    The replicas and dataVolumeCapacity values are examples. Size them based on your production requirements.

Required annotations

kraft-migration-hold-krc-creation: "true"

Delays pod creation until the migration job modifies the CR.

use-log4j1: "true"

Forces the KRaftController to use Log4j 1, which is compatible with Confluent Platform 7.x brokers during migration. By default, CFK 3.0 or later uses Log4j 2. Remove this annotation after the migration completes. For details, see Remove the Log4j1 annotation.
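If you prefer to generate kraftcontroller.yaml rather than edit it by hand, the following minimal sketch prints the same CR shown in Step 2.2. The function name render_kraftcontroller is hypothetical, and the replicas and dataVolumeCapacity values are the example values from above, not sizing recommendations.

```shell
# Hypothetical helper: print a minimal KRaftController CR for the migration.
# Arguments: namespace, application image, init image.
render_kraftcontroller() {
  local ns="$1" app_image="$2" init_image="$3"
  cat <<EOF
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: ${ns}
  annotations:
    platform.confluent.io/kraft-migration-hold-krc-creation: "true"
    platform.confluent.io/use-log4j1: "true"
spec:
  replicas: 3                 # Example value from Step 2.2
  image:
    application: ${app_image}
    init: ${init_image}
  dataVolumeCapacity: 10Gi    # Example value from Step 2.2
EOF
}
```

Call the function with your namespace and the two image paths reported by the command in Step 2.2, and redirect the output to kraftcontroller.yaml. Add any security settings from Step 2.3 to the generated file before you apply it.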

Tip

If no security is enabled (RBAC, TLS, authentication, or password encoder), skip to Step 2.4. Otherwise, apply only the configurations in Step 2.3 that are enabled on your Kafka CR.

Step 2.3: Add security configurations

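Using the exported current-kafka-config.yaml as a reference, add the settings for each security configuration that is enabled on your Kafka CR to your KRaftController CR. The configurations that follow cover RBAC, the password encoder, listener authentication and TLS, and custom JVM settings.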

If RBAC is enabled on the Kafka CR, configure two related blocks on the KRaftController CR:

Super users

In spec.authorization.superUsers, copy the principals from your Kafka CR. User:kraftcontroller in the example is illustrative. Add a KRaftController principal only if your deployment requires it to be a super user, and use the principal that your mdsKafkaCluster.authentication.jaasConfig credential resolves to. For PLAIN JAAS, this is the secret’s username field. For mTLS, this is the certificate subject.

MDS Kafka cluster dependency

Set spec.dependencies.mdsKafkaCluster so the KRaftController can authenticate to Kafka for MDS. Reuse an existing Kafka super user secret in secretRef or create a new one. Ports 9071 (internal Kafka listener) and 8090 (MDS) are the CFK defaults. If your Kafka CR uses different ports, substitute them. Add the following to the KRaftController CR:

spec:
  authorization:
    type: rbac
    superUsers:
      - User:kafka
      - User:kraftcontroller   # Include only if the KRaftController principal needs super-user permissions
  dependencies:
    mdsKafkaCluster:
      authentication:
        type: plain
        jaasConfig:
          secretRef: kraftcontroller-credential   # Reuse an existing super-user secret or create a new one
      bootstrapEndpoint: kafka.confluent.svc.cluster.local:9071
      tls:
        enabled: true

For a complete RBAC-enabled example, see the confluent-platform.yaml file in the CFK examples repository.

Configure the password encoder on KRaftController using the same value as your Kafka CR. The mechanism differs:

Kafka CR

Use spec.passwordEncoder.secretRef. The secret stays out of the CR.

KRaftController CR

Set password.encoder.secret in configOverrides.server. KRaftController has no dedicated field, so the value is stored in plain text in the CR.

Verify the secret exists:

kubectl get secret password-encoder-secret -n <namespace>

Read the password value from the secret:

kubectl get secret password-encoder-secret -n <namespace> \
  -o jsonpath='{.data.password-encoder-secret}' | base64 -d

Add to KRaftController, using the same value configured on your Kafka CR:

spec:
  configOverrides:
    server:
      - password.encoder.secret=<value-from-secret>

Configure the controller listener to use the same authentication type and TLS settings as your Kafka listeners:

spec:
  tls:
    secretRef: tls-group1  # Same as Kafka
  listeners:
    controller:
      authentication:
        type: plain  # Match Kafka's authentication type
        jaasConfig:
          secretRef: credential
      tls:
        enabled: true

The -Xms and -Xmx values shown are examples. Match the custom JVM settings configured on your Kafka CR:

spec:
  configOverrides:
    jvm:
      - -Xms4g
      - -Xmx4g

Step 2.4: Apply KRaftController CR

Apply the kraftcontroller.yaml file you created:

kubectl apply -f kraftcontroller.yaml

Expected:

kraftcontroller.platform.confluent.io/kraftcontroller created

Step 2.5: Verify KRaftController state

  1. Verify that the KRaftController status is HOLD:

    kubectl get kraftcontroller <kraftcontroller-name> -n <namespace>
    

    Expected:

    NAME              REPLICAS   READY   STATUS   AGE
    kraftcontroller   0                  HOLD     25s
    
  2. Verify that no pods are created:

    kubectl get pods -n <namespace> | grep <kraftcontroller-name>
    

    Expected:

    # no output
    

Troubleshoot: If the KRaftController status is not HOLD or pods are created, verify that both annotations (kraft-migration-hold-krc-creation and use-log4j1) are present and set to "true" in the KRaftController CR.
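The two checks in Step 2.5 can be combined into one gate before you start the migration. This is a minimal sketch under stated assumptions: the function name krc_on_hold and the KUBECTL override variable are hypothetical, the status is read from the second line of the table output shown above, and pods are assumed to be named <kraftcontroller-name>-<ordinal>.

```shell
# Hypothetical helper: succeed only if the KRaftController row shows HOLD
# and no pod whose name starts with "<name>-" exists in the namespace.
krc_on_hold() {
  local name="$1" ns="$2" kubectl="${KUBECTL:-kubectl}" row pods
  # Line 2 of the table output is the resource row.
  row=$("$kubectl" get kraftcontroller "$name" -n "$ns" | awk 'NR == 2')
  if ! printf '%s\n' "$row" | grep -qw 'HOLD'; then
    echo "status is not HOLD: ${row}" >&2
    return 1
  fi
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  pods=$("$kubectl" get pods -n "$ns" --no-headers 2>/dev/null | grep -c "^${name}-" || true)
  if [ "$pods" -ne 0 ]; then
    echo "${pods} pod(s) already exist for ${name}" >&2
    return 1
  fi
  echo "${name} is on HOLD with no pods"
}
```

Run krc_on_hold with the KRaftController name and namespace, and continue to Step 3 only when it exits with status 0.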

Step 3: Start migration

The KRaftMigrationJob drives the migration through the SETUP, MIGRATE, and DUAL-WRITE phases and locks the ZooKeeper, Kafka, and KRaftController CRs to prevent modifications.

Tip

For MRC deployments, create and apply the KRaftMigrationJob CR in each region.

Step 3.1: Create KRaftMigrationJob CR

Create the kraftmigrationjob.yaml file:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
  name: <migration-job-name>
  namespace: <namespace>
spec:
  dependencies:
    kafka:
      name: <kafka-name>
      namespace: <namespace>
    zookeeper:
      name: <zookeeper-name>
      namespace: <namespace>
    kRaftController:
      name: <kraftcontroller-name>
      namespace: <namespace>

Note

In MRC deployments, you set zookeeper.connect manually on the KRaftController CR. This field is on the CFK pre-flight blocklist, so CFK validates it before starting the migration. If you want to bypass the pre-flight check, add the kraft-migration-bypass-prechecks: "true" annotation to the KRaftMigrationJob CR:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
  name: <migration-job-name>
  namespace: <namespace>
  annotations:
    platform.confluent.io/kraft-migration-bypass-prechecks: "true"

Do not use this annotation for single-cluster deployments, because it can cause migration failures from conflicting configurations. For details, see Multi-region cluster considerations.

Step 3.2: Apply KRaftMigrationJob CR

Apply the kraftmigrationjob.yaml file you created:

kubectl apply -f kraftmigrationjob.yaml

Expected:

kraftmigrationjob.platform.confluent.io/<migration-job-name> created

Step 3.3: Verify migration started

List all the migration jobs in the namespace:

kubectl confluent cluster kraft-migration list -n <namespace>

For command details, see kubectl confluent cluster kraft-migration list.

Expected:

NAME                KAFKA  KRAFTCONTROLLER  ZOOKEEPER  PHASE  SUBPHASE                AGE
kraftmigrationjob   kafka  kraftcontroller  zookeeper  SETUP  SubPhaseSetupPreChecks  2m

Check the migration job status:

kubectl get kraftmigrationjob <migration-job-name> -n <namespace> \
  -o jsonpath='Phase: {.status.phase} | SubPhase: {.status.subPhase}{"\n"}'

Expected:

Phase: SETUP | SubPhase: SubPhaseSetup<...>

<...> is the current subphase suffix (for example, SubPhaseSetupPreChecks or SubPhaseSetupCheckHealthyKafka).

Troubleshoot: If the migration job fails to start, verify CR syntax and dependency names (Kafka, ZooKeeper, KRaftController).

Step 4: Monitor migration

The migration progresses through several phases and pauses in DUAL-WRITE until you apply the finalize annotation in Step 5.3. For the full phase list, sub-phases, and key concepts (dual-write, point of no return, FINALIZE), see ZooKeeper to KRaft Migration Phases and Sub-phases.

Check the status of a specific migration job:

kubectl confluent cluster kraft-migration status --name <migration-job-name> -n <namespace>

For command details, see kubectl confluent cluster kraft-migration status.

Expected:

KRaft Migration Job: kraftmigrationjob
Namespace:           confluent
Phase:               SETUP
SubPhase:            SubPhaseSetupEnsureIBPUpgradeComplete
Time in state:       2m
IBP Version:         3.9

Dependencies:
  Kafka:             kafka (confluent) [RUNNING 1/1]
  KRaftController:   kraftcontroller (confluent) [HOLD 0/0]
  ZooKeeper:         zookeeper (confluent) [RUNNING 1/1]

Status:
  --> KRaft migration workflow setup. No action required.

Watch the migration job status:

kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -w

Expected: Real-time progression through phases, with the STATUS column updating as the migration advances.

NAME                  STATUS       AGE
<migration-job-name>  SETUP        2m
<migration-job-name>  MIGRATE      8m
<migration-job-name>  DUAL-WRITE   15m

Troubleshoot: For deeper debugging, watch pods and operator logs in parallel terminals as described in Watch pods and operator logs in separate terminals. If a subphase takes longer than expected, see Check for errors if migration stalls.
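For unattended runs, the watch can be replaced by a polling loop that returns once the job reaches a given phase. The following is a minimal sketch, assuming that .status.phase (the field queried in Step 3.3) reports the same phase strings shown in the STATUS column. The function name wait_for_phase, the KUBECTL override variable, and the default timeout and interval are hypothetical.

```shell
# Hypothetical helper: poll the KRaftMigrationJob until it reports the target
# phase. Arguments: job name, namespace, target phase, timeout and interval
# in seconds (defaults: 1800 and 10). Returns 1 if the timeout is exceeded.
wait_for_phase() {
  local job="$1" ns="$2" target="$3" timeout="${4:-1800}" interval="${5:-10}"
  local kubectl="${KUBECTL:-kubectl}" elapsed=0 phase
  while [ "$elapsed" -le "$timeout" ]; do
    phase=$("$kubectl" get kraftmigrationjob "$job" -n "$ns" \
      -o jsonpath='{.status.phase}' 2>/dev/null)
    echo "phase=${phase:-unknown} elapsed=${elapsed}s"
    if [ "$phase" = "$target" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for ${target}" >&2
  return 1
}
```

For example, wait_for_phase <migration-job-name> <namespace> DUAL-WRITE blocks until the job pauses for validation. The loop only compares phase strings, so it does not detect a failed or stalled job before the timeout expires.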

Step 5: Validate and finalize migration

Once the migration reaches the DUAL-WRITE phase, validate the cluster before finalizing. If validation fails, roll back instead.

Step 5.1: Verify DUAL-WRITE mode

Confirm the migration has reached the DUAL-WRITE phase:

kubectl confluent cluster kraft-migration status --name <migration-job-name> -n <namespace>

For command details, see kubectl confluent cluster kraft-migration status.

Expected:

KRaft Migration Job: kraftmigrationjob
Namespace:           confluent
Phase:               DUAL-WRITE
SubPhase:            SubPhaseMigrationDualWrite
Time in state:       1m
Kafka Cluster ID:    <kafka-cluster-id>
IBP Version:         3.9

Dependencies:
  Kafka:             kafka (confluent) [RUNNING 1/1]
  KRaftController:   kraftcontroller (confluent) [RUNNING 1/1]
  ZooKeeper:         zookeeper (confluent) [RUNNING 1/1]

Action Required:
  --> Cluster is in dual-write mode. Validate your cluster, then choose:
    To proceed further with Kraft: kubectl confluent cluster kraft-migration finalize --name <migration-job-name> -n <namespace>
    To rollback to Zookeeper: kubectl confluent cluster kraft-migration rollback --name <migration-job-name> -n <namespace>

Check the migration job status:

kubectl get kraftmigrationjob <migration-job-name> -n <namespace>

Expected:

NAME                STATUS       AGE
kraftmigrationjob   DUAL-WRITE   11m

Step 5.2: Validate cluster health

  1. Verify all Kafka and KRaftController pods are running:

    kubectl get pods -n <namespace> -l app=kafka
    kubectl get pods -n <namespace> -l app=kraftcontroller
    

    Expected:

    NAME      READY   STATUS    RESTARTS   AGE
    kafka-0   1/1     Running   0          8m
    
    NAME                READY   STATUS    RESTARTS   AGE
    kraftcontroller-0   1/1     Running   0          9m
    
  2. Verify Kafka and KRaftController status:

    kubectl get kafka <kafka-name> -n <namespace>
    kubectl get kraftcontroller <kraftcontroller-name> -n <namespace>
    

    Expected:

    NAME    REPLICAS   READY   STATUS    AGE
    kafka   1          1       RUNNING   43m
    
    NAME              REPLICAS   READY   STATUS    AGE
    kraftcontroller   1          1       RUNNING   13m
    

Note

If validation fails, roll back to ZooKeeper instead of finalizing. Rollback is supported during the SETUP, MIGRATE, and DUAL-WRITE phases.
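The pod check in Step 5.2 can be turned into a pass or fail gate. The following minimal sketch reads the table printed by kubectl get pods and succeeds only if every pod is Running with all of its containers ready. The function name all_pods_ready is hypothetical, and the column order (NAME, READY, STATUS) is assumed to match the expected output above.

```shell
# Hypothetical helper: read "kubectl get pods" table output on stdin.
# Succeeds only if at least one pod is listed and every pod has
# READY x/y with x equal to y and STATUS equal to Running.
all_pods_ready() {
  awk '
    NR == 1 && $1 == "NAME" { next }   # skip the header row if present
    NF == 0 { next }                   # skip blank lines
    {
      seen = 1
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") {
        bad = 1
        print "not ready: " $1 > "/dev/stderr"
      }
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}
```

Pipe each of the two kubectl get pods commands from Step 5.2 into all_pods_ready and finalize only if both invocations succeed. An empty pod list is treated as a failure, so a wrong label selector does not pass silently.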

Step 5.3: Finalize migration to KRaft

Finalization moves the cluster from dual-write mode to KRaft-only mode.

Warning

Finalizing the migration removes the ZooKeeper dependency from Kafka, removes the migration configuration from KRaftController, and transitions the cluster irreversibly to KRaft mode. You cannot roll back after this point.

Trigger finalization with the kubectl confluent plugin:

kubectl confluent cluster kraft-migration finalize --name <migration-job-name> -n <namespace>

For command details, see kubectl confluent cluster kraft-migration finalize.

The command prompts for confirmation before proceeding:

WARNING: Finalizing migration to KRaft is irreversible. Rollback will no longer be possible.
Current phase: DUAL-WRITE
Proceed? [y/N]: y
✓ Migration finalize triggered successfully!
  Annotation applied: platform.confluent.io/kraft-migration-trigger-finalize-to-kraft=true

Alternatively, apply the finalize annotation directly instead of running the finalize command:

kubectl annotate kraftmigrationjob <migration-job-name> \
  platform.confluent.io/kraft-migration-trigger-finalize-to-kraft=true \
  -n <namespace>

Expected:

kraftmigrationjob.platform.confluent.io/<migration-job-name> annotated
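Because finalization is irreversible, a script that applies the annotation directly may want to reproduce the phase check that the finalize command performs. The following is a minimal sketch, assuming that .status.phase reports DUAL-WRITE exactly as shown in Step 5.1. The function name finalize_if_dual_write and the KUBECTL override variable are hypothetical.

```shell
# Hypothetical helper: apply the finalize annotation only when the job
# reports the DUAL-WRITE phase. Returns 1 for any other phase and 2 if
# the phase cannot be read.
finalize_if_dual_write() {
  local job="$1" ns="$2" kubectl="${KUBECTL:-kubectl}" phase
  phase=$("$kubectl" get kraftmigrationjob "$job" -n "$ns" \
    -o jsonpath='{.status.phase}') || return 2
  if [ "$phase" != "DUAL-WRITE" ]; then
    echo "refusing to finalize: phase is '${phase}', expected DUAL-WRITE" >&2
    return 1
  fi
  "$kubectl" annotate kraftmigrationjob "$job" \
    platform.confluent.io/kraft-migration-trigger-finalize-to-kraft=true \
    -n "$ns"
}
```

The guard does not replace the validation in Step 5.2. It only prevents the annotation from being applied in the wrong phase.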

Step 5.4: Verify migration completed

The migration is complete when the KRaftMigrationJob status shows COMPLETE. Watch the status:

kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -w

Expected:

NAME                  STATUS     AGE
<migration-job-name>  COMPLETE   30m

Step 6: Complete post-migration

The ZooKeeper to KRaft migration is complete when the KRaftMigrationJob status shows COMPLETE. Then perform the following tasks in order: release the migration locks, remove the Log4j 1 annotation, validate KRaft-only operation, remove the ZooKeeper cluster, and clean up migration resources.

Step 6.1: Release migration locks

Manually release the migration locks that Step 3 applied to the Kafka, KRaftController, and ZooKeeper CRs:

kubectl confluent cluster kraft-migration release-lock --name <migration-job-name> -n <namespace>

For command details, see kubectl confluent cluster kraft-migration release-lock.

The command prompts for confirmation before proceeding:

This will release the CR lock on Kafka, KRaftController, and ZooKeeper resources.
Current phase: COMPLETE
Proceed? [y/N]: y
✓ CR lock release triggered successfully!
  Annotation applied: platform.confluent.io/kraft-migration-release-cr-lock=true

Alternatively, apply the release lock annotation directly instead of running the release-lock command:

kubectl annotate kraftmigrationjob <migration-job-name> \
  platform.confluent.io/kraft-migration-release-cr-lock=true \
  -n <namespace>

Verify that the locks are removed:

kubectl get kafka <kafka-name> -n <namespace> -o yaml | grep kraft-migration-cr-lock
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> -o yaml | grep kraft-migration-cr-lock

Expected:

# No output, which confirms the locks are released.

Important

Without releasing locks, you cannot modify Kafka or KRaftController configurations, scale resources, or apply upgrades.
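The lock verification above can be wrapped in a single check that reports which CR is still locked. This is a minimal sketch: the function name locks_released and the KUBECTL override variable are hypothetical, and it searches for the same kraft-migration-cr-lock string as the grep commands above.

```shell
# Hypothetical helper: succeed only if none of the given CRs still carries
# the kraft-migration-cr-lock marker. Arguments: namespace, then one or
# more kind/name pairs such as kafka/kafka.
locks_released() {
  local ns="$1" kubectl="${KUBECTL:-kubectl}" rc=0 ref kind name
  shift
  for ref in "$@"; do
    kind=${ref%%/*}
    name=${ref#*/}
    if "$kubectl" get "$kind" "$name" -n "$ns" -o yaml | grep -q 'kraft-migration-cr-lock'; then
      echo "lock still present on ${ref}" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, locks_released <namespace> kafka/<kafka-name> kraftcontroller/<kraftcontroller-name> exits with status 0 once both locks are gone.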

Step 6.2: Remove the Log4j1 annotation

  1. If using CFK 3.0 or later, remove the platform.confluent.io/use-log4j1 annotation:

    kubectl annotate kraftcontroller <kraftcontroller-name> \
      platform.confluent.io/use-log4j1- \
      -n <namespace>
    
  2. Verify the annotation is removed:

    kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> \
      -o jsonpath='{.metadata.annotations.platform\.confluent\.io/use-log4j1}'
    

    Expected:

    # no output
    

Note

Removing the annotation triggers a KRaftController pod roll to apply the Log4j 2 configuration, which is normal and safe after the migration completes. Skip this step if you are using a CFK 2.x version.

Step 6.3: Validate KRaft-only operation

Before deleting ZooKeeper, validate Kafka operates correctly in KRaft-only mode.

  1. Verify Kafka has no ZooKeeper dependency:

    kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.spec.dependencies}'
    

    Expected: The output lists only the kRaftController dependency, with no zookeeper entry. For example:

    {"kRaftController":{"clusterRef":{"name":"kraftcontroller","namespace":"<namespace>"},"controllerListener":{}}}
    
  2. Verify Kafka is running without ZooKeeper errors:

    kubectl get kafka <kafka-name> -n <namespace>
    kubectl logs <kafka-pod-name> -n <namespace> --since=24h | grep -iE "zookeeper.*(error|failed|disconnect|timeout|expired)" | grep -v " = "
    

    Expected: The Kafka status shows RUNNING and the log search returns no ZooKeeper connection errors.

    # kubectl get kafka
    NAME    REPLICAS   READY   STATUS    AGE
    kafka   1          1       RUNNING   3h
    
    # log search
    (no output)
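The log search in Step 6.3 covers one pod at a time. The following minimal sketch applies the same filter to every pod labeled app=kafka. The function names and the KUBECTL override variable are hypothetical. The second grep appears to exclude configuration dump lines of the form key = value, such as zookeeper.session.timeout.ms = 18000, which would otherwise match on the word timeout. That reading of the filter is an assumption.

```shell
# Filter from Step 6.3: keep ZooKeeper error lines and drop lines that
# contain " = " (assumed to be "key = value" configuration dumps).
zk_error_lines() {
  grep -iE "zookeeper.*(error|failed|disconnect|timeout|expired)" | grep -v " = "
}

# Hypothetical wrapper: scan the last 24 hours of logs of every pod labeled
# app=kafka. Prints matching lines and returns 1 if any pod has a match.
scan_kafka_pods_for_zk_errors() {
  local ns="$1" kubectl="${KUBECTL:-kubectl}" pod found=0
  for pod in $("$kubectl" get pods -n "$ns" -l app=kafka -o name); do
    if "$kubectl" logs "$pod" -n "$ns" --since=24h | zk_error_lines; then
      echo "ZooKeeper errors found in ${pod}" >&2
      found=1
    fi
  done
  return "$found"
}
```

Run scan_kafka_pods_for_zk_errors with your namespace. Exit status 0 with no output corresponds to the expected result above for all brokers.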
    

Step 6.4: Delete ZooKeeper cluster

After Kafka operates correctly in KRaft-only mode, delete the ZooKeeper cluster to free its resources.

Warning

Delete ZooKeeper only after confirming:

  • Kafka has been stable in KRaft-only mode.

  • All validation tests pass.

  • No other Kafka clusters use this ZooKeeper.

  • You have backups of ZooKeeper data if needed.

  1. Verify that no other Kafka CR depends on the ZooKeeper instance you are about to delete. The following command searches every Kafka CR in the cluster and prints any zookeeper dependency block it finds, including the ZooKeeper name and namespace:

    kubectl get kafka --all-namespaces -o yaml | grep -A 5 "zookeeper:"
    

    Expected:

    # no output
    
  2. Delete the ZooKeeper cluster:

    kubectl delete zookeeper <zookeeper-name> -n <namespace>
    

    Expected:

    zookeeper.platform.confluent.io "<zookeeper-name>" deleted
    
  3. Watch the ZooKeeper pods terminate:

    kubectl get pods -n <namespace> -l app=zookeeper -w
    

    Expected: The ZooKeeper pods move to Terminating and then no longer appear.

    NAME          READY   STATUS        AGE
    zookeeper-0   1/1     Terminating   5d
    zookeeper-1   1/1     Terminating   5d
    zookeeper-2   1/1     Terminating   5d
    
  4. Verify Kafka remains operational:

    kubectl get kafka <kafka-name> -n <namespace>
    

    Expected:

    NAME    REPLICAS   READY   STATUS    AGE
    kafka   1          1       RUNNING   3h
    
  5. Clean up the ZooKeeper Persistent Volume Claims (PVCs).

    Warning

    This action permanently deletes ZooKeeper data. Delete PVCs only if you do not need the data.

    Deleting the ZooKeeper CR may already remove its PVCs, depending on your CFK version and volume retention settings. Check for existing ZooKeeper PVCs:

    kubectl get pvc -n <namespace> | grep zookeeper
    

    If the command returns no PVCs, skip this step. Otherwise, delete each PVC it returns, using the exact names from the output:

    kubectl delete pvc data0-zookeeper-0 data0-zookeeper-1 data0-zookeeper-2 -n <namespace>
    

    Expected:

    persistentvolumeclaim "data0-zookeeper-0" deleted
    persistentvolumeclaim "data0-zookeeper-1" deleted
    persistentvolumeclaim "data0-zookeeper-2" deleted
    

Step 6.5: Clean up migration resources

After migration, delete the KRaftMigrationJob CR and any migration-specific ConfigMaps or Secrets. Cleanup is optional.

  1. Before deleting the job, you can save the final status for records:

    kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -o yaml > kmj-final-status.yaml
    
  2. Delete the migration job:

    kubectl delete kraftmigrationjob <migration-job-name> -n <namespace>
    

    Expected:

    kraftmigrationjob.platform.confluent.io "<migration-job-name>" deleted
    
  3. Check for any migration-specific ConfigMaps or Secrets:

    kubectl get configmaps -n <namespace> | grep migration
    kubectl get secrets -n <namespace> | grep migration
    
  4. Delete any resources returned by the previous commands:

    kubectl delete configmap <configmap-name> -n <namespace>
    kubectl delete secret <secret-name> -n <namespace>
    

Step 6.6: Download updated CRs (optional)

Download the updated CRs for backup or GitOps repository updates:

# Kafka CR
kubectl get kafka <kafka-name> -n <namespace> -o yaml > kafka-kraft-mode.yaml

# KRaftController CR
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> -o yaml > kraftcontroller.yaml

# Optional: ZooKeeper CR backup. Run this before you delete ZooKeeper in Step 6.4;
# the command fails after the ZooKeeper CR is deleted.
kubectl get zookeeper <zookeeper-name> -n <namespace> -o yaml > zookeeper-backup.yaml