Migrate from ZooKeeper to KRaft on Confluent Platform

Migrating from ZooKeeper to KRaft means moving existing metadata from Kafka brokers that use ZooKeeper to store metadata to brokers that use a KRaft quorum controller to store metadata in Apache Kafka®.

This topic walks you through how to perform the migration. To learn more about KRaft, see KRaft Overview for Confluent Platform.

For deployments managed by Confluent for Kubernetes or Confluent Ansible, you can leverage these tools to migrate your deployments to KRaft. For instructions, refer to Migrate Confluent Platform from ZooKeeper to KRaft using Confluent for Kubernetes and Migrate Confluent Platform from ZooKeeper to KRaft using Ansible Playbooks.

Before you begin: Migration checklist

This checklist guides you through the migration from ZooKeeper to KRaft on Confluent Platform. Complete the steps in order; they help ensure the safety of your data.

Important

  • ZooKeeper has been removed from Confluent Platform version 8.0 and later, so you must migrate to KRaft before you upgrade to Confluent Platform version 8.0 or later. These steps assume that you are currently running Confluent Platform version 7.9 or earlier.

  • Migration of production clusters is generally available in Confluent Platform version 7.6.1; however, it is recommended that you migrate on Confluent Platform version 7.7.0 or later.

  • Do not attempt a production migration on earlier Confluent Platform versions, or on Apache Kafka version 3.6.1 or earlier.

Phases of migration

There are several phases of migration:

  • Phase 1: The initial phase when brokers are using ZooKeeper to manage metadata and a ZooKeeper controller is running.

  • Phase 2: You provision and start KRaft controllers.

  • Phase 3: A hybrid phase where a KRaft controller that contains the cluster metadata is running, and you configure the brokers for migration. In this phase, you load broker metadata from ZooKeeper into the KRaft controller, so some brokers might run in KRaft mode while others run in ZooKeeper mode.

  • Phase 4: A dual write phase where all brokers are moved to KRaft, but the KRaft controller also writes the metadata to ZooKeeper.

  • Phase 5: A final phase when brokers are writing metadata only to KRaft and the metadata is not copied to ZooKeeper.

Migration tool

Before and during the migration, you can use the kafka-migration-check utility to check your cluster configuration and the status of the migration. This utility is included in Confluent Platform 7.9.2 or later and is in the bin directory of your Confluent Platform installation.

For example, before you start migration, you can run the preflight check to identify any issues with your configuration.

kafka-migration-check preflight-check --controller-config ./etc/kafka/kraft/controller.properties

For detailed instructions on how to run this tool, see Check Clusters for KRaft Migration. Fix any identified issues before proceeding.

Validate ZooKeeper ACLs before migration

Before you begin the migration from ZooKeeper to KRaft, it is essential that you manually validate all your ZooKeeper ACLs to ensure they are correctly formatted. The migration tools currently assume that all ACLs in ZooKeeper are valid. However, malformed ACLs, which can result from user-generated scripts or manual configuration, are migrated to KRaft. These malformed ACLs can cause KRaft controllers to fail during startup, which may lead to cluster downtime.

To prevent this, you must review and correct any malformed ACLs before starting the migration process. For more information, see ACL concepts.
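ZooKeeper ACL entries follow a scheme:id:permissions shape (for example, world:anyone:cdrwa or sasl:kafka:cdrwa). As a rough aid while reviewing getAcl output, a pattern check along the following lines can flag entries that do not fit that shape. This is a hypothetical helper, not part of the Kafka or Confluent tooling, and it only checks overall shape, not whether the id itself is valid:

```shell
# Rough shape check for ZooKeeper ACL strings (scheme:id:perms).
# Hypothetical helper for reviewing `getAcl` output; not part of any Kafka tool.
is_valid_acl() {
  # Built-in schemes; perms must be a non-empty subset of c, d, r, w, a.
  echo "$1" | grep -Eq '^(world|auth|digest|ip|sasl|x509):.*:[cdrwa]+$'
}

for acl in "world:anyone:cdrwa" "sasl:kafka:cdrwa" "digest:alice"; do
  if is_valid_acl "$acl"; then
    echo "OK        $acl"
  else
    echo "MALFORMED $acl"
  fi
done
```

In this sample run, the first two entries pass the shape check and the truncated digest entry is flagged; treat any flagged entry as a candidate for manual review, not as definitively broken.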

Phase 1: Cluster is running in ZooKeeper mode

Note

If your brokers were ever configured with zookeeper.set.acl=true (even if it is now set to false), ZooKeeper ACLs might still be present. If ACLs exist, the new KRaft controllers must have the same ZooKeeper permissions as your brokers. To ensure this, the Client section in the controllers’ jaas.conf file must match the jaas.conf file used by your brokers, or the migration will fail in phase 2.

As an alternative, to avoid potential ACL conflicts, you can disable ZooKeeper security before starting the migration. To do this, add skipACL=yes to the zookeeper.properties file on all ZooKeeper instances and perform a rolling restart of the ZooKeeper ensemble.
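For reference, the Client section in question typically has the following shape. This sketch assumes ZooKeeper DIGEST-MD5 authentication, and the username and password are placeholders; copy the actual Client section from your brokers' jaas.conf rather than using these values:

```
Client {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    username="kafka"
    password="kafka-secret";
};
```

The same section must appear in the jaas.conf used by each KRaft controller so that the controllers receive the same ZooKeeper permissions as the brokers.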

Before you begin the migration, the following preparation and validation checklist will help ensure a successful migration:

  • Upgrade to the latest Confluent Platform version that supports ZooKeeper (version 7.9.x; Confluent Platform 8.0 and later remove ZooKeeper) before you begin the migration. For more information, see ZooKeeper Topic Guide.

  • Verify that every node in your cluster, including the new KRaft controllers and the existing ZooKeeper-mode brokers, runs exactly the same version of Confluent Platform. Mixed versions can cause serious errors.

  • Confirm all brokers are healthy, stable, and operational.

  • Verify your current inter.broker.protocol.version. You will be instructed to update this in phase 3.

  • Run the pre-flight check utility using kafka-migration-check preflight-check --controller-config and fix all issues identified before proceeding.

  • Validate all ZooKeeper ACLs manually and correct any malformed entries. The pre-flight check helps find these, but manual validation is also recommended. For more information, see Validate ZooKeeper ACLs before migration.

  • Check log directories. If you are using multiple log.dirs on ZooKeeper brokers, ensure that no directories have failed. During migration, if a ZooKeeper broker runs with multiple log directories, any directory failure causes the broker to shut down. Brokers with broken log directories cannot be migrated to KRaft until the directories are repaired. For more information, see KAFKA-16431.

  • Review the KRaft Limitations and Known Issues. Do not migrate if you are using any unsupported features.

  • Review KRaft security requirements, especially for SASL/SCRAM. For configuring SASL/SCRAM for broker-to-broker communication, see KRaft-based Confluent Platform clusters. For general security information for KRaft, see KRaft Security in Confluent Platform.

  • To help with debugging, enable TRACE level logging for metadata migration. Add the following line to the log4j.properties file found in the CONFLUENT_HOME/etc/kafka/ directory:

    log4j.logger.org.apache.kafka.metadata.migration=TRACE
    

Important

You can roll back at earlier phases, but you cannot revert the cluster to ZooKeeper after you finalize the migration. Review the Reverting to ZooKeeper mode section to understand the process.

Phase 2: Start KRaft controllers

This phase involves configuring and starting one or more KRaft controllers. Your KRaft controllers should be the same Confluent Platform version as your existing brokers that you plan to migrate.

  1. Get the cluster ID. Before you start a KRaft controller, you must format storage for your Kafka cluster with the ID of the existing cluster.

    You can get this ID using the zookeeper-shell tool:

    1. Run ./bin/zookeeper-shell <zk_host:port> to connect to ZooKeeper.

    2. Execute get /cluster/id and copy the id value.

    For example:

    ./bin/zookeeper-shell localhost:2181
    
    Connecting to localhost:2181
    Welcome to ZooKeeper!
    
    get /cluster/id
    
    {"version":"1","id":"WZEKwK-bS62oT3ZOSU0dgw"}
    
  2. Configure each KRaft controller.

    Deploy a set of KRaft controllers that will take over from ZooKeeper. In each controller’s controller.properties file, configure the following properties:

    • process.roles=controller - Set this to indicate the node runs as a controller.

    • node.id=<unique_id> - Assign a unique numeric ID for each controller. This ID must differ from any existing broker broker.id values and be unique across all controllers in the quorum.

    • controller.quorum.voters=<voter_string> - Specify all controller nodes in the quorum using the format <node.id>@<host>:<port>, for example 3000@host1:9093,3001@host2:9093. Include all controllers, including this one, separated by commas.

    • controller.listener.names=CONTROLLER - Define the listener name for the controller.

    • listeners=CONTROLLER://:9093 - Configure the listener address and port for the controller, using port 9093 or your chosen port. Use the same port format as specified in controller.quorum.voters.

    • zookeeper.metadata.migration.enable=true - Enable migration mode so the controller can read metadata from ZooKeeper.

    • zookeeper.connect=<zk_connection_string> - Specify the ZooKeeper connection string, for example, localhost:2181 or zk1:2181,zk2:2181,zk3:2181.

    • confluent.cluster.link.metadata.topic.enable=true - This property is used by Cluster Linking during the migration.

    • password.encoder.secret=<your_secret> - Set this property if you have Cluster Linking configured. Remove this property after migration is complete.

    • Copy over settings from your existing Kafka brokers, as described in Other properties, so that the controllers apply the same configuration as the brokers.

    Note

    The KRaft controller node.id values must be different from any existing ZooKeeper broker broker.id property. In KRaft mode, the brokers and controllers share the same node ID namespace.

    For more details on configuring KRaft controllers, see KRaft Configuration for Confluent Platform.

    The following example shows a controller.properties file for a controller listening on port 9093. This example does not list every property, but only properties that are required for migration:

    process.roles=controller
    node.id=3000
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093
    
    # Enable the migration
    zookeeper.metadata.migration.enable=true
    
    # ZooKeeper client configuration
    zookeeper.connect=localhost:2181
    
    # Enable migrations for cluster linking
    confluent.cluster.link.metadata.topic.enable=true
    
    # If you have cluster linking configured, you must have password.encoder.secret set for migration.
    # password.encoder.secret=<your_secret>
    
    # Other configuration entries ...
    
  3. Format storage on each controller node. On each controller node, run the following command:

    ./bin/kafka-storage format --config ./etc/kafka/controller.properties --cluster-id=<cluster_id_from_step_1>
    

    Replace <cluster_id_from_step_1> with the cluster ID you retrieved in step 1.

    Your output looks similar to the following:

    Formatting /tmp/kraft-controller-logs with metadata version 4.1
    
  4. Start each controller by running the following command on each controller node:

    ./bin/kafka-server-start ./etc/kafka/controller.properties
    
  5. Validate that the controller started correctly by checking the runtime status of the metadata quorum.

    Use the kafka-metadata-quorum tool with the --bootstrap-controller option to verify the controller is running. For example:

    ./bin/kafka-metadata-quorum --bootstrap-controller localhost:9093 describe --status
    

    The output should show the cluster ID, leader information, and current voters. For example:

    $ kafka-metadata-quorum --bootstrap-controller localhost:29092 describe --status
    
    ClusterId:              xNLMySgrTlO4_aE9Qo2gbw
    LeaderId:               6
    LeaderEpoch:            1
    HighWatermark:          276
    MaxFollowerLag:         0
    MaxFollowerLagTimeMs:   385
    CurrentVoters:          [{"id": 4, "directoryId": null, "endpoints": ["CONTROLLER://controller1:29092"]}, {"id": 5, "directoryId": null, "endpoints": ["CONTROLLER://controller2:29093"]}, {"id": 6, "directoryId": null, "endpoints": ["CONTROLLER://controller3:29094"]}]
    CurrentObservers:       []
    

    For more information, see Describe runtime status.

    If you want to revert the migration from this phase, see Rollback from phase 2: KRaft controllers provisioned.
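Before moving on, it can be worth confirming that no controller node.id collides with an existing broker.id, since brokers and controllers share one node ID namespace (see the note in step 2). A minimal shell sketch, with the ID lists as placeholders that you fill in from your own configuration:

```shell
# Hypothetical pre-check: brokers and controllers share one node ID namespace,
# so a controller node.id must not equal any existing broker.id.
id_collisions() {
  # $1: space-separated broker IDs, $2: space-separated controller IDs
  for cid in $2; do
    for bid in $1; do
      [ "$cid" = "$bid" ] && echo "collision: $cid"
    done
  done
  return 0
}

# Example: broker IDs 0 1 2 against proposed controller IDs 3000 3001 3002.
id_collisions "0 1 2" "3000 3001 3002"   # prints nothing: no collisions
```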

Phase 3: Migrate broker metadata from ZooKeeper to KRaft

After the KRaft controllers start, you will reconfigure each broker for KRaft migration and restart the broker. You can perform a rolling restart to help ensure cluster availability during the migration. Metadata migration automatically starts when all of the brokers have been restarted.

If your cluster uses SCRAM authentication, your credentials are automatically migrated from ZooKeeper to the KRaft controller nodes during this phase. Your existing clients continue to work without interruption.

  1. Perform a rolling restart of brokers, one at a time:

  2. On each broker, modify server.properties with the following properties:

    • inter.broker.protocol.version=4.1 - Set the inter-broker protocol version (or the version specified for your Confluent Platform).

    • zookeeper.metadata.migration.enable=true - Enable migration mode so the broker can participate in metadata migration.

    • zookeeper.connect=<zk_connection_string> (should already be present) - The ZooKeeper connection string. This property should already be configured.

    • controller.quorum.voters=<voter_string> (same as controllers) - Specify the same controller quorum voters string as configured in phase 2.

    • controller.listener.names=CONTROLLER - Define the listener name for the controller.

    • Add CONTROLLER to listener.security.protocol.map (for example, ...CONTROLLER:PLAINTEXT) - Add the CONTROLLER listener to the security protocol map with the appropriate security protocol.

    • confluent.cluster.link.metadata.topic.enable=true - This property is used by Cluster Linking during the migration.

  3. Restart the broker.

  4. Wait for the broker to rejoin the cluster before proceeding to the next one.

    The following example shows a configuration file for a broker that is ready for the KRaft migration.

    # Sample ZK broker server.properties listening on 9092
    broker.id=0
    listeners=PLAINTEXT://:9092
    advertised.listeners=PLAINTEXT://localhost:9092
    listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
    
    # Set the IBP
    inter.broker.protocol.version=4.1
    
    # Enable the migration
    zookeeper.metadata.migration.enable=true
    
    # Cluster linking metadata topic enabled
    confluent.cluster.link.metadata.topic.enable=true
    
    # ZooKeeper client configuration
    zookeeper.connect=localhost:2181
    
    # KRaft controller quorum configuration
    # For a single controller: controller.quorum.voters=3000@localhost:9093
    # For multiple controllers, list all controller nodes separated by commas (single property, comma-separated values):
    # controller.quorum.voters=3000@localhost:9093,4000@localhost:9094,5000@localhost:9095
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    

    Start each broker with the modified configuration file. When all of the brokers that were using ZooKeeper for metadata management have been restarted with the migration properties set, the migration begins automatically. A broker does not start if migration is not configured correctly in its properties file; if a broker fails to start, check server.log for the root cause.

  5. Monitor for completion by checking the active KRaft controller log at INFO level for the following message after you restart all brokers:

    Completed migration of metadata from ZooKeeper to KRaft.
    
  6. (Optional) Use zookeeper-shell to run get /controller and confirm that the znode includes a kraftControllerEpoch entry, which indicates that a KRaft controller now holds controller leadership.

    If you want to revert the migration from this phase, see Rollback from phase 3: Brokers in hybrid mode.
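If you are watching many restarts, the completion check in step 5 can be scripted. A small sketch; the helper name is hypothetical, and the controller log path depends on your log4j configuration:

```shell
# Hypothetical helper: report whether the migration completion message has
# appeared in a given controller log file.
migration_done() {
  grep -q "Completed migration of metadata from ZooKeeper to KRaft" "$1"
}

# Usage (log path is an assumption; adjust for your deployment):
# migration_done /var/log/kafka/controller.log && echo "migration complete"
```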

Phase 4: Migrate the brokers to use the KRaft controller

At this point, the metadata migration is complete, but the Kafka brokers are still running in ZooKeeper mode. The KRaft controller is running in migration mode, and it sends remote procedure calls (RPCs) such as UpdateMetadata and LeaderAndIsr to the ZooKeeper-mode brokers.

After you check your migration with the migration tool and fix any issues, migrate the brokers to KRaft by reconfiguring them as KRaft brokers and restarting them.

  1. Perform a rolling restart of brokers, one at a time.

  2. On each broker, modify server.properties with the following properties:

    • process.roles=broker - Set this to indicate the node runs as a broker.

    • node.id=<broker_id> (use the same ID number as the old broker.id) - Replace broker.id with node.id using the same numeric value.

    • Remove or comment out the following properties:

      • broker.id - This property is replaced by node.id in KRaft mode.

      • inter.broker.protocol.version

      • zookeeper.metadata.migration.enable=true

      • zookeeper.connect

      • confluent.cluster.link.metadata.topic.enable=true (if set)

    • Do not change these properties from phase 3: controller.quorum.voters and controller.listener.names.

    • Change authorizer.class.name to org.apache.kafka.metadata.authorizer.StandardAuthorizer - If using ACLs, update the authorizer class for KRaft mode. For more information, see ACL concepts.

    The following example shows a server.properties file for a migrated broker. Note that ZooKeeper-specific properties are commented out.

    process.roles=broker
    node.id=0
    listeners=PLAINTEXT://:9092
    advertised.listeners=PLAINTEXT://localhost:9092
    listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
    
    # IBP
    # inter.broker.protocol.version=4.1
    
    # Remove the migration enabled flag
    # zookeeper.metadata.migration.enable=true
    
    # Remove the cluster linking metadata topic setting
    # confluent.cluster.link.metadata.topic.enable=true
    
    # Remove ZooKeeper client configuration
    # zookeeper.connect=localhost:2181
    
    # Keep the KRaft controller quorum configuration
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    
    # If using ACLs, change the authorizer from AclAuthorizer used for ZooKeeper to the StandardAuthorizer used for KRaft.
    authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
    
  3. Restart the broker.

  4. Wait for the broker to rejoin the cluster before proceeding to the next one.

    All of the brokers are running in KRaft mode after this step.

  5. Run the cluster in this dual-write mode for an extended period (for example, one to two weeks) to confirm stability before finalizing. Keeping ZooKeeper running during this window preserves your ability to roll back if issues appear.

    If you want to revert the migration from this phase, see Rollback from phase 4: Brokers migrated to KRaft.

Phase 5: Take KRaft controllers out of migration mode

Important

After you complete this phase, you cannot revert to ZooKeeper mode. Before you proceed, ensure you have validated the cluster in KRaft mode for an extended period (recommended: 1-2 weeks).

In this phase, you remove the migration entries, such as zookeeper.metadata.migration.enable, to take the controllers out of migration mode, and remove the ZooKeeper configuration entry.

  1. Perform a rolling restart of controllers, one at a time.

  2. On each controller, before you restart it, modify the controller.properties file to remove or comment out the following properties:

    • zookeeper.metadata.migration.enable=true - Migration is complete.

    • zookeeper.connect - ZooKeeper connection is no longer needed.

    • confluent.cluster.link.metadata.topic.enable=true (if set) - This property was set in phase 2 if you have Cluster Linking configured.

    The following example shows a controller.properties file for a controller migrated to KRaft mode and listening on port 9093. Note that the ZooKeeper-specific properties are commented out.

    process.roles=controller
    node.id=3000
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093
    
    # Disable migration
    # zookeeper.metadata.migration.enable=true
    
    # Remove the cluster linking metadata topic setting
    # confluent.cluster.link.metadata.topic.enable=true
    
    # Remove ZooKeeper client configuration
    # zookeeper.connect=localhost:2181
    
  3. Restart the controller.

  4. Wait for the controller to rejoin the quorum before proceeding to the next one.

Migration is now complete. At this point, you can safely shut down the ZooKeeper cluster. Reverting is no longer possible.

Reverting to ZooKeeper mode

If the cluster is still in migration mode, you can revert it to ZooKeeper mode.

The revert process depends on how far the migration has progressed. To revert successfully, complete each step fully and in the correct order. If you did not fully complete a phase, back out those changes and use the specified revert steps.

Important

You cannot revert to ZooKeeper mode after Phase 5 (migration finalized). If you have finalized the migration, rollback is not possible.

Rollback from phase 2: KRaft controllers provisioned

This is the simplest rollback.

  • Deprovision controllers by shutting down and deleting the KRaft controller nodes.

Rollback from phase 3: Brokers in hybrid mode

  1. Deprovision controllers by shutting down and deleting the KRaft controller nodes.

  2. Clear ZooKeeper state by connecting to ZooKeeper using ./bin/zookeeper-shell <zk_host:port>, then:

    • Run deleteall /controller to allow a broker to become the new ZooKeeper controller.

    • Run get /migration to check the migration state.

    • Run delete /migration to clear the migration state from ZooKeeper.

    These steps help ensure that future migrations are not disrupted.

    Note

    You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, ignore any errors in the broker logs about failing to connect to the KRaft controller. Those errors should stop after the rolling restart back to ZooKeeper mode.

  3. Clean metadata logs on all brokers by removing the __cluster_metadata-0 directory from each of Kafka's log directories.

  4. Perform a rolling restart of brokers, one at a time:

    1. In the server.properties, remove or comment out the following properties:

      • zookeeper.metadata.migration.enable

      • controller.listener.names

      • controller.quorum.voters

      • confluent.cluster.link.metadata.topic.enable (if set)

      • inter.broker.protocol.version (or set back to original)

    2. Restart the broker.

    3. Wait for the broker to rejoin the cluster before proceeding.

  5. Verify that the cluster is healthy and operating correctly in ZooKeeper mode.

Rollback from phase 4: Brokers migrated to KRaft

This is a multi-stage rollback.

Step 1: First rolling restart (brokers back to hybrid)

For each broker, one at a time:

  1. Modify server.properties with the following properties:

    • Remove or comment out the process.roles property.

    • Replace node.id=<id> with broker.id=<id>, keeping the same numeric value.

    • Restore zookeeper.connect=<zk_connection_string> - Restore the ZooKeeper connection string to its previous value.

    • Restore any other previous ZooKeeper-specific configs (for example, zookeeper.ssl.protocol) - If your cluster requires other ZooKeeper configurations for brokers, re-add those configurations.

    Important

    Keep zookeeper.metadata.migration.enable=true for this restart. This property must remain enabled during the first rolling restart to allow a proper rollback.

  2. Restart the broker.

  3. Wait for the broker to rejoin the cluster before proceeding.

Step 2: Deprovision controllers and clean ZooKeeper

  1. Deprovision controllers by shutting down and deleting the KRaft controller nodes.

  2. Clear ZooKeeper state by connecting to ZooKeeper using ./bin/zookeeper-shell <zk_host:port>, then:

    • Run deleteall /controller to allow a broker to become the new ZooKeeper controller.

    • Run get /migration to check the migration state.

    • Run delete /migration to clear the migration state from ZooKeeper.

    These steps help ensure that future migrations are not disrupted.

    Note

    You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, you can ignore errors in the broker logs about failing to connect to the KRaft controller. Those errors should stop after the second rolling restart to ZooKeeper-only mode.

Step 3: Clean metadata logs

  • On all brokers, remove the __cluster_metadata-0 directory from each of Kafka's log directories.
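A sketch of that cleanup step; the helper name is hypothetical, and the directory argument comes from each broker's log.dirs setting:

```shell
# Hypothetical cleanup helper: remove the KRaft metadata log directory
# (__cluster_metadata-0) from one Kafka log directory.
clean_metadata_log() {
  rm -rf "$1/__cluster_metadata-0"
}

# Usage for a broker with log.dirs=/var/lib/kafka/data (placeholder path):
# clean_metadata_log /var/lib/kafka/data
```

If a broker has multiple log.dirs, run the helper once per directory.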

Step 4: Perform second rolling restart (brokers back to ZooKeeper-only)

For each broker, one at a time:

  1. Modify server.properties with the following properties:

    • Remove or comment out the following properties:

      • zookeeper.metadata.migration.enable - This property is no longer needed once you revert the migration.

      • controller.listener.names - Controller listener configuration is not needed in ZooKeeper mode.

      • controller.quorum.voters - Controller quorum configuration is not needed in ZooKeeper mode.

      • confluent.cluster.link.metadata.topic.enable (if set) - Remove this property if you set it during migration.

    • (If using ACLs) Change authorizer.class.name back to the ZooKeeper authorizer (for example, kafka.security.authorizer.AclAuthorizer) - Restore the authorizer class that you used before migration.

    • Restore inter.broker.protocol.version to the original value you were using before the migration.

  2. Restart the broker.

  3. Wait for the broker to rejoin the cluster before proceeding.

  4. After the last broker rejoins the cluster, the rollback is complete.

Rollback from phase 5: Migration finalized

  • You cannot revert to ZooKeeper mode after you have finalized the phase 5 migration.