Migrate from ZooKeeper to KRaft on Confluent Platform

Migrating from ZooKeeper to KRaft means moving existing metadata from Kafka brokers that use ZooKeeper to store metadata to brokers that use a KRaft quorum controller to store metadata in Apache Kafka®.

This topic walks you through how to perform the migration. To learn more about KRaft, see KRaft Overview for Confluent Platform.

For deployments managed by Confluent for Kubernetes or Confluent Ansible, you can leverage these tools to migrate your deployments to KRaft. For instructions, refer to Migrate Confluent Platform from ZooKeeper to KRaft using Confluent for Kubernetes and Migrate Confluent Platform from ZooKeeper to KRaft using Ansible Playbooks.

Before you begin: Migration checklist

This checklist guides you through the migration from ZooKeeper to KRaft on Confluent Platform. Complete the steps in order; they help ensure the safety of your data.

Important

  • ZooKeeper has been removed from Confluent Platform version 8.0 and later, so you must migrate to KRaft before you upgrade to Confluent Platform version 8.0 or later. These steps assume that you are currently running Confluent Platform version 7.9 or earlier.

  • Migration of production clusters is generally available in Confluent Platform version 7.6.1; however, it is recommended that you migrate on Confluent Platform version 7.7.0 or later.

  • Do not attempt a production migration on earlier Confluent Platform versions, or on Apache Kafka version 3.6.1 or earlier.

Phases of migration

There are several phases of migration:

  • Phase 1: The initial phase when brokers are using ZooKeeper to manage metadata and a ZooKeeper controller is running.

  • Phase 2: You provision and start KRaft controllers.

  • Phase 3: A hybrid phase where a KRaft controller that contains the cluster metadata is running, and you configure the brokers for migration. In this phase, you load broker metadata from ZooKeeper into the KRaft controller, so some brokers might run in KRaft mode while others run in ZooKeeper mode.

  • Phase 4: A dual write phase where all brokers are moved to KRaft, but the KRaft controller also writes the metadata to ZooKeeper.

  • Phase 5: A final phase when brokers are writing metadata only to KRaft and the metadata is not copied to ZooKeeper.

Migration tool

Before and during the migration, you can use the kafka-migration-check utility to check your cluster configuration and the status of the migration. This utility is included in Confluent Platform 7.9.2 or later and is in the bin directory of your Confluent Platform installation.

For example, before you start migration, you can run the preflight check to identify any issues with your configuration.

kafka-migration-check preflight-check --controller-config ./etc/kafka/kraft/controller.properties

For detailed instructions on how to run this tool, see Check Clusters for KRaft Migration. Fix any identified issues before proceeding.

Validate ZooKeeper ACLs before migration

Before you begin the migration from ZooKeeper to KRaft, it is essential that you manually validate all your ZooKeeper ACLs to ensure they are correctly formatted. The migration tools currently assume that all ACLs in ZooKeeper are valid. However, malformed ACLs, which can result from user-generated scripts or manual configuration, are migrated to KRaft. These malformed ACLs can cause KRaft controllers to fail during startup, which may lead to cluster downtime.

To prevent this, you must review and correct any malformed ACLs before starting the migration process. For more information, see ACL concepts.
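ZooKeeper ACL entries follow a scheme:id:permissions shape (for example, world:anyone:cdrwa or sasl:kafka:cdrwa). As a rough aid while reviewing getAcl output, a pattern check along the following lines can flag entries that do not fit that shape. This is a hypothetical helper, not part of the Kafka or Confluent tooling, and it only checks overall shape, not whether the id itself is valid:

```shell
# Rough shape check for ZooKeeper ACL strings (scheme:id:perms).
# Hypothetical helper for reviewing `getAcl` output; not part of any Kafka tool.
is_valid_acl() {
  # Built-in schemes; perms must be a non-empty subset of c, d, r, w, a.
  echo "$1" | grep -Eq '^(world|auth|digest|ip|sasl|x509):.*:[cdrwa]+$'
}

for acl in "world:anyone:cdrwa" "sasl:kafka:cdrwa" "digest:alice"; do
  if is_valid_acl "$acl"; then
    echo "OK        $acl"
  else
    echo "MALFORMED $acl"
  fi
done
```

In this sample run, the first two entries pass the shape check and the truncated digest entry is flagged; treat any flagged entry as a candidate for manual review, not as definitively broken.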

Phase 1: Cluster is running in ZooKeeper mode

Note

If your brokers were ever configured with zookeeper.set.acl=true (even if it is now set to false), ZooKeeper ACLs might still be present. If ACLs exist, the new KRaft controllers must have the same ZooKeeper permissions as your brokers. To ensure this, the Client section in the controllers’ jaas.conf file must match the jaas.conf file used by your brokers, or the migration will fail in phase 2.

As an alternative, to avoid potential ACL conflicts, you can disable ZooKeeper security before starting the migration. To do this, add skipACL=yes to the zookeeper.properties file on all ZooKeeper instances and perform a rolling restart of the ZooKeeper ensemble.
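For reference, the Client section in question typically has the following shape. This sketch assumes ZooKeeper DIGEST-MD5 authentication, and the username and password are placeholders; copy the actual Client section from your brokers' jaas.conf rather than using these values:

```
Client {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    username="kafka"
    password="kafka-secret";
};
```

The same section must appear in the jaas.conf used by each KRaft controller so that the controllers receive the same ZooKeeper permissions as the brokers.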

Before you begin the migration, the following preparation and validation checklist will help ensure a successful migration:

  • Upgrade to the latest Confluent Platform version that supports ZooKeeper (version 7.9.x; Confluent Platform 8.0 and later remove ZooKeeper) before you begin the migration. For more information, see ZooKeeper Topic Guide.

  • Verify that every node in your cluster, including the new KRaft controllers and the existing ZooKeeper-mode brokers, runs exactly the same version of Confluent Platform. Mixed versions can cause serious errors.

  • Confirm all brokers are healthy, stable, and operational.

  • Verify your current inter.broker.protocol.version. You will be instructed to update this in phase 3.

  • Run the pre-flight check utility using kafka-migration-check preflight-check --controller-config and fix all issues identified before proceeding.

  • Validate all ZooKeeper ACLs manually and correct any malformed entries. The pre-flight check helps find these, but manual validation is also recommended. For more information, see Validate ZooKeeper ACLs before migration.

  • Check log directories. If you are using multiple log.dirs on ZooKeeper brokers, ensure that no directories have failed. During migration, if a ZooKeeper broker runs with multiple log directories, any directory failure causes the broker to shut down. Brokers with broken log directories cannot be migrated to KRaft until the directories are repaired. For more information, see KAFKA-16431.

  • Review the KRaft Limitations and Known Issues. Do not migrate if you are using any unsupported features.

  • Review KRaft security requirements, especially for SASL/SCRAM. For configuring SASL/SCRAM for broker-to-broker communication, see KRaft-based Confluent Platform clusters. For general security information for KRaft, see KRaft Security in Confluent Platform.

  • To help with debugging, enable TRACE level logging for metadata migration. Add the following line to the log4j.properties file found in the CONFLUENT_HOME/etc/kafka/ directory:

    log4j.logger.org.apache.kafka.metadata.migration=TRACE
    

Important

You can roll back at earlier phases, but you cannot revert the cluster to ZooKeeper after you finalize the migration. Review the Reverting to ZooKeeper mode section to understand the process.

Phase 2: Start KRaft controllers

This phase involves configuring and starting one or more KRaft controllers. Your KRaft controllers should be the same Confluent Platform version as your existing brokers that you plan to migrate.

  1. Get the cluster ID. Before you start a KRaft controller, you must format storage for your Kafka cluster with the ID of the existing cluster.

    You can get this ID using the zookeeper-shell tool:

    1. Run ./bin/zookeeper-shell <zk_host:port> to connect to ZooKeeper.

    2. Execute get /cluster/id and copy the id value.

    For example:

    ./bin/zookeeper-shell localhost:2181
    
    Connecting to localhost:2181
    Welcome to ZooKeeper!
    
    get /cluster/id
    
    {"version":"1","id":"WZEKwK-bS62oT3ZOSU0dgw"}
    
  2. Configure each KRaft controller.

    Deploy a set of KRaft controllers that will take over from ZooKeeper. In each controller’s controller.properties file, configure the following properties:

    • process.roles=controller - Set this to indicate the node runs as a controller.

    • node.id=<unique_id> - Assign a unique numeric ID for each controller. This ID must differ from any existing broker broker.id values and be unique across all controllers in the quorum.

    • controller.quorum.voters=<voter_string> - Specify all controller nodes in the quorum using the format <node.id>@<host>:<port>, for example 3000@host1:9093,3001@host2:9093. Include all controllers, including this one, separated by commas.

    • controller.listener.names=CONTROLLER - Define the listener name for the controller.

    • listeners=CONTROLLER://:9093 - Configure the listener address and port for the controller, using port 9093 or your chosen port. Use the same port format as specified in controller.quorum.voters.

    • zookeeper.metadata.migration.enable=true - Enable migration mode so the controller can read metadata from ZooKeeper.

    • zookeeper.connect=<zk_connection_string> - Specify the ZooKeeper connection string, for example, localhost:2181 or zk1:2181,zk2:2181,zk3:2181.

    • confluent.cluster.link.metadata.topic.enable=true - This property is used by Cluster Linking during the migration.

    • password.encoder.secret=<your_secret> - Set this property if you have Cluster Linking configured. Remove this property after migration is complete.

    • Copy over settings from your existing Kafka brokers, as described in Other properties, so that the controllers apply the same configuration as the brokers.

    Note

    The KRaft controller node.id values must be different from any existing ZooKeeper broker broker.id property. In KRaft mode, the brokers and controllers share the same node ID namespace.

    For more details on configuring KRaft controllers, see KRaft Configuration for Confluent Platform.

    The following example shows a controller.properties file for a controller listening on port 9093. This example does not list every property, but only properties that are required for migration:

    process.roles=controller
    node.id=3000
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093
    
    # Enable the migration
    zookeeper.metadata.migration.enable=true
    
    # ZooKeeper client configuration
    zookeeper.connect=localhost:2181
    
    # Enable migrations for cluster linking
    confluent.cluster.link.metadata.topic.enable=true
    
    # If you have cluster linking configured, you must have password.encoder.secret set for migration.
    # password.encoder.secret=<your_secret>
    
    # Other configuration entries ...
    
  3. Format storage on each controller node. On each controller node, run the following command:

    ./bin/kafka-storage format --config ./etc/kafka/controller.properties --cluster-id=<cluster_id_from_step_1>
    

    Replace <cluster_id_from_step_1> with the cluster ID you retrieved in step 1.

    Your output looks similar to the following:

    Formatting /tmp/kraft-controller-logs with metadata version 4.1
    
  4. Start each controller by running the following command on each controller node:

    ./bin/kafka-server-start ./etc/kafka/controller.properties
    
  5. Validate that the controller started correctly by checking the runtime status of the metadata quorum.

    Use the kafka-metadata-quorum tool with the --bootstrap-controller option to verify the controller is running. For example:

    ./bin/kafka-metadata-quorum --bootstrap-controller localhost:9093 describe --status
    

    The output should show the cluster ID, leader information, and current voters. For example:

    $ kafka-metadata-quorum --bootstrap-controller localhost:29092 describe --status
    
    ClusterId:              xNLMySgrTlO4_aE9Qo2gbw
    LeaderId:               6
    LeaderEpoch:            1
    HighWatermark:          276
    MaxFollowerLag:         0
    MaxFollowerLagTimeMs:   385
    CurrentVoters:          [{"id": 4, "directoryId": null, "endpoints": ["CONTROLLER://controller1:29092"]}, {"id": 5, "directoryId": null, "endpoints": ["CONTROLLER://controller2:29093"]}, {"id": 6, "directoryId": null, "endpoints": ["CONTROLLER://controller3:29094"]}]
    CurrentObservers:       []
    

    For more information, see Describe runtime status.

    If you want to revert the migration from this phase, see Rollback from phase 2: KRaft controllers provisioned.
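Before moving on, it can be worth confirming that no controller node.id collides with an existing broker.id, since brokers and controllers share one node ID namespace (see the note in step 2). A minimal shell sketch, with the ID lists as placeholders that you fill in from your own configuration:

```shell
# Hypothetical pre-check: brokers and controllers share one node ID namespace,
# so a controller node.id must not equal any existing broker.id.
id_collisions() {
  # $1: space-separated broker IDs, $2: space-separated controller IDs
  for cid in $2; do
    for bid in $1; do
      [ "$cid" = "$bid" ] && echo "collision: $cid"
    done
  done
  return 0
}

# Example: broker IDs 0 1 2 against proposed controller IDs 3000 3001 3002.
id_collisions "0 1 2" "3000 3001 3002"   # prints nothing: no collisions
```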

Phase 3: Migrate broker metadata from ZooKeeper to KRaft

After the KRaft controllers start, you will reconfigure each broker for KRaft migration and restart the broker. You can perform a rolling restart to help ensure cluster availability during the migration. Metadata migration automatically starts when all of the brokers have been restarted.

If your cluster uses SCRAM authentication, your credentials are automatically migrated from ZooKeeper to the KRaft controller nodes during this phase. Your existing clients continue to work without interruption.

  1. Perform a rolling restart of brokers, one at a time:

  2. On each broker, modify server.properties with the following properties:

    • inter.broker.protocol.version=4.1 - Set the inter-broker protocol version (or the version specified for your Confluent Platform).

    • zookeeper.metadata.migration.enable=true - Enable migration mode so the broker can participate in metadata migration.

    • zookeeper.connect=<zk_connection_string> (should already be present) - The ZooKeeper connection string. This property should already be configured.

    • controller.quorum.voters=<voter_string> (same as controllers) - Specify the same controller quorum voters string as configured in phase 2.

    • controller.listener.names=CONTROLLER - Define the listener name for the controller.

    • Add CONTROLLER to listener.security.protocol.map (for example, ...CONTROLLER:PLAINTEXT) - Add the CONTROLLER listener to the security protocol map with the appropriate security protocol.

    • confluent.cluster.link.metadata.topic.enable=true - This property is used by Cluster Linking during the migration.

  3. Restart the broker.

  4. Wait for the broker to rejoin the cluster before proceeding to the next one.

    The following example shows a configuration file for a broker that is ready for the KRaft migration.

    # Sample ZK broker server.properties listening on 9092
    broker.id=0
    listeners=PLAINTEXT://:9092
    advertised.listeners=PLAINTEXT://localhost:9092
    listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
    
    # Set the IBP
    inter.broker.protocol.version=4.1
    
    # Enable the migration
    zookeeper.metadata.migration.enable=true
    
    # Cluster linking metadata topic enabled
    confluent.cluster.link.metadata.topic.enable=true
    
    # ZooKeeper client configuration
    zookeeper.connect=localhost:2181
    
    # KRaft controller quorum configuration
    # For a single controller: controller.quorum.voters=3000@localhost:9093
    # For multiple controllers, list all controller nodes separated by commas (single property, comma-separated values):
    # controller.quorum.voters=3000@localhost:9093,4000@localhost:9094,5000@localhost:9095
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    

    Start each broker with the modified configuration file. When all of the brokers that were using ZooKeeper for metadata management have been restarted with the migration properties set, the migration begins automatically. A broker does not start if migration is not configured correctly in its properties file; if a broker fails to start, check server.log for the root cause.

  5. Monitor for completion by checking the active KRaft controller log at INFO level for the following message after you restart all brokers:

    Completed migration of metadata from ZooKeeper to KRaft.
    
  6. (Optional) Use zookeeper-shell to run get /controller and confirm that the znode includes a kraftControllerEpoch entry, which indicates that a KRaft controller now holds controller leadership.

    If you want to revert the migration from this phase, see Rollback from phase 3: Brokers in hybrid mode.
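If you are watching many restarts, the completion check in step 5 can be scripted. A small sketch; the helper name is hypothetical, and the controller log path depends on your log4j configuration:

```shell
# Hypothetical helper: report whether the migration completion message has
# appeared in a given controller log file.
migration_done() {
  grep -q "Completed migration of metadata from ZooKeeper to KRaft" "$1"
}

# Usage (log path is an assumption; adjust for your deployment):
# migration_done /var/log/kafka/controller.log && echo "migration complete"
```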

Phase 4: Migrate the brokers to use the KRaft controller

At this point, the metadata migration is complete, but the Kafka brokers are still running in ZooKeeper mode. The KRaft controller is running in migration mode, and it sends remote procedure calls (RPCs) such as UpdateMetadata and LeaderAndIsr to the ZooKeeper-mode brokers.

After you check your migration with the migration tool and fix any issues, migrate the brokers to KRaft by reconfiguring them as KRaft brokers and restarting them.

  1. Perform a rolling restart of brokers, one at a time.

  2. On each broker, modify server.properties with the following properties:

    • process.roles=broker - Set this to indicate the node runs as a broker.

    • node.id=<broker_id> (use the same ID number as the old broker.id) - Replace broker.id with node.id using the same numeric value.

    • Remove or comment out the following properties:

      • broker.id - This property is replaced by node.id in KRaft mode.

      • inter.broker.protocol.version

      • zookeeper.metadata.migration.enable=true

      • zookeeper.connect

      • confluent.cluster.link.metadata.topic.enable=true (if set)

    • Do not change these properties from phase 3: controller.quorum.voters and controller.listener.names.

    • Change authorizer.class.name to org.apache.kafka.metadata.authorizer.StandardAuthorizer - If using ACLs, update the authorizer class for KRaft mode. For more information, see ACL concepts.

    The following example shows a server.properties file for a migrated broker. Note that ZooKeeper-specific properties are commented out.

    process.roles=broker
    node.id=0
    listeners=PLAINTEXT://:9092
    advertised.listeners=PLAINTEXT://localhost:9092
    listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
    
    # IBP
    # inter.broker.protocol.version=4.1
    
    # Remove the migration enabled flag
    # zookeeper.metadata.migration.enable=true
    
    # Remove the cluster linking metadata topic setting
    # confluent.cluster.link.metadata.topic.enable=true
    
    # Remove ZooKeeper client configuration
    # zookeeper.connect=localhost:2181
    
    # Keep the KRaft controller quorum configuration
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    
    # If using ACLs, change the authorizer from AclAuthorizer used for ZooKeeper to the StandardAuthorizer used for KRaft.
    authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
    
  3. Restart the broker.

  4. Wait for the broker to rejoin the cluster before proceeding to the next one.

    All of the brokers are running in KRaft mode after this step.

  5. Run the cluster in this dual-write mode for an extended period (for example, one to two weeks) to confirm stability before finalizing. Keeping ZooKeeper running during this window preserves your ability to roll back if issues appear.

    If you want to revert the migration from this phase, see Rollback from phase 4: Brokers migrated to KRaft.

Phase 5: Take KRaft controllers out of migration mode

Important

After you complete this phase, you cannot revert to ZooKeeper mode. Before you proceed, ensure you have validated the cluster in KRaft mode for an extended period (recommended: 1-2 weeks).

In this phase, you remove the migration entries, such as zookeeper.metadata.migration.enable, to take the controllers out of migration mode, and remove the ZooKeeper configuration entry.

  1. Perform a rolling restart of controllers, one at a time.

  2. On each controller, before you restart it, modify the controller.properties file to remove or comment out the following properties:

    • zookeeper.metadata.migration.enable=true - Migration is complete.

    • zookeeper.connect - ZooKeeper connection is no longer needed.

    • confluent.cluster.link.metadata.topic.enable=true (if set) - This property was set in phase 2 if you have Cluster Linking configured.

    The following example shows a controller.properties file for a controller migrated to KRaft mode and listening on port 9093. Note that the ZooKeeper-specific properties are commented out.

    process.roles=controller
    node.id=3000
    controller.quorum.voters=3000@localhost:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093
    
    # Disable migration
    # zookeeper.metadata.migration.enable=true
    
    # Remove the cluster linking metadata topic setting
    # confluent.cluster.link.metadata.topic.enable=true
    
    # Remove ZooKeeper client configuration
    # zookeeper.connect=localhost:2181
    
  3. Restart the controller.

  4. Wait for the controller to rejoin the quorum before proceeding to the next one.

Migration is now complete. At this point, you can safely shut down the ZooKeeper cluster. Reverting is no longer possible.

Reverting to ZooKeeper mode

If the cluster is still in migration mode, you can revert it to ZooKeeper mode.

The revert process depends on how far the migration has progressed. To revert successfully, complete each step fully and in the correct order. If you did not fully complete a phase, back out those changes and use the specified revert steps.

Important

You cannot revert to ZooKeeper mode after Phase 5 (migration finalized). If you have finalized the migration, rollback is not possible.

Rollback from phase 2: KRaft controllers provisioned

This is the simplest rollback.

  • Deprovision controllers by shutting down and deleting the KRaft controller nodes.

Rollback from phase 3: Brokers in hybrid mode

  1. Deprovision controllers by shutting down and deleting the KRaft controller nodes.

  2. Clear ZooKeeper state by connecting to ZooKeeper using ./bin/zookeeper-shell <zk_host:port>, then:

    • Run deleteall /controller to allow a broker to become the new ZooKeeper controller.

    • Run get /migration to check the migration state.

    • Run delete /migration to clear the migration state from ZooKeeper.

    These steps help ensure that future migrations are not disrupted.

    Note

    You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, ignore any errors in the broker logs about failing to connect to the KRaft controller. Those errors should stop after the rolling restart back to ZooKeeper mode.

  3. Clean metadata logs on all brokers by removing the __cluster_metadata-0 directory from each of Kafka's log directories.

  4. Perform a rolling restart of brokers, one at a time:

    1. In the server.properties, remove or comment out the following properties:

      • zookeeper.metadata.migration.enable

      • controller.listener.names

      • controller.quorum.voters

      • confluent.cluster.link.metadata.topic.enable (if set)

      • inter.broker.protocol.version (or set back to original)

    2. Restart the broker.

    3. Wait for the broker to rejoin the cluster before proceeding.

  5. Verify that the cluster is healthy and operating correctly in ZooKeeper mode.

Rollback from phase 4: Brokers migrated to KRaft

This is a multi-stage rollback.

Step 1: First rolling restart (brokers back to hybrid)

For each broker, one at a time:

  1. Modify server.properties with the following properties:

    • Remove or comment out the process.roles property.

    • Replace node.id=<id> with broker.id=<id>, keeping the same numeric value.

    • Restore zookeeper.connect=<zk_connection_string> - Restore the ZooKeeper connection string to its previous value.

    • Restore any other previous ZooKeeper-specific configs (for example, zookeeper.ssl.protocol) - If your cluster requires other ZooKeeper configurations for brokers, re-add those configurations.

    Important

    Keep zookeeper.metadata.migration.enable=true for this restart. This property must remain enabled during the first rolling restart to allow a proper rollback.

  2. Restart the broker.

  3. Wait for the broker to rejoin the cluster before proceeding.

Step 2: Deprovision controllers and clean ZooKeeper

  1. Deprovision controllers by shutting down and deleting the KRaft controller nodes.

  2. Clear ZooKeeper state by connecting to ZooKeeper using ./bin/zookeeper-shell <zk_host:port>, then:

    • Run deleteall /controller to allow a broker to become the new ZooKeeper controller.

    • Run get /migration to check the migration state.

    • Run delete /migration to clear the migration state from ZooKeeper.

    These steps help ensure that future migrations are not disrupted.

    Note

    You must perform the zookeeper-shell step quickly to minimize the amount of time that the cluster lacks a controller. Until the /controller znode is deleted, you can ignore errors in the broker logs about failing to connect to the KRaft controller. Those errors should stop after the second rolling restart to ZooKeeper-only mode.

Step 3: Clean metadata logs

  • On all brokers, remove the __cluster_metadata-0 directory from each of Kafka's log directories.
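A sketch of that cleanup step; the helper name is hypothetical, and the directory argument comes from each broker's log.dirs setting:

```shell
# Hypothetical cleanup helper: remove the KRaft metadata log directory
# (__cluster_metadata-0) from one Kafka log directory.
clean_metadata_log() {
  rm -rf "$1/__cluster_metadata-0"
}

# Usage for a broker with log.dirs=/var/lib/kafka/data (placeholder path):
# clean_metadata_log /var/lib/kafka/data
```

If a broker has multiple log.dirs, run the helper once per directory.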

Step 4: Perform second rolling restart (brokers back to ZooKeeper-only)

For each broker, one at a time:

  1. Modify server.properties with the following properties:

    • Remove or comment out the following properties:

      • zookeeper.metadata.migration.enable - This property is no longer needed once you revert the migration.

      • controller.listener.names - Controller listener configuration is not needed in ZooKeeper mode.

      • controller.quorum.voters - Controller quorum configuration is not needed in ZooKeeper mode.

      • confluent.cluster.link.metadata.topic.enable (if set) - Remove this property if you set it during migration.

    • (If using ACLs) Change authorizer.class.name back to the ZooKeeper authorizer (for example, kafka.security.authorizer.AclAuthorizer) - Restore the authorizer class that you used before migration.

    • Restore inter.broker.protocol.version to the original value you were using before the migration.

  2. Restart the broker.

  3. Wait for the broker to rejoin the cluster before proceeding.

  4. After the last broker rejoins the cluster, the rollback is complete.

Rollback from phase 5: Migration finalized

  • You cannot revert to ZooKeeper mode after you have finalized the phase 5 migration.