ZooKeeper Topic Guide

In older versions of Apache Kafka®, ZooKeeper was used to store persistent cluster metadata. In Confluent Platform 8.0 and later, ZooKeeper has been removed and Kafka runs in KRaft mode, in which brokers store their metadata in a KRaft quorum. For more information on KRaft, see KRaft Overview for Confluent Platform.

To find topics on ZooKeeper, search in the Confluent Platform 7.9 documentation for the term ‘ZooKeeper’, or see the following links:

  • Hardware requirements: For ZooKeeper hardware requirements, configuration, monitoring, and more in Confluent Platform 7.9.x and earlier, see hardware.

  • Security with ZooKeeper: For a tutorial on securing a ZooKeeper-based cluster in Confluent Platform version 7.9.x and earlier, see ZooKeeper Security and Enable Security for a ZooKeeper-Based Cluster in Confluent Platform.

  • Best practices: For best practices running ZooKeeper in production for Confluent Platform 7.9.x and earlier, see Kafka in ZooKeeper mode.

  • Upgrade Kafka with ZooKeeper: For ZooKeeper upgrade instructions in Confluent Platform version 7.9.x and earlier, see upgrade.

  • Running ZooKeeper with Docker: For instructions on configuring Kafka in ZooKeeper mode with Docker, see ZooKeeper Mode.

Broker recovery after network disruptions in ZooKeeper-based deployments

In ZooKeeper-based deployments running Confluent Platform 7.9 and earlier, brokers register themselves with ZooKeeper on startup. After a transient ZooKeeper authentication failure, such as one caused by a network disruption, a broker’s ZooKeeper session may be invalidated. Unlike KRaft mode, ZooKeeper mode does not include automatic retry logic for broker re-registration. As a result, the broker process may continue running, but the broker does not automatically rejoin the cluster.
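
One way to see which brokers are currently registered is to list the /brokers/ids znode, where each live broker holds an ephemeral entry. The following is a minimal sketch, not an official procedure. It assumes that zookeeper-shell prints the broker IDs as a bracketed list such as [0, 1, 2] after its connection log lines; the function name registered_broker_ids is hypothetical.

```shell
#!/bin/sh
# Sketch: extract broker IDs from the output of
#   zookeeper-shell <zk-host>:<port> ls /brokers/ids
# Assumption: the ID list appears on its own line as "[0, 1, 2]",
# after any connection log lines.

registered_broker_ids() {
  grep '^\[.*\]$' |       # keep only bracketed list lines
    tail -n 1 |           # use the last one printed
    sed 's/[][ ]//g' |    # strip brackets and spaces
    tr ',' '\n' |         # one ID per line
    sed '/^$/d'           # drop the empty line produced by "[]"
}

# Usage (placeholders as in the rest of this guide):
#   zookeeper-shell <zk-host>:<port> ls /brokers/ids | registered_broker_ids
```

A broker whose process is running but whose ID is absent from this list is a likely candidate for the recovery steps in this section.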

Important

ZooKeeper-based deployments have a known limitation in broker re-registration behavior. Confluent will not pursue a software fix for this behavior in ZooKeeper mode. KRaft mode resolves this limitation and includes automatic broker re-registration retry logic.

If brokers fail to recover after a transient ZooKeeper authentication failure, perform the following recovery steps:

  1. Identify affected brokers. Review your monitoring or cluster health dashboards to identify brokers that appear offline or are not receiving traffic even though the network disruption has resolved.

  2. Restart the affected brokers. Restart broker processes one at a time to minimize cluster impact:

    sudo systemctl restart confluent-server
    

    If you installed the community package, use confluent-kafka instead of confluent-server.

    Wait for each broker to fully rejoin the cluster before restarting the next one.

  3. Verify cluster recovery. Confirm that all brokers have re-registered with ZooKeeper and are accepting traffic. Check broker status using:

    kafka-broker-api-versions --bootstrap-server <broker-host>:<port> | grep 'id: '
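
The verification in step 3 can be sketched as a small helper that compares the broker IDs reported by kafka-broker-api-versions against the IDs you expect. This is an illustrative sketch only. It assumes each broker appears on a line of the form "host:port (id: N rack: ...) -> (", and the function names extract_broker_ids and missing_brokers are hypothetical.

```shell
#!/bin/sh
# Sketch: report expected broker IDs that are missing from the
# kafka-broker-api-versions output read on stdin.
# Assumption: broker lines look like
#   broker1:9092 (id: 1 rack: null) -> (

# Print the unique broker IDs found on stdin, one per line.
extract_broker_ids() {
  sed -n 's/.*(id: \([0-9][0-9]*\) .*/\1/p' | sort -n | uniq
}

# $1: space-separated list of expected broker IDs.
# Prints each expected ID that does not appear in the input.
missing_brokers() {
  expected="$1"
  seen="$(extract_broker_ids)"
  for id in $expected; do
    if ! printf '%s\n' "$seen" | grep -qx "$id"; then
      echo "$id"
    fi
  done
}

# Usage (placeholders as in step 3):
#   kafka-broker-api-versions --bootstrap-server <broker-host>:<port> \
#     | missing_brokers "1 2 3"
```

Empty output suggests that every expected broker responded. Any ID that is printed belongs to a broker that may still need a restart as described in step 2.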
    

To resolve this limitation, migrate your cluster from ZooKeeper to KRaft. For instructions, see Migrate from ZooKeeper to KRaft on Confluent Platform.