Auto Data Balancing in Confluent Platform

The confluent-rebalancer tool balances data so that the number of leaders and the disk usage are even across brokers and racks, at both the topic level and the cluster level, while minimizing data movement. It also integrates closely with the replication quotas feature in Apache Kafka® to dynamically throttle data-balancing traffic.

The tool is part of Confluent Platform and can also be installed on its own using the confluent-rebalancer package.
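To make the balancing goals above concrete, the following minimal sketch measures how uneven leader counts and disk usage are across brokers, for the whole cluster or for a single topic. It is an illustration only and not the algorithm used by confluent-rebalancer. The sample assignment, the broker IDs, and the helper names (leader_counts, disk_usage, spread) are all hypothetical.

```python
# Illustrative only: NOT the confluent-rebalancer algorithm.
# Hypothetical metadata: (topic, partition) -> (replica broker IDs, size in bytes).
# The first replica in each list is treated as the preferred leader.
BROKERS = [1, 2, 3]
ASSIGNMENT = {
    ("orders", 0): ([1, 2], 400),
    ("orders", 1): ([1, 3], 300),
    ("orders", 2): ([1, 2], 500),
    ("clicks", 0): ([2, 1], 100),
    ("clicks", 1): ([1, 3], 200),
}


def _partitions(assignment, topic=None):
    """Yield (replicas, size) for every partition, or only for one topic."""
    for (name, _partition), value in assignment.items():
        if topic is None or name == topic:
            yield value


def leader_counts(assignment, brokers, topic=None):
    """Count preferred leaders per broker, including brokers with none."""
    counts = {broker: 0 for broker in brokers}
    for replicas, _size in _partitions(assignment, topic):
        counts[replicas[0]] += 1
    return counts


def disk_usage(assignment, brokers, topic=None):
    """Sum replica sizes per broker; every replica consumes disk."""
    usage = {broker: 0 for broker in brokers}
    for replicas, size in _partitions(assignment, topic):
        for broker in replicas:
            usage[broker] += size
    return usage


def spread(per_broker):
    """Gap between the most and least loaded broker; 0 means perfectly even."""
    values = list(per_broker.values())
    return max(values) - min(values)


if __name__ == "__main__":
    print("cluster leaders:", leader_counts(ASSIGNMENT, BROKERS))
    print("cluster disk:   ", disk_usage(ASSIGNMENT, BROKERS))
    print("orders leaders: ", leader_counts(ASSIGNMENT, BROKERS, topic="orders"))
```

In this sample data, broker 1 leads most partitions and holds the most bytes, so both spreads are large. A balanced assignment would drive both spreads toward zero while moving as few replicas as possible.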

Important

  • Starting in Confluent Platform 6.0.0, Self-Balancing Clusters is the preferred alternative to Auto Data Balancer; see Manage Self-Balancing Kafka Clusters in Confluent Platform. For a detailed feature comparison, see Self-Balancing vs. Auto Data Balancer.

  • Do not use Auto Data Balancer together with the log directory cordoning feature, which was added in Confluent Platform 8.3. The two features can conflict and lead to unexpected behavior.

  • Auto Data Balancer and Self-Balancing cannot be used together. Before you run Auto Data Balancer, make sure that Self-Balancing is turned off (confluent.balancer.enable=false on the brokers).

  • If Multi-Region Clusters is enabled (see Configure Multi-Region Clusters in Confluent Platform) and a topic is created with a replica placement policy, Auto Data Balancer redistributes the preferred leaders among all racks that have replicas.