Scale Kafka Clusters and Balance Data

At a high level, adding brokers to a cluster involves a few key steps:

  • Define the configuration for each of the new brokers.
  • Provision storage, networking, and compute resources for the new brokers.
  • Start the brokers with the defined configurations and provisioned resources.
  • Reassign partitions across the cluster so that the new brokers share the load and the cluster’s overall performance improves.

To automate the above process, Confluent for Kubernetes (CFK) leverages Self-Balancing, which is enabled by default.

If you need to manually enable Self-Balancing, see Enable Self-Balancing for the steps.

Scale up

To scale up the Kafka cluster:

  1. Increase the number of Kafka replicas in the Kafka CR:

    kafka:
      replicas: <number of brokers>
    
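    If you prefer a one-line change over editing the CR file, you could patch the replica count directly. The command below is a sketch only; it assumes the Kafka CR is named kafka, is in the confluent namespace, and exposes the broker count at spec.replicas:

    # Patch the broker count on the Kafka CR (names and field path assumed)
    kubectl patch kafka kafka -n confluent --type merge -p '{"spec":{"replicas":5}}'
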
  2. Update Kafka with the new settings:

    kubectl apply -f <Kafka CR file>
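
    To watch the new broker pods come up while the change rolls out, you can use standard kubectl commands, for example:

    # Watch pods in the cluster namespace
    kubectl get pods -n <namespace> -w

    # Check the Kafka CR status once the pods are running
    kubectl get kafka -n <namespace>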
    
  3. Ensure that proper DNS records are configured for the new brokers, and verify that CFK can resolve the new broker hostnames, using a command such as nslookup.
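
    For example, a resolution check might look like the following; the hostname shown assumes a typical pattern of <pod-name>.<headless-service>.<namespace>.svc.cluster.local, so substitute an actual broker hostname from your environment:

    nslookup kafka-3.kafka.<namespace>.svc.cluster.local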

    If you are using a hosts file instead of a DNS service, update the hosts file with the new broker information. For example:

    1. Get the new broker IP addresses:

      kubectl get services -n <namespace>
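
      To narrow the output to broker services, you can filter by name; this assumes the service names share the kafka prefix:

      kubectl get services -n <namespace> | grep kafka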
      
    2. Derive the hostnames of the new brokers from the hostnames of the existing brokers, which share the broker prefix.

    3. Add the new broker hosts to the /etc/hosts file, and inject the updated file into the CFK pod as described in Adding entries to Pod /etc/hosts.
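
      For example, the added entries might look like the following; the IP addresses and hostnames are illustrative placeholders only:

      192.0.2.103  kafka-3.<your-domain>
      192.0.2.104  kafka-4.<your-domain>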

Scale down

With Confluent for Kubernetes, you must scale down Kafka clusters one broker at a time. Scaling down by multiple brokers at a time is not supported.

When Confluent for Kubernetes scales down a Kafka cluster by one broker, it deletes the broker pod and the backing PersistentVolume, which deletes any partition replicas that were stored on that broker.

To scale down a Kafka cluster:

  1. Decrease the number of Kafka brokers by one:

    kafka:
      replicas: 4 # In this example, scaling down from 5
    
  2. Wait for the partitions that were on the removed broker to be re-replicated to the remaining brokers.

    You can check whether the partitions have been replicated by looking at the Kafka metric, Total Under Replicated Partitions Across Kafka Brokers, in the Confluent Control Center UI or in your monitoring solution. The value should be 0.

    The time required varies with the cluster size and the number of topics.
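
    You can also check from the command line with the kafka-topics tool. This sketch assumes you can exec into a remaining broker pod and that a listener answers on localhost:9092; CFK deployments may expose the internal listener on a different port:

    kubectl exec kafka-0 -n <namespace> -- kafka-topics \
      --bootstrap-server localhost:9092 \
      --describe --under-replicated-partitions

    An empty result means no partitions are under-replicated.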

  3. Repeat the above steps, decreasing the number of Kafka brokers by one and waiting for replication to complete each time, until you reach the desired broker count.

Enable Self-Balancing

The Self-Balancing feature is enabled by default in Confluent for Kubernetes.

To disable Self-Balancing, pass the setting under configOverrides in the Kafka CR as shown below:

kafka:
  configOverrides:
    server:
      - confluent.balancer.enable=false

To balance the load across the cluster whenever an imbalance is detected, set confluent.balancer.heal.uneven.load.trigger to ANY_UNEVEN_LOAD. The default is EMPTY_BROKER.

kafka:
  configOverrides:
    server:
      - confluent.balancer.heal.uneven.load.trigger=ANY_UNEVEN_LOAD
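
As with any change to the Kafka CR, apply the updated configuration for the override to take effect:

kubectl apply -f <Kafka CR file>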

For a complete list of available settings you can use to control Self-Balancing, see Configuration Options and Commands for Self-Balancing Clusters.