Scale Confluent Platform Clusters and Balance Data

Scale Kafka cluster

At a high level, adding brokers to a cluster involves a few key steps:

  • Define the configuration for each of the new brokers.
  • Provision storage, networking, and compute resources to the brokers.
  • Start the brokers with the defined configurations and provisioned resources.
  • Reassign partitions across the cluster so that the new brokers share the load and the cluster’s overall performance improves.

To automate this process, Confluent for Kubernetes (CFK) uses Self-Balancing, which is enabled by default in CFK.

If you need to manually enable Self-Balancing, see Enable Self-Balancing for the steps.

Scale up Kafka cluster

To scale up a Kafka cluster:

  1. Increase the number of Kafka replicas using one of the following options:

    • Use the kubectl scale command:

      kubectl scale kafka <Kafka-CR-name> --replicas=N
      
    • Increase the number of Kafka replicas in the Kafka custom resource (CR) and apply the new setting with the kubectl apply command:

      spec:
        replicas:
      
  2. Ensure that proper DNS records are configured for the new brokers, and verify that CFK can resolve the new broker hostnames, using a command such as nslookup.

    If you are using a hosts file instead of a DNS service, update the hosts file with the new broker information. For example:

    1. Get the new broker IP addresses:

      kubectl get services -n <namespace>
      
    2. Refer to the existing broker host names with the broker prefix, and derive the hostnames of the new brokers.

    3. Add the new broker hosts to the /etc/hosts file, and inject the updated file into the CFK pod as described in Adding entries to Pod /etc/hosts.
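
The hostname derivation in step 2 can be sketched as a small shell helper. The prefix b, the domain example.com, and the zero-based numbering are assumptions for illustration only; follow whatever pattern your existing broker hostnames use.

```shell
# Print the hostnames of the brokers added when scaling from OLD to NEW replicas.
# Assumes hostnames follow <prefix><index>.<domain> with zero-based indexes;
# this pattern is hypothetical, so adjust it to match your existing brokers.
new_broker_hosts() {
  prefix=$1
  domain=$2
  old=$3
  new=$4
  i=$old
  while [ "$i" -lt "$new" ]; do
    printf '%s%s.%s\n' "$prefix" "$i" "$domain"
    i=$((i + 1))
  done
}

# Example: check that each new hostname resolves (run where CFK resolves names).
# for host in $(new_broker_hosts b example.com 3 5); do
#   nslookup "$host" || echo "cannot resolve $host"
# done
```

Scaling from 3 to 5 brokers with these example values yields b3.example.com and b4.example.com, which are the names to check with nslookup or add to the hosts file.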

Scale down Kafka cluster

You have two options to scale down Kafka clusters:

  • Automatic scale down

    Use this option for Confluent Platform versions 7.x and later.

  • Manual scale down

    Use this option for Confluent Platform versions 6.x.

With either option above, do not decrease the number of Kafka brokers to less than the largest replication factor of any topic in your Kafka cluster. CFK sets a default replication factor of 3 for all Kafka topics.
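
One way to check this limit before scaling down is to find the largest replication factor in the output of kafka-topics --describe. The sketch below assumes the output contains "ReplicationFactor: N" with a space before the number, as recent Kafka versions print it; the function names are hypothetical.

```shell
# Read `kafka-topics --describe` output on stdin and print the largest
# ReplicationFactor value found (prints 0 if none is found).
# Assumes the "ReplicationFactor: N" form with whitespace before N.
max_replication_factor() {
  awk '
    {
      for (i = 1; i < NF; i++)
        if ($i == "ReplicationFactor:" && $(i + 1) + 0 > max)
          max = $(i + 1) + 0
    }
    END { print max + 0 }
  '
}

# Succeed only if TARGET brokers are enough for the largest replication factor.
can_scale_down_to() {
  target=$1
  max_rf=$2
  [ "$target" -ge "$max_rf" ]
}

# Example (the bootstrap server address is a placeholder):
# max_rf=$(kafka-topics --bootstrap-server <host:port> --describe | max_replication_factor)
# can_scale_down_to 4 "$max_rf" || echo "4 brokers is below the largest replication factor"
```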

Automatically scale down Kafka cluster

With Confluent Platform 7.x and later, you can have CFK scale down Kafka clusters.

CFK uses the Self-Balancing feature, which is enabled by default in CFK, to automate the shrinking process.

To have CFK automatically scale down your cluster, the following requirements must be satisfied:

  • Set up Admin REST API as described in Kafka Admin REST API. CFK uses the KafkaRestClass resource in the namespace where the Kafka cluster is running.

  • If the Admin REST API is set up with basic authentication for the REST client, CFK uses the first user listed in basic.txt to shrink the cluster. See Basic authentication for details on basic.txt.

    This first user must have a role that is listed under spec.services.kafkaRest.authentication.basic.roles in the Kafka custom resource (CR).

  • You cannot use automatic scale down when the Kafka brokers use the DirectoryPathInContainer feature to specify the credentials for authenticating to the Confluent Admin REST API or MDS. Automatic scale down is available when those credentials are specified with Kubernetes secrets.

To automatically scale down a Kafka cluster:

  1. Make sure the Kafka cluster is stable.

  2. Enable the feature through annotation:

    kubectl annotate kafka <Kafka-CR-name> platform.confluent.io/enable-shrink="true"
    
  3. Decrease the number of brokers in the Kafka CR and apply the change using the kubectl apply command:

    spec:
      replicas:
    

    Do not set replicas to a value less than 3.

    CFK triggers the workflow to shrink the Kafka cluster according to the value of replicas updated in the Kafka custom resource (CR).
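
As an alternative to editing the CR file and running kubectl apply, steps 2 and 3 can be sketched with kubectl patch and a JSON merge patch. This is an illustrative variation, not the documented procedure; the helper name shrink_patch, the CR name kafka, and the namespace confluent are assumptions.

```shell
# Print a JSON merge patch that sets spec.replicas, refusing values below 3.
shrink_patch() {
  replicas=$1
  case $replicas in
    '' | *[!0-9]*)
      echo "replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac
  if [ "$replicas" -lt 3 ]; then
    echo "replicas must not be less than 3" >&2
    return 1
  fi
  printf '{"spec":{"replicas":%s}}\n' "$replicas"
}

# Example usage (hypothetical CR name "kafka" in namespace "confluent"):
# kubectl -n confluent annotate kafka kafka platform.confluent.io/enable-shrink="true"
# kubectl -n confluent patch kafka kafka --type merge -p "$(shrink_patch 4)"
```

Building the patch in a helper keeps the minimum of 3 replicas enforced in one place before anything is sent to the cluster.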

Manually scale down Kafka cluster

With Confluent Platform 6.x, you must scale down Kafka clusters one broker at a time. Scaling down by multiple brokers at a time is not supported.

When CFK scales down a Kafka cluster by one broker, it deletes the broker pod and the backing persistentVolume. This deletes any partitions that were stored on that broker.

To manually scale down a Kafka cluster:

  1. Decrease the number of Kafka brokers by one using one of the following options. The number of Kafka brokers should not be set to fewer than 3.

    The examples scale down Kafka from 5 to 4.

    • Use the kubectl scale command:

      kubectl scale kafka <Kafka-CR-name> --replicas=4
      
    • Decrease the number of Kafka replicas in the Kafka CR and apply the new setting with the kubectl apply command:

      spec:
        replicas: 4
      
  2. Wait for the partitions that were on the removed broker to be replicated to the remaining brokers.

    You can check if the partitions were replicated by looking at the Kafka metric, Total Under Replicated Partitions Across Kafka Brokers, in Confluent Control Center UI or in your monitoring solution. The value should be 0.

    Depending on the cluster size and the number of topics, this time could vary.

  3. Repeat the above steps, decreasing the number of Kafka brokers by one each time and waiting for replication to complete, until you reach the desired broker count.
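
The one-broker-at-a-time sequence above can be sketched as a helper that lists each intermediate replica count. The function name scale_down_steps and the CR name kafka are hypothetical, and the wait between steps is left as a manual check against the under-replicated partitions metric.

```shell
# Print each replica count to apply, one per line, when stepping
# from CURRENT brokers down to TARGET brokers one broker at a time.
scale_down_steps() {
  current=$1
  target=$2
  n=$((current - 1))
  while [ "$n" -ge "$target" ]; do
    echo "$n"
    n=$((n - 1))
  done
}

# Example loop (hypothetical CR name "kafka"):
# for n in $(scale_down_steps 5 3); do
#   kubectl scale kafka kafka --replicas="$n"
#   # Pause here until the under-replicated partitions metric returns to 0.
# done
```

Scaling from 5 to 3 brokers produces two steps, 4 and then 3, each followed by its own wait for replication.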

Enable Self-Balancing

The Self-Balancing feature is enabled by default in Confluent for Kubernetes.

To disable Self-Balancing, pass the setting in configOverrides in the Kafka CR configuration file as shown below.

spec:
  configOverrides:
    server:
      - confluent.balancer.enable=false

To balance the load across the cluster whenever an imbalance is detected, set confluent.balancer.heal.uneven.load.trigger to ANY_UNEVEN_LOAD. The default is EMPTY_BROKER.

spec:
  configOverrides:
    server:
      - confluent.balancer.heal.uneven.load.trigger=ANY_UNEVEN_LOAD

For a complete list of available settings you can use to control Self-Balancing, see Configuration Options and Commands for Self-Balancing Clusters.

Scale other Confluent Platform clusters

Use the following command to scale other Confluent Platform components up or down:

kubectl scale <CP-component-CR-kind> <component-CR-name> --replicas=N