Cluster Load Metric for Dedicated Clusters in Confluent Cloud

The Cluster Load metric for Dedicated clusters provides visibility into the current load on a cluster.

Access the cluster load metric

You can access the cluster load metric as both a current value and a time series graph that represents historical load values. Use the Metrics API or view the metric in the Confluent Cloud Console.
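As a rough illustration of the Metrics API route, the following sketch builds a query body for the cluster load time series. The endpoint URL, the metric name, and the filter field are assumptions not stated on this page, so confirm them against the Metrics API reference before use. The cluster ID `lkc-example` and the helper name `build_cluster_load_query` are hypothetical. The sketch only constructs the request body and does not send it.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumptions (not stated on this page): verify both against the Metrics API reference.
METRICS_QUERY_URL = "https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"
CLUSTER_LOAD_METRIC = "io.confluent.kafka.server/cluster_load_percent"


def build_cluster_load_query(cluster_id, end, hours=1, granularity="PT1M"):
    """Build a query body for the cluster load over the `hours` before `end`."""
    start = end - timedelta(hours=hours)
    # ISO 8601 interval in the form "start/end".
    interval = (
        f"{start.isoformat(timespec='seconds')}/{end.isoformat(timespec='seconds')}"
    )
    return {
        "aggregations": [{"metric": CLUSTER_LOAD_METRIC}],
        # Assumed filter field for selecting a single Kafka cluster.
        "filter": {"field": "resource.kafka.id", "op": "EQ", "value": cluster_id},
        "granularity": granularity,
        "intervals": [interval],
    }


if __name__ == "__main__":
    end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    # "lkc-example" is a placeholder cluster ID.
    print(json.dumps(build_cluster_load_query("lkc-example", end), indent=2))
```

The resulting body would typically be sent as an authenticated POST request to the query endpoint. Check how the returned values are scaled before comparing them with the percentages shown in the Cloud Console.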

To view the cluster load metric in the Cloud Console:

  1. Navigate to the clusters page for your environment and choose a cluster.

  2. View the Cluster load metric on the cluster Overview page.

  3. Use the drop-down to choose the cluster load averaged over the Last hour, Last 6 hours, Last 24 hours, or Last 7 days.


Details about cluster load

Cluster load is a metric that measures how heavily a Confluent Cloud Dedicated cluster is utilized by its current workload. In the Cloud Console, the cluster load graph shows percentage values between 0 and 100 over the selected time period, with 0% indicating no load and 100% representing a fully utilized cluster. When the cluster load increases, you are also likely to see an increase in observed production and consumption latency.

There are a number of considerations when calculating the cluster load for a Dedicated cluster.

A CKU is the unit that specifies the maximum capability of the following dimensions of a Dedicated cluster:

  • ingress
  • egress
  • connections
  • requests

The ability to utilize any one of these dimensions depends on workload behavior, and a wide range of workload behaviors can overload a Confluent Cloud cluster without fully utilizing any individual dimension. For example, application patterns such as many partitions or many small requests drive cluster load increases despite using relatively little throughput. In contrast, applications with long-lived connections and efficient request batching can generally drive more throughput without excessively increasing cluster load.

As a general rule, a higher cluster load results in higher latencies. Latency increases from high cluster load are most visible at the 99th percentile (p99). At p99, very high latencies that result in request or connection timeouts may cause the cluster to appear unavailable.
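To see why tail latency is the place to look, the following sketch computes percentiles from a list of client-side latency samples using the nearest-rank method. The function name `percentile_nearest_rank` and the sample values are hypothetical and are only meant to show how a few slow requests can dominate p99 while the median stays unchanged.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [5] * 97 + [40, 900, 1200]

p50 = percentile_nearest_rank(latencies_ms, 50)
p99 = percentile_nearest_rank(latencies_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")
```

In this made-up sample, the median is unaffected by the three slow requests, whereas p99 lands on one of them, which is the pattern described above.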

Clusters in Confluent Cloud protect themselves from becoming overutilized by applying request backpressure (throttling). A cluster load of 80% is the threshold at which applications may start seeing high latencies or timeouts on the cluster, and at which they are more likely to experience throttling as the cluster applies backpressure to requests or connection attempts. To understand if your applications are being throttled, see Throttling.
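One way to use the 80% threshold is to measure how much of a time window the cluster spent at or above it. The following is a minimal sketch, assuming the load samples are already expressed as percentages between 0 and 100. The function name `high_load_fraction` and the sample values are hypothetical, and treating exactly 80% as "high" is an assumption about how to read the threshold.

```python
HIGH_LOAD_THRESHOLD_PERCENT = 80.0  # threshold described in the text above


def high_load_fraction(load_percentages, threshold=HIGH_LOAD_THRESHOLD_PERCENT):
    """Return the fraction of samples at or above the threshold."""
    if not load_percentages:
        raise ValueError("load_percentages must not be empty")
    for value in load_percentages:
        if not 0 <= value <= 100:
            raise ValueError(f"load value out of range 0-100: {value}")
    # Assumption: a sample exactly at the threshold counts as high load.
    high = sum(1 for value in load_percentages if value >= threshold)
    return high / len(load_percentages)


# Hypothetical per-minute cluster load samples, in percent.
samples = [10, 50, 80, 95]
print(f"{high_load_fraction(samples):.0%} of samples at or above 80% load")
```

A brief spike above the threshold and a sustained period above it call for different responses, so a fraction over a window is often more informative than a single current value.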

Evaluate cluster expansion with cluster load

Cluster expansion is a reasonable first step in troubleshooting performance issues in your Apache Kafka® applications. Use the cluster load metric to decide whether to expand your clusters or, if expansion does not resolve the performance issues, whether to shrink them back to their original size.

Generally, expanding clusters provides more capacity for your workloads and, in many cases, helps improve the performance of your Kafka applications. However, there are scenarios in which cluster expansion does not adequately resolve application performance concerns. For more complicated scenarios, see Dedicated Cluster Performance and Expansion.