Monitor a Confluent Platform cluster in Confluent Cloud¶
After you register a Confluent Platform cluster, you can monitor its key metrics directly in the Confluent Cloud Console. The Unified Stream Manager (USM) Agent collects telemetry data from your Confluent Platform deployment and displays it in Confluent Cloud.
The Cluster Overview page is your primary dashboard for monitoring. It provides a high-level summary of your cluster’s operational health and key performance indicators. This summary helps you quickly assess the state of your streaming infrastructure.
View the cluster overview page¶
To view the monitoring dashboard for your Confluent Platform cluster, follow these steps:
- In the Confluent Cloud Console, navigate to the Environments page and select your environment.
- In the navigation menu, click Clusters and select your Confluent Platform cluster.
The Cluster Overview tab is displayed by default.
Understand cluster overview metrics¶
The Cluster Overview page includes the following panels. The data displayed corresponds to the time range that you select from the dropdown menu.
- Throughput: A graph that shows the historical flow of data into and out of your Confluent Platform cluster.
- Production (bytes/sec): The rate at which producers write data into the cluster’s brokers.
- Consumption (bytes/sec): The rate at which consumers read data from the cluster’s brokers.
- Storage: A graph that displays the total disk space used by the brokers in your cluster over time.
- Topics: The total number of topics in the cluster.
- Partitions: The total number of partitions across all topics in the cluster. The number of partitions directly impacts the parallelism and throughput capacity of your topics.
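To illustrate why the partition count bounds parallelism: within a single consumer group, each partition is read by at most one consumer at a time, so consumers beyond the partition count sit idle. The following Python sketch uses a simplified round-robin assignment. The function name and consumer names are hypothetical, and this is not Kafka's actual partition assignor.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partition numbers across consumers round-robin (illustration only)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Hypothetical group of four consumers reading a three-partition topic.
assignment = assign_partitions(3, ["c1", "c2", "c3", "c4"])
idle = [name for name, parts in assignment.items() if not parts]
print(assignment)
print("idle consumers:", idle)
```

In this sketch the fourth consumer receives no partitions, which is why adding consumers past the partition count does not increase read throughput.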
View cluster settings¶
The Cluster Settings page displays read-only identification details for your connected Confluent Platform cluster. This information is synchronized from your self-managed environment and can’t be modified from the Confluent Cloud Console.
To access cluster settings, follow these steps:
- In the Confluent Cloud Console, navigate to the Environments page and select your environment.
- In the navigation menu, click Clusters and select your Confluent Platform cluster.
- In the cluster’s navigation menu, click Cluster Settings.
Monitor broker metrics¶
The Brokers tab provides a detailed view of the health and performance of the individual Kafka brokers in your Confluent Platform cluster. You can use these metrics to troubleshoot issues and identify performance bottlenecks.
This page contains a cluster-wide summary and a detailed, per-broker table.
Cluster-wide broker metrics¶
The top of the page displays a Broker Throughput graph and a series of panels that show aggregated metrics from across the entire cluster.
- Broker Throughput: A graph that shows the combined throughput (Bytes In and Bytes Out) for all brokers.
- Brokers: The total number of active brokers in your cluster.
- Broker disk usage: The total disk space used by Kafka data across all brokers.
- Network pool usage: The percentage of network threads used to handle client requests. High usage can indicate a network bottleneck.
- Request I/O usage: The percentage of I/O threads used to process disk reads and writes. High usage can indicate a disk bottleneck.
- Unclean leader elections: A count of leader elections where a replica that was not fully in-sync was elected as the new leader. This event indicates that topic availability was prioritized over data consistency and can result in data loss.
- Active controllers: The number of brokers acting as the cluster controller. A healthy cluster has exactly one active controller.
- Offline partitions: The total number of partitions that do not have an active leader. These partitions are unavailable for reads and writes.
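As a rough aid for reading these panels together, the following Python sketch turns a set of values into a list of warnings. Everything here is an assumption for illustration: the `ClusterSnapshot` class and `find_warnings` function are hypothetical and not part of any Confluent API, the values are assumed to be copied from the panels by hand, and the 80 percent usage threshold is an arbitrary example rather than a documented recommendation.

```python
from dataclasses import dataclass


@dataclass
class ClusterSnapshot:
    """Hypothetical container for values read from the Brokers page panels."""
    active_controllers: int
    offline_partitions: int
    unclean_leader_elections: int
    network_pool_usage_pct: float
    request_io_usage_pct: float


def find_warnings(snap, usage_threshold_pct=80.0):
    """Return human-readable warnings; the threshold is an assumed example value."""
    warnings = []
    if snap.active_controllers != 1:
        warnings.append(f"expected 1 active controller, found {snap.active_controllers}")
    if snap.offline_partitions > 0:
        warnings.append(f"{snap.offline_partitions} partition(s) have no active leader")
    if snap.unclean_leader_elections > 0:
        warnings.append("unclean leader election occurred; data loss is possible")
    if snap.network_pool_usage_pct >= usage_threshold_pct:
        warnings.append("network pool usage is high; possible network bottleneck")
    if snap.request_io_usage_pct >= usage_threshold_pct:
        warnings.append("request I/O usage is high; possible disk bottleneck")
    return warnings


# Example values, invented for this sketch.
healthy = ClusterSnapshot(1, 0, 0, 35.0, 40.0)
degraded = ClusterSnapshot(1, 2, 0, 92.5, 40.0)
for line in find_warnings(degraded):
    print(line)
```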
Per-broker metrics table¶
The Brokers table provides granular metrics for each node. You can use these metrics to compare performance and diagnose issues for a specific broker.
- Broker ID: The unique identifier for the broker.
- Average produce/consume rate: The average rate at which messages are written to (produced) and read from (consumed) the broker.
- Failed produce/consume requests: The number of failed requests from producers or consumers.
- Leader count: The number of partitions for which this broker is the current leader.
- Under replicated partitions: The number of partitions on this broker that have fewer in-sync replicas than their configured replication factor. A nonzero value indicates reduced data durability.
- Under min-ISR partitions: The number of partitions where the current number of in-sync replicas is below the configured minimum (min.insync.replicas).
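The difference between the last two columns can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `classify_partition` function is hypothetical, the replica lists are invented broker IDs, and the minimum of 2 in-sync replicas is an example setting rather than a value taken from any real cluster.

```python
def classify_partition(replicas, in_sync_replicas, min_in_sync):
    """Return (under_replicated, under_min_isr) for one partition.

    replicas: broker IDs assigned to hold a copy of the partition.
    in_sync_replicas: the subset of those brokers currently in sync.
    min_in_sync: assumed configured minimum number of in-sync replicas.
    """
    under_replicated = len(in_sync_replicas) < len(replicas)
    under_min_isr = len(in_sync_replicas) < min_in_sync
    return under_replicated, under_min_isr


# One replica has fallen behind: under replicated, but still at the minimum.
print(classify_partition([1, 2, 3], [1, 2], min_in_sync=2))
# Two replicas have fallen behind: under replicated and under min-ISR.
print(classify_partition([1, 2, 3], [1], min_in_sync=2))
```

A partition can be under replicated without being under min-ISR, so the second column is the more severe of the two signals.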
View broker metrics¶
To view broker metrics, follow these steps:
- In the Confluent Cloud Console, navigate to the Environments page and select your environment.
- In the navigation menu, click Clusters and select your Confluent Platform cluster.
- In the cluster’s navigation menu, click Cluster Settings, and then click the Brokers tab.