Confluent Control Center

Confluent Control Center is a web-based tool for managing and monitoring Apache Kafka®. Control Center provides a user interface that allows developers and operators to get a quick overview of cluster health; observe and control messages, topics, and Schema Registry; and develop and run ksqlDB queries.

Control Center includes the following pages where you can drill down to view data and configure features in your Apache Kafka® environment.

Clusters overview
View healthy and unhealthy clusters at a glance and search for a cluster being managed by Control Center. Click on a cluster tile to drill into views of critical metrics and connected services for that cluster.
Brokers overview
View essential Kafka metrics for brokers in a cluster.
Topics
Add and edit topics, view production and consumption metrics for a topic, browse and download messages, and manage Schema Registry for topics.
Connect
Manage, monitor, and configure connectors with Kafka Connect, the toolkit for connecting external systems to Kafka.
ksqlDB
Develop applications against ksqlDB, the streaming SQL engine for Kafka. Use the ksqlDB page in Control Center to: run, view, and terminate SQL queries; browse and download messages from query results; add, describe, and drop streams and tables; and view schemas of available streams and tables in a cluster.
Consumers
View the consumer groups associated with a selected Kafka cluster, including the number of consumers per group, the number of topics being consumed, and consumer lag across all relevant topics. The Consumers feature also contains the redesigned streams monitoring page.
Replicators
Monitor and configure replicated topics and create replica topics that preserve topic configuration in the source cluster.
Cluster Settings
View and edit cluster properties and broker configurations.
Alerts
Use Alerts to define the trigger criteria for anomalous events that occur during data monitoring and to trigger an alert when those events occur. Set triggers and actions, and view alert history across all of your Control Center clusters.

Architecture

Control Center consists of the following parts:

  • Metrics interceptors that collect metric data on clients (producers and consumers).
  • Kafka to move metric data.
  • The Control Center application server for analyzing stream metrics.

Here is a common environment that uses Kafka to transport messages from a set of producers to a set of consumers in different data centers, and uses Replicator to copy data from one cluster to another:

Figure: An example system using Kafka

Confluent Control Center helps you detect issues when moving data, including late, duplicate, or lost messages. By adding lightweight code to clients, stream monitoring can count every message sent and received in a streaming application. Because the metrics themselves are sent through Kafka, they reach the Control Center application quickly and reliably.

Figure: An example system using Kafka, monitored by Control Center
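The counting idea described above can be illustrated with a small sketch. This is not Control Center's implementation. The function name compare_counts, the status labels, and the input shape (message counts keyed by time window) are all assumptions made for illustration.

```python
def compare_counts(sent, received):
    """Compare per-window message counts reported by producers and consumers.

    Both arguments are hypothetical dicts mapping a window identifier to a
    message count. Returns a dict of window -> (sent, received, status).
    """
    report = {}
    for window in sorted(set(sent) | set(received)):
        n_sent = sent.get(window, 0)
        n_received = received.get(window, 0)
        if n_received < n_sent:
            # Fewer messages consumed than produced: lost, or simply late.
            status = "missing"
        elif n_received > n_sent:
            # More messages consumed than produced: some were duplicated.
            status = "duplicates"
        else:
            status = "ok"
        report[window] = (n_sent, n_received, status)
    return report


if __name__ == "__main__":
    produced = {0: 100, 60: 100, 120: 100}
    consumed = {0: 100, 60: 97, 120: 102}
    for window, row in compare_counts(produced, consumed).items():
        print(window, row)
```

A count alone cannot distinguish a lost message from one that has not arrived yet, which is why the "missing" status in this sketch covers both cases.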

Time windows and metrics

Stream monitoring is designed to efficiently audit the set of messages that are sent and received. To do this, Control Center uses a set of techniques to measure and verify delivery.

The interceptors collect metrics on the messages produced or consumed on each client and send them to Control Center for analysis and reporting. Interceptors use Kafka message timestamps to group messages. Specifically, the interceptors collect metrics over a one-minute time window based on this timestamp. You can calculate the start of the window with a function like floor(messageTimestamp / 60) * 60, where messageTimestamp is expressed in seconds (Kafka timestamps are stored in milliseconds, so the equivalent divisor there is 60000). Metrics are collected for each combination of producer, consumer group, consumer, topic, and partition. Currently, metrics include a message count and cumulative checksum for producers and consumers, and latency information from consumers.
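As a rough illustration of the windowing described above, the following sketch groups records into one-minute windows and keeps a count and a running checksum per key. It works in milliseconds, the unit of Kafka record timestamps, so it divides by 60000 rather than 60. The record tuple layout, the function names, and the use of a summed CRC32 as the "cumulative checksum" are assumptions; the checksum the interceptors actually compute is not described here.

```python
import zlib
from collections import defaultdict

WINDOW_MS = 60_000  # one-minute window, in epoch milliseconds


def window_start(timestamp_ms):
    """Return the start of the one-minute window containing timestamp_ms."""
    return (timestamp_ms // WINDOW_MS) * WINDOW_MS


def aggregate(records):
    """Aggregate records into per-window metrics.

    records: iterable of (client_id, topic, partition, timestamp_ms, payload)
    tuples, where payload is bytes. This layout is hypothetical.
    """
    stats = defaultdict(lambda: {"count": 0, "checksum": 0})
    for client_id, topic, partition, timestamp_ms, payload in records:
        key = (client_id, topic, partition, window_start(timestamp_ms))
        bucket = stats[key]
        bucket["count"] += 1
        # Assumed checksum: order-independent sum of CRC32 values, mod 2**32.
        bucket["checksum"] = (bucket["checksum"] + zlib.crc32(payload)) & 0xFFFFFFFF
    return dict(stats)


if __name__ == "__main__":
    sample = [
        ("producer-1", "orders", 0, 120_500, b"a"),
        ("producer-1", "orders", 0, 179_999, b"b"),
        ("producer-1", "orders", 0, 180_000, b"c"),
    ]
    for key, metrics in aggregate(sample).items():
        print(key, metrics)
```

Because the window is derived from the message timestamp rather than from the local clock, a producer and a consumer place the same message in the same window, and their counts and checksums can be compared window by window.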

Latency and system clock implications

Latency is measured by calculating the difference between the system clock time on the consumer and the timestamp in the message. In a distributed environment, it can be difficult to keep clocks synchronized. If the clock on the consumer is running faster than the clock on the producer, then Control Center might show latency values that are higher than the true values. If the clock on the consumer is running slower than the clock on the producer, then Control Center might show latency values that are lower than the true values (and in the worst case, negative values).
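The effect of clock skew on the measurement above can be sketched as follows. The function names and the fixed producer timestamp are hypothetical, and the simulation assumes the producer's clock is the reference.

```python
def observed_latency_ms(message_timestamp_ms, consumer_clock_ms):
    """Latency as measured: consumer clock time minus the message timestamp."""
    return consumer_clock_ms - message_timestamp_ms


def simulate(true_latency_ms, consumer_skew_ms):
    """Simulate one message with a known true latency.

    consumer_skew_ms > 0 means the consumer's clock is ahead of the
    producer's clock; < 0 means it is behind.
    """
    produced_at_ms = 1_000_000  # arbitrary timestamp on the producer's clock
    consumer_clock_ms = produced_at_ms + true_latency_ms + consumer_skew_ms
    return observed_latency_ms(produced_at_ms, consumer_clock_ms)


if __name__ == "__main__":
    print(simulate(1000, 20))   # consumer clock ahead: reads higher than 1000
    print(simulate(1000, -20))  # consumer clock behind: reads lower than 1000
    print(simulate(10, -20))    # skew exceeds true latency: negative reading
```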

If your clocks are out of sync, you might notice some unexpected results in Confluent Control Center. Confluent recommends using a mechanism like NTP to synchronize time between production machines; this can help keep clocks synchronized to within 20 ms over the public internet, and to within 1 ms for servers on the same local network.

Tip

NTP practical example: In an environment where messages take one second or more to be produced and consumed, and NTP is used to synchronize clocks between machines, the latency information should be accurate to within 2%.
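The 2% figure follows from dividing the worst-case clock offset by the true latency. A minimal sketch of that arithmetic, assuming the 20 ms and 1 ms synchronization bounds quoted above and a hypothetical helper name:

```python
def relative_latency_error(clock_offset_ms, true_latency_ms):
    """Worst-case relative error in measured latency caused by clock offset."""
    return abs(clock_offset_ms) / true_latency_ms


if __name__ == "__main__":
    # 20 ms offset (public internet bound) on a 1-second latency
    print(f"{relative_latency_error(20, 1000):.1%}")
    # 1 ms offset (local network bound) on the same latency
    print(f"{relative_latency_error(1, 1000):.1%}")
```

The error grows as the true latency shrinks: the same 20 ms offset on a 100 ms latency gives a relative error of 20%.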