Monitor Kafka Streams Applications in Confluent Cloud

Confluent Cloud provides tools to monitor and manage your Kafka Streams applications. Access the Kafka Streams monitoring page by navigating to your cluster’s overview page in Confluent Cloud Console and clicking Kafka Streams.

Note

The Kafka Streams monitoring features in Confluent Cloud Console require your applications to be built with Kafka Streams and Apache Kafka® client libraries version 4.0.0 or later. For more information, see Kafka Streams Upgrade Guide.

For this guide, you create a Kafka Streams application by using Confluent for VS Code, or you can run an existing Kafka Streams application that uses Kafka topics in Confluent Cloud.

If you’re using an existing Kafka Streams application, you can skip to Step 7: Monitor the application. Ensure that the application is built with the latest version of the Kafka Streams and Kafka client libraries. To use all of the monitoring features, Kafka version 4.0.0 or later is required.

Prerequisites

  • Confluent for VS Code.
  • Docker installed and running in your development environment.
  • A Kafka cluster running in Confluent Cloud.
    • Kafka bootstrap server host:port, for example, pkc-abc123.<cloud-provider-region>.<cloud-provider>.confluent.cloud:9092, which you can get from the Cluster Settings page in Cloud Console. For more information, see How do I view cluster details with Cloud Console?.
    • Kafka cluster API key and secret, which you can get from the Cluster Overview > API Keys page in Cloud Console.

Step 1: Create the Kafka Streams project

Create the Kafka Streams project by using the Kafka Streams Application template and filling in a form with the required parameters.

Open the template in VS Code directly

To go directly to the Kafka Streams Application template in VS Code, click this button:

Open template in VS Code

The Kafka Streams Application form opens.

Skip the manual steps and proceed to Step 2: Fill in the template form.

Open the template in VS Code manually

Follow these steps to open the Kafka Streams Application template manually.

  1. Open VS Code.

  2. In the Activity Bar, click the Confluent icon. If you have many extensions installed, you may need to click Additional Views and select Confluent from the context menu.

  3. In the extension’s Side Bar, locate the Support section and click Generate Project from Template.

    The palette opens and shows a list of available project templates.

  4. Click Kafka Streams Application.

    The Kafka Streams Application template opens.

Step 2: Fill in the template form

The project needs a few parameters to connect with your Kafka cluster.

  1. In the Kafka Streams Application form, provide the following values.

    • Kafka Bootstrap Server: Enter the host:port string from the Cluster Settings page in Cloud Console. If you’re logged in with Confluent for VS Code, you can right-click on the Kafka cluster in the Resources pane and select Copy bootstrap server.
    • Kafka Cluster API Key: Enter the Kafka cluster API key.
    • Kafka Cluster API Secret: Enter the Kafka cluster API secret.
    • Input Topic: The name of a topic that the Kafka Streams application consumes messages from. Enter input_topic. You create this topic in a later step.
    • Output Topic: The name of a topic that the Kafka Streams application produces messages to. Enter output_topic. You create this topic in a later step.
  2. Click Generate & Save, and in the save dialog, navigate to the directory in your development environment where you want to save the project files and click Save to directory.

    Confluent for VS Code generates the project files.

    • The Kafka Streams code is saved in the src/main/java/examples directory, in a file named KafkaStreamsApplication.java.
    • A docker-compose.yml file declares how to build and run the Kafka Streams application.
    • Configuration settings, like bootstrap.servers, are saved in a file named config.properties.
    • Secrets, like the Kafka cluster API key, are saved in a file named .env. A sketch of both files appears at the end of this step.
    • A README.md file has instructions for compiling and running the project.
  3. Open the generated file named build.gradle and update the version of the Kafka Streams and Kafka client libraries to the latest version. Kafka version 4.0.0 or later is required to use the monitoring features.

    dependencies {
      implementation 'org.apache.kafka:kafka-streams:4.0.0'
      implementation 'org.apache.kafka:kafka-clients:4.0.0'
    }
    
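
For reference, the generated configuration files might look roughly like the following sketch. The exact property and variable names depend on the template version, so treat these values as placeholders rather than the literal generated contents.

    # config.properties (sketch; property names may differ in your generated project)
    bootstrap.servers=pkc-abc123.<cloud-provider-region>.<cloud-provider>.confluent.cloud:9092
    input.topic.name=input_topic
    output.topic.name=output_topic

    # .env (sketch; variable names may differ in your generated project)
    KAFKA_API_KEY=<your-kafka-cluster-api-key>
    KAFKA_API_SECRET=<your-kafka-cluster-api-secret>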

Step 3: Connect to Confluent Cloud

  1. In the extension’s Side Bar, click Sign in to Confluent Cloud.

  2. In the dialog that appears, click Allow.

    A browser window opens to the Confluent Cloud login page.

  3. Enter your Confluent Cloud credentials, and click Log in.

    When you’re authenticated, you’re redirected back to VS Code, and your Confluent Cloud resources are displayed in the extension’s Side Bar.

Step 4: Create topics

Confluent for VS Code makes it easy to create Kafka topics directly from the editor.

  1. In the extension’s Side Bar, open Local in the Resources section and click cluster-local.

    The Topics section refreshes, and the cluster’s topics are listed.

  2. In the Topics section, click the create topic button.

    The palette opens with a text box for entering the topic name.

  3. In the palette, enter input_topic. Press ENTER to confirm the default settings for the partition count and replication factor properties.

    The new topic appears in the Topics section.

  4. Repeat the previous steps for another new topic named output_topic.
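
If you prefer to create the two topics programmatically instead of through the extension, the following is a minimal sketch using the Kafka AdminClient. The bootstrap server is a placeholder, and the partition count and replication factor are illustrative; point it at the same cluster your Kafka Streams application uses.

    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicsSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Placeholder bootstrap server; for a Confluent Cloud cluster, also set
            // security.protocol, sasl.mechanism, and sasl.jaas.config with your API key and secret.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Create the input and output topics used by the Kafka Streams application.
                admin.createTopics(List.of(
                    new NewTopic("input_topic", 1, (short) 1),
                    new NewTopic("output_topic", 1, (short) 1)
                )).all().get();
            }
        }
    }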

Step 5: Compile and run the project

Your Kafka Streams project is ready to build and run in a Docker container.

  1. In your terminal, navigate to the directory where you saved the project.

  2. The Confluent for VS Code extension saves the project files in a subdirectory named kafka-streams-simple-example. Run the following command to navigate to this directory.

    cd kafka-streams-simple-example
    
  3. Run the following command to build and run the Kafka Streams application.

    docker compose up --build
    

    Docker downloads the required images and starts a container that compiles and runs the Kafka Streams application.

Step 6: Produce messages to the input topic

The Kafka Streams application you started in the previous step consumes messages from input_topic and produces messages to output_topic. For convenience, this guide uses a Datagen Source connector to produce messages to input_topic.
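
The core of such an application is a small topology that reads from input_topic and writes to output_topic. The following is a minimal sketch of what that topology looks like; it is not the exact code generated by the template, and the application ID, serdes, and connection properties shown here are placeholders that the generated project reads from config.properties and .env instead.

    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Produced;

    public class SimpleStreamsSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            // The application.id becomes the application name shown on the Kafka Streams page in Cloud Console.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafka-streams-simple-example-sketch");
            // Placeholder connection settings; the generated project loads these from config.properties and .env.
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");
            props.put("security.protocol", "SASL_SSL");
            props.put("sasl.mechanism", "PLAIN");
            props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                    + "username=\"<api-key>\" password=\"<api-secret>\";");

            StreamsBuilder builder = new StreamsBuilder();
            // Copy each JSON order record from input_topic to output_topic as a plain string.
            builder.stream("input_topic", Consumed.with(Serdes.String(), Serdes.String()))
                   .to("output_topic", Produced.with(Serdes.String(), Serdes.String()));

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            // Close the Streams client cleanly when the container stops.
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }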

  1. In your browser, log in to Confluent Cloud Console and navigate to your Kafka cluster.

  2. In the navigation menu, click Connectors.

  3. Click Add connector.

  4. In the Search box, type datagen.

  5. Select the Datagen Source connector, and in the Launch Sample Data dialog, click Additional configuration.

  6. In the Choose the topics you want to send data to section, select input_topic, and click Continue.

  7. In the API key section, click Generate API key & download, and then click Continue.

  8. On the Configuration page, select JSON for the output record value format and Orders for the schema, and then click Continue.

  9. For Connector sizing, leave the slider at the default of 1 task and click Continue.

  10. Name the connector Kafka_Streams_data_source and click Launch connector.

    The connector is provisioned, and after a short time it starts producing messages to input_topic.

Step 7: Monitor the application

After your Kafka Streams application is running, you can monitor it by using the Confluent Cloud Console.

  1. In Cloud Console, navigate to your Kafka cluster’s overview page, and in the navigation menu, click Kafka Streams.

    The Kafka Streams page displays a list of all the Kafka Streams applications running in your Kafka cluster. The displayed metrics are aggregated across all of these applications.

    The displayed metrics include:

    • Application Name: The name of the Kafka Streams application.
    • Client version: The version of the client used by the application.
    • Status: The status of the application.
    • Running threads: The number of threads running in the application.
    • Total production: The total number of messages produced by the application in the last minute.
    • Total consumption: The total number of messages consumed by the application in the last minute.
    • Total lag: The total lag of the application.
  2. In the list, find the application you want to monitor. You can search for the application by name. If you created it with Confluent for VS Code, the application name starts with vscode-kafka-streams-simple-example-.

  3. Click the link in the Application column to open the application’s overview page.

    The overview page displays the following metrics:

    • Size of memtables
    • Estimated number of keys
    • Block cache usage

    Graphs display the following metrics:

    • End-to-end latency
    • Process ratio

    It takes a few minutes for the metrics to be collected and displayed.

Performance considerations

Network traffic and metadata fetching

Kafka Streams applications have different metadata requirements compared to standard Kafka producers and consumers. Understanding these differences is important for monitoring network usage and application performance.

Metadata fetching behavior

Kafka Streams applications fetch more comprehensive metadata than plain Kafka clients. Specifically, Kafka Streams applications need to pull metadata for all topics in the cluster, not just the topics they directly produce to or consume from. This behavior is required for the internal operations of the Kafka Streams framework.

Impact on service accounts with DataDiscovery or DataSteward roles

When service accounts configured with DataDiscovery or DataSteward roles are used with Kafka Streams applications, you may observe increased network traffic compared to applications using service accounts with more restrictive permissions.

This increase occurs because:

  • DataDiscovery and DataSteward roles provide read access to topic metadata across all topics within an environment.
  • Kafka Streams applications fetch comprehensive topic metadata as part of their normal operation.
  • The combination results in additional network requests for metadata retrieval.

Monitoring recommendations

When monitoring Kafka Streams applications using service accounts with DataDiscovery or DataSteward roles, you should:

  • Monitor network utilization metrics to understand the baseline metadata traffic.
  • Consider this additional metadata fetching when planning network capacity.
  • Use the Metrics API to track request_count and other network-related metrics for your applications, as shown in the sketch after this list.
  • Evaluate whether the DataDiscovery or DataSteward roles are necessary for your specific Kafka Streams application use case.
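
For example, the following is a minimal sketch of a Metrics API query for the request_count metric, aggregated per hour and grouped by request type so that metadata requests are visible separately. The endpoint, metric name, and field names such as metric.type are assumptions to verify against the Metrics API reference; replace the Cloud API key, secret, and cluster ID placeholders with your own values.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Base64;

    public class RequestCountQuerySketch {
        public static void main(String[] args) throws Exception {
            // Placeholder Cloud API key and secret with permission to read metrics.
            String auth = Base64.getEncoder()
                .encodeToString("<cloud-api-key>:<cloud-api-secret>".getBytes());

            // Query one day of hourly request counts for a single Kafka cluster, grouped by request type.
            String body = """
                {
                  "aggregations": [{ "metric": "io.confluent.kafka.server/request_count" }],
                  "filter": { "field": "resource.kafka.id", "op": "EQ", "value": "<kafka-cluster-id>" },
                  "granularity": "PT1H",
                  "group_by": ["metric.type"],
                  "intervals": ["2024-01-01T00:00:00Z/2024-01-02T00:00:00Z"]
                }
                """;

            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }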

For more information about role-based access control and service account configuration, see Role-based Access Control (RBAC) on Confluent Cloud and Service Accounts on Confluent Cloud.