Managing Streams Application Topics¶
A Kafka Streams application continuously reads from Apache Kafka® topics, processes the read data, and then writes the processing results back into Kafka topics. The application may also auto-create other Kafka topics in the Kafka brokers, for example state store changelog topics. This section describes the differences between these topic types and how to manage the topics and your applications.
Kafka Streams distinguishes between user topics and internal topics.
User topics¶
User topics exist externally to an application and are read from or written to by the application, including:
- Input topics
  - Topics that are specified via source processors in the application’s topology; e.g. via StreamsBuilder#stream(), StreamsBuilder#table() and Topology#addSource().
- Output topics
  - Topics that are specified via sink processors in the application’s topology; e.g. via KStream#to(), KTable.to() and Topology#addSink().
- Intermediate topics
  - Topics that are both input and output topics of the application’s topology.
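The three kinds of user topics can be illustrated with a small topology. This is a minimal sketch, not a recommended design: the topic names and the class name UserTopicsSketch are hypothetical, and the code assumes the kafka-streams library is on the classpath and that all three topics already exist on the brokers.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

final class UserTopicsSketch {
    // Hypothetical topic names, chosen only for illustration.
    static final String INPUT_TOPIC = "orders-input";
    static final String INTERMEDIATE_TOPIC = "orders-normalized";
    static final String OUTPUT_TOPIC = "orders-output";

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Source processor: INPUT_TOPIC is an input topic.
        KStream<String, String> orders =
            builder.stream(INPUT_TOPIC, Consumed.with(Serdes.String(), Serdes.String()));

        // Sink processor: the topology writes to INTERMEDIATE_TOPIC ...
        orders.mapValues(value -> value == null ? null : value.trim())
              .to(INTERMEDIATE_TOPIC, Produced.with(Serdes.String(), Serdes.String()));

        // ... and reads it back, so it is both an output and an input topic.
        builder.stream(INTERMEDIATE_TOPIC, Consumed.with(Serdes.String(), Serdes.String()))
               .filter((key, value) -> value != null && !value.isEmpty())
               // Sink processor: OUTPUT_TOPIC is an output topic.
               .to(OUTPUT_TOPIC, Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        // Prints the source and sink processors together with their topics.
        System.out.println(buildTopology().describe());
    }
}
```

Printing the topology description is a quick way to check which topics an application expects to find before you start it.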
User topics must be created and manually managed ahead of time (e.g., via the topic tools). If user topics are shared among multiple applications for reading and writing, the application users must coordinate topic management. If user topics are centrally managed, application users do not need to manage the topics themselves; they only need to obtain access to them.
Note
You should not use the auto-create topic feature on the brokers (e.g., auto.create.topics.enable=true in the Kafka broker configuration) to create user topics, because:
- Auto-creation of topics may be disabled in your Kafka cluster.
- Auto-creation automatically applies the default topic settings such as the replication factor. These default settings might not be what you want for certain output topics.
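As an alternative to the command-line topic tools, user topics can also be created ahead of time from code with the Kafka Admin client. The following is a minimal sketch under several assumptions: the kafka-clients library is on the classpath, a broker is reachable at localhost:9092, and the topic names, partition counts, replication factors, and class name are hypothetical values that you would replace with your own.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

final class CreateUserTopics {
    // Explicit settings per topic, instead of the broker defaults that
    // auto-creation would apply. All values here are illustrative.
    static List<NewTopic> userTopics() {
        return List.of(
            new NewTopic("orders-input", 6, (short) 3),
            new NewTopic("orders-output", 6, (short) 3)
                .configs(Map.of("min.insync.replicas", "2")));
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address; adjust for your cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Blocks until the brokers confirm creation; fails if a topic
            // already exists or the client lacks permission.
            admin.createTopics(userTopics()).all().get();
        }
    }
}
```

Creating the topics explicitly makes the partition count and replication factor a deliberate choice and works even when auto-creation is disabled in the cluster.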
Internal topics¶
Internal topics are used internally by the Kafka Streams application while executing, for example the changelog topics for state stores. These topics are created by the application and are used only by that Streams application.
Note
From the broker perspective, internal topics are regular topics, in contrast to “broker internal” topics like __consumer_offsets or __transaction_state.
If security is enabled on the Kafka brokers, you must grant the underlying clients admin permissions so that they can create internal topics. For more information, see Kafka Streams Security.
Note
The internal topics follow the naming convention <application.id>-<operatorName>-<suffix>, but this convention is not guaranteed for future releases. For more information about configuring parameters for internal topics, see Internal topic parameters.
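The naming convention can be sketched as plain string composition. The helper class below is hypothetical and is not part of the Kafka Streams API. It assumes the suffixes "changelog" and "repartition" for the two kinds of internal topics mentioned in this section, and, as the note says, the convention may change in future releases, so treat the result as a guide rather than a contract.

```java
final class InternalTopicNames {
    // Composes <application.id>-<operatorName>-<suffix>.
    static String internalTopic(String applicationId, String operatorName, String suffix) {
        return applicationId + "-" + operatorName + "-" + suffix;
    }

    // Assumed suffix for state store changelog topics.
    static String changelogTopic(String applicationId, String storeName) {
        return internalTopic(applicationId, storeName, "changelog");
    }

    // Assumed suffix for repartition topics.
    static String repartitionTopic(String applicationId, String operatorName) {
        return internalTopic(applicationId, operatorName, "repartition");
    }

    public static void main(String[] args) {
        // "orders-app" and "counts-store" are illustrative names.
        System.out.println(changelogTopic("orders-app", "counts-store"));
        System.out.println(repartitionTopic("orders-app", "by-customer"));
    }
}
```

A helper like this can be handy when writing ACLs or monitoring rules for an application, since every internal topic name starts with the application ID.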
If automatic topic creation has been disabled on the brokers, Kafka Streams applications continue to work: they create their internal topics through the Admin Client rather than through broker auto-creation. Even if automatic topic creation is enabled, internal topics are created with the specific number of partitions and the replication factor specified in the StreamsConfig.
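The replication factor for internal topics can be set in the application's configuration. The sketch below uses literal property keys with java.util.Properties so that it has no dependencies; in a real application you would more likely use StreamsConfig.REPLICATION_FACTOR_CONFIG and StreamsConfig.topicPrefix(...). The application ID, broker address, chosen values, and class name are assumptions made for illustration.

```java
import java.util.Properties;

final class InternalTopicSettings {
    static Properties streamsProperties() {
        Properties props = new Properties();
        // Hypothetical application ID and broker address.
        props.put("application.id", "orders-app");
        props.put("bootstrap.servers", "localhost:9092");

        // Replication factor applied to the internal topics that the
        // application creates (changelog and repartition topics).
        props.put("replication.factor", 3);

        // Settings with the "topic." prefix are passed on as topic-level
        // configuration when internal topics are created.
        props.put("topic.min.insync.replicas", "2");
        return props;
    }

    public static void main(String[] args) {
        streamsProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties would then be passed to the KafkaStreams constructor together with the topology.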
The following settings apply to the default configuration for internal topics:
- For all internal topics, message.timestamp.type is set to CreateTime.
- For internal repartition topics, the cleanup policy (cleanup.policy) is delete and the retention time is -1 (infinite).
- For internal changelog topics for key-value stores, the cleanup policy is compact.
- For internal changelog topics for windowed key-value stores, the cleanup policy is delete,compact. The retention time is set to 24 hours plus your setting for the windowed store.
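The defaults listed above can be written out as topic-level configuration maps, which also makes the retention arithmetic for windowed stores explicit. This is an illustrative sketch only: the class and method names are hypothetical, and the maps mirror the list above rather than being read from Kafka Streams itself.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

final class InternalTopicDefaults {
    // Extra retention added on top of the windowed store's own setting.
    static final Duration WINDOWED_CHANGELOG_EXTRA = Duration.ofHours(24);

    // Setting shared by all internal topics.
    private static Map<String, String> base() {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("message.timestamp.type", "CreateTime");
        return config;
    }

    static Map<String, String> repartitionTopic() {
        Map<String, String> config = base();
        config.put("cleanup.policy", "delete");
        config.put("retention.ms", "-1"); // infinite retention
        return config;
    }

    static Map<String, String> keyValueChangelog() {
        Map<String, String> config = base();
        config.put("cleanup.policy", "compact");
        return config;
    }

    static Map<String, String> windowedChangelog(Duration storeRetention) {
        Map<String, String> config = base();
        config.put("cleanup.policy", "delete,compact");
        // 24 hours plus the retention configured for the windowed store.
        long retentionMs = storeRetention.plus(WINDOWED_CHANGELOG_EXTRA).toMillis();
        config.put("retention.ms", Long.toString(retentionMs));
        return config;
    }

    public static void main(String[] args) {
        // A windowed store that retains one hour of data.
        System.out.println(windowedChangelog(Duration.ofHours(1)));
    }
}
```

For a windowed store that retains one hour of data, the changelog retention works out to 25 hours, that is 90,000,000 ms.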
Note
This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.