Topic Operations

A Kafka topic is the fundamental unit of organization in Kafka. Adding, modifying, and deleting topics are operations you will perform on a regular basis. You can either add topics manually with the kafka-topics.sh tool or set them up to be created automatically when data is first published to a non-existent topic. You can also use this tool to describe the current state of a topic or to delete a topic. Use the kafka-configs.sh tool to modify settings for a topic that already exists.

This page provides more information on how to use these tools.

Confluent Tip

The Confluent CLI with the kafka topic subcommand makes it easier to create, modify, and delete topics from the command line. For more information, see confluent kafka topic.

Alternatively, you can use the Kafka CLI tools provided when you install Confluent Platform. Confluent’s versions of these tools are located under $CONFLUENT_HOME/bin. Confluent has dropped the .sh extensions, so you do not need to use the extensions when calling the Confluent versions of these tools. For more information, see CLI Tools for Confluent Platform.
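If the same script has to work with both naming styles, one option is to check which name is available before calling the tool. The following is a minimal sketch, not an official helper: the KAFKA_TOPICS variable is a name chosen for this example, and the fallback path assumes the script runs from the Kafka installation directory.

```shell
# Prefer the extension-less name if it is on PATH (Confluent Platform);
# otherwise fall back to the Apache Kafka script name used on this page.
# KAFKA_TOPICS is a variable name made up for this example.
if command -v kafka-topics >/dev/null 2>&1; then
  KAFKA_TOPICS="kafka-topics"
else
  KAFKA_TOPICS="bin/kafka-topics.sh"
fi

echo "Using: $KAFKA_TOPICS"
```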

Add a topic

You can use the kafka-topics.sh tool to create new topics and specify topic configurations that override the broker’s default configuration settings, such as data retention time.

Following are two key configuration properties.

  • replication-factor specifies the number of servers a message will be written to. You should use a replication factor of 2 or 3 so that you can transparently bounce machines without interrupting data consumption. For example, if you specify a replication factor of 3 then up to 2 servers can fail before you will lose access to the data.
  • partitions controls how many event logs the topic is sharded into. Partition count has two main impacts. First, each partition must fit entirely on a single server. So, if you have 20 partitions, the full data set, along with its read and write load, will be handled by no more than 20 servers (not counting replicas). Second, the partition count limits the maximum parallelism of your consumers. To learn more about partition count, see Kafka Replication and Committed Messages and Choose and Change the Partition Count.
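The arithmetic behind these two properties can be sketched in a few lines of shell. The variable names below are illustrative only and are not read by any Kafka tool. The consumer limit assumes a single consumer group, in which each partition is read by at most one consumer.

```shell
# Illustrative values only; these variables are not read by any Kafka tool.
replication_factor=3
partitions=20

# With N replicas, up to N-1 servers can fail before access to the data is lost.
tolerated_failures=$((replication_factor - 1))

# Within one consumer group, each partition is read by at most one consumer,
# so the partition count caps the number of consumers that receive data.
max_active_consumers=$partitions

echo "tolerated server failures: $tolerated_failures"
echo "max active consumers per group: $max_active_consumers"
```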

The complete set of per-topic configurations is documented in topic level configurations.

The following command shows how to create a topic, specifying the partitions, replication factor, and an additional configuration value.

bin/kafka-topics.sh --bootstrap-server <host:port> --create --topic <topic-name> \
--partitions 20 --replication-factor 3 --config <configName>=<configValue>
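As a concrete instance of the command above, the following sketch fills in the placeholders with assumed values: a broker at localhost:9092, a hypothetical topic named orders, and a retention.ms override of seven days. It prints the command instead of running it, so the values can be reviewed first.

```shell
# Assumptions: the broker address and topic name are placeholders for illustration.
BOOTSTRAP_SERVER="localhost:9092"
TOPIC="orders"

# retention.ms is expressed in milliseconds: 7 days * 24 h * 60 min * 60 s * 1000 ms.
RETENTION_MS=$((7 * 24 * 60 * 60 * 1000))

# Print the command for review; remove the leading "echo" to run it against a cluster.
echo bin/kafka-topics.sh --bootstrap-server "$BOOTSTRAP_SERVER" --create --topic "$TOPIC" \
  --partitions 20 --replication-factor 3 --config retention.ms="$RETENTION_MS"
```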

Describe a topic

You can describe a topic using the kafka-topics.sh tool. The following command shows how to describe a topic:

bin/kafka-topics.sh --bootstrap-server <host:port> --describe --topic <topic-name>

The tool output lists the configuration properties and values for the topic.

Change the retention value for a topic

If topics are auto-created, you may want to tune the default topic configurations applied to them.

To modify most configuration settings for a topic, you can use the kafka-configs.sh tool, which can be found in the bin directory of your Kafka installation. You can change how the data in a topic is retained by changing the following topic properties:

  • retention.ms - Sets retention by time
  • retention.bytes - Sets retention by size

The following example shows how to change topic retention. This command sets the retention size to 500 MB (524288000 bytes) and the retention time to -1, which means no time limit.

bin/kafka-configs.sh --bootstrap-server <host:port> --entity-type topics --entity-name <topic-name> --alter --add-config retention.ms=-1,retention.bytes=524288000
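The byte value in this command can be derived rather than typed by hand. The sketch below assumes 1 MB means 1024 * 1024 bytes, which is what yields 524288000. It also prints a --describe command that can be used to check the topic's settings after the change. The RETENTION_BYTES variable name is illustrative.

```shell
# 500 MB expressed in bytes, assuming 1 MB = 1024 * 1024 bytes.
RETENTION_BYTES=$((500 * 1024 * 1024))
echo "retention.bytes=$RETENTION_BYTES"

# Print a command that shows the topic's overridden settings after the change.
# <host:port> and <topic-name> are left as placeholders to fill in.
echo "bin/kafka-configs.sh --bootstrap-server <host:port> --entity-type topics --entity-name <topic-name> --describe"
```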

Increase partitions for a topic

To increase the number of partitions for a topic, you can use the kafka-topics.sh tool.

Remember that partitions can be used to semantically partition data, and adding partitions doesn’t change the partitioning of existing data. This means that when you add partitions, you may disturb consumers that rely on a particular partition. For example, if data is partitioned by hash(key) % number_of_partitions, new records with a given key may map to a different partition after partitions are added, while records already written stay where they are.

bin/kafka-topics.sh --bootstrap-server <host:port> --topic <topic-name> --alter --partitions <number>
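To see why a key can move when the partition count changes, the following toy sketch applies hash(key) % number_of_partitions with two different counts. It uses the cksum CRC as a stand-in hash, which is an assumption made for illustration and is not the hash that Kafka's partitioner uses. The partition_for function and the key names are hypothetical.

```shell
# Toy stand-in for a partitioner: CRC checksum of the key modulo the partition count.
# Assumption: this is NOT the hash Kafka uses; it only illustrates how
# hash(key) % number_of_partitions depends on the partition count.
partition_for() {
  key=$1
  count=$2
  # cksum prints "<crc> <byte-count>" for stdin; keep only the CRC.
  hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $((hash % count))
}

# Compare the partition chosen for each key before and after adding partitions.
for key in user-1 user-2 user-3 user-4; do
  echo "$key: 20 partitions -> $(partition_for "$key" 20), 30 partitions -> $(partition_for "$key" 30)"
done
```

Any key whose two results differ would be written to a different partition after the increase, even though its earlier records remain in the original partition.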

Delete a topic

Use the kafka-topics.sh tool to delete a topic from a cluster. The following command shows how to delete a topic.

bin/kafka-topics.sh --bootstrap-server <host:port> --delete --topic <topic-name>

Note

This website includes content developed at the Apache Software Foundation under the terms of the Apache License v2.