Looking for Confluent Platform Cluster Linking docs? This page describes Cluster Linking on Confluent Cloud. If you are looking for Confluent Platform documentation, check out Cluster Linking on Confluent Platform.

Share Data Across Clusters, Regions, and Clouds

The steps below guide you through a basic topic data sharing scenario. By the end of this tutorial, you will have configured two clusters and successfully used Cluster Linking to create a mirror topic and share topic data across the clusters. You will also learn how to stop mirroring to make the topic writable, and verify that the two topics have diverged.

Prerequisites

Already have the Confluent Cloud CLI? Make sure it’s up to date. If the CLI is already installed, run ccloud update to get the latest version, which includes the new Cluster Linking commands and tools.

  • Make sure you have followed the steps under Commands and Prerequisites in the overview. These steps tell you the easiest way to get an up-to-date version of Confluent Cloud if you don’t already have it, and provide a quick overview of Cluster Linking commands.
  • Your destination cluster must be a Dedicated cluster with Public internet endpoints.
  • Your source cluster can be a Basic, Standard, or Dedicated cluster with Public internet endpoints. If you do not have these clusters already, you can create them in the Confluent Cloud UI or in the Confluent Cloud CLI.

What the Tutorial Covers

For this tutorial, you will:

  • Create two clusters, one of which will serve as the source and the other as the destination. The destination cluster must be a Dedicated cluster.
  • Set up a cluster link.
  • Create a mirror topic based on a topic on the source cluster.
  • Produce data to the source topic.
  • Consume data on the mirror topic (destination) over the link.
  • Promote the mirror topic, which will change it from read-only to read/write.

Let’s get started!

Set up two clusters

If you already have two Confluent Cloud clusters set up, one of which is a Dedicated cluster to use as the destination, you can skip to the next task.

Otherwise, set up your clusters as follows.

Tip

If you need more guidance than given below, see Create a Cluster in Confluent Cloud and Step 1: Create a Kafka cluster in Confluent Cloud in the Getting Started guide.

  1. Log on to the Confluent Cloud Console.

  2. Create two clusters in the same environment, as described in Create a Cluster in Confluent Cloud.

    At least one of these must be a Dedicated cluster, which will serve as the destination cluster.

    For example, you could create a Basic cluster called US-EAST to use as the source, and a Dedicated cluster called US-WEST to use as the destination.

  3. When you have completed these steps, you should have two clusters, similar to the following.

    ../../_images/clink-source-dest.png

Populate the source cluster

Create a topic on the source cluster.

For example, create a topic called tasting-menu on US-EAST (the Basic cluster that will act as the source).

  • To add a topic from the Web UI, navigate to the Topics page on the source cluster (US-EAST > Topics), click Add a topic, fill in the topic name, and click Create with defaults.

  • To add a topic from the Confluent Cloud CLI, log in to the CLI (ccloud login), select the environment and cluster you want to use, and enter the command ccloud kafka topic create <topic>. For example:

    ccloud kafka topic create tasting-menu
    

    More detail about working with the Confluent Cloud CLI is provided in the next tasks, so if you don’t yet know how to select an environment or cluster on the CLI, this is explained below.

Mirror a topic

Now that you have a cluster link, you can mirror topics across it, from source to destination.

  1. List the topics on the source cluster.

    ccloud kafka topic list --cluster <src-cluster-id>
    

    For example:

    $ ccloud kafka topic list --cluster lkc-7k6kj
          Name
    +--------------+
      stocks
      tasting-menu
      transactions
    
  2. Create a mirror topic.

    Choose a source topic to mirror and use your cluster link to mirror it.

    Tip

    If you don’t already have a topic in mind, create one on the source cluster now with ccloud kafka topic create <topic-name> --cluster <src-cluster-id>. If you’ve been following along, use tasting-menu.

    You create mirror topics on the destination cluster just as you would create a normal topic, but with a few extra parameters.

    ccloud kafka mirror create <topic-name> --link <link-name> --cluster <dst-cluster-id>
    

    For example:

    $ ccloud kafka mirror create tasting-menu --link usa-east-west --cluster lkc-161v5
    Created topic "tasting-menu".
    

    Note

    • The mirror topic name (on the Destination) must be the same as the Source topic name. (The mirror topic automatically takes its name from the original topic it’s based on.) Topic renaming is not yet supported.
    • Make sure that you use the Destination cluster ID in the command to create the mirror topic.

Test the mirror topic by sending data

To test that the cluster link is mirroring data, use the Confluent Cloud CLI to produce some data to the topic on the source cluster, and consume it from the mirror topic on the destination cluster.

To do this, you must associate your CLI with an API key for each cluster. These should be different API keys than the one used for the cluster link. These do not have to be associated with a service account.

For example, to create the API key for one of the clusters (if you’ve not already done so): ccloud api-key create --resource <src-or-dst-cluster-id>

To tell the CLI to use the API key associated with one or the other cluster: ccloud api-key use <api-key> --resource <src-or-dst-cluster-id>

  1. Tell the CLI to use your destination API key for the destination cluster:

    ccloud api-key use <dst-api-key> --resource <dst-cluster-id>
    

    You will get a verification that the API key is set as the active key for the given cluster ID.

    Note

    You only need to do this once; the setting persists. The CLI uses this API key whenever you perform one-time actions on your destination cluster. The key is not stored on the cluster link. If you create a cluster link with this API key, the link continues to run even if you later disable the key.

  2. Open two new command windows for a producer and consumer.

    In each of them, log on to Confluent Cloud, and make sure you are using the environment that contains both your Source and Destination clusters.

    As before, use the commands ccloud environment list, ccloud environment use <environment-ID>, and ccloud kafka cluster list to navigate and verify where you are.

  3. In one of the windows, start a producer to produce to your source topic.

    ccloud kafka topic produce <topic-name> --cluster <src-cluster-id>
    
  4. In the other window, start a consumer to read from your mirror topic.

    ccloud kafka topic consume <topic-name> --cluster <dst-cluster-id>
    
  5. Type entries to produce in the first terminal on your source and watch the messages appear in your second terminal on the mirror topic on the destination.

    ../../_images/clink-produce-consume.png

    You can even open another command window and start a consumer for the source cluster to verify that you are producing directly to the source topic. Both the source and mirror topic consumers will match, showing the same data consumed.

    Tip

    The consumer command example shown above reads data from a topic in real time. To consume from the beginning: ccloud kafka topic consume --from-beginning <topic> --cluster <cluster-id>

Stop the mirror topic

There may come a point when you want to stop mirroring your topic. For example, if you complete a cluster migration, or need to fail over to your destination cluster in a disaster event, you may need to stop mirroring topics on the destination.

You can stop mirroring on a per-topic basis. The destination’s mirror topic will stop receiving new data from the source, and become a standard, writable topic into which your producers can send data. No topics or data will be deleted, and this will not affect the source cluster.

To stop mirroring a specific mirror topic on the destination cluster, use the following command:

ccloud kafka mirror promote <mirror-topic-name> --link <link-name> --cluster <dst-cluster-id>

To stop mirroring the topic tasting-menu using the destination cluster ID from the examples:

$ ccloud kafka mirror promote tasting-menu --link usa-east-west --cluster lkc-161v5
MirrorTopicName | Partition | PartitionMirrorLag | ErrorMessage | ErrorCode
-------------------------------------------------------------------------
tasting-menu    |         0 |                  0 |              |
tasting-menu    |         1 |                  0 |              |
tasting-menu    |         2 |                  0 |              |
tasting-menu    |         3 |                  0 |              |
tasting-menu    |         4 |                  0 |              |
tasting-menu    |         5 |                  0 |              |

The absence of error messages and error codes means the operation succeeded for every partition of the topic.
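If you script the promotion, you may want to check that output automatically rather than by eye. The following is a minimal sketch, assuming the pipe-separated table layout shown in the example above; promote_errors is a hypothetical helper name, not part of the CLI.

```shell
# promote_errors: read `mirror promote` table output on stdin, print every
# partition row whose ErrorMessage or ErrorCode column is non-empty, and
# return non-zero if any such row was found.
# Assumption: columns are separated by "|" exactly as in the example output.
promote_errors() {
  awk -F'|' '
    /^MirrorTopicName/ || /^-+$/ || NF < 5 { next }   # skip header, rule, other lines
    {
      msg = $4; code = $5
      gsub(/[ \t]/, "", msg); gsub(/[ \t]/, "", code)  # strip padding before testing
      if (msg != "" || code != "") { print; bad = 1 }
    }
    END { exit bad + 0 }
  '
}
```

You could pipe the output of the promote command into promote_errors and treat a non-zero exit status as a signal to inspect the printed rows.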

What happens when you stop mirroring a topic

The mirror promote command stops new data mirroring from the source to the destination for the specified topic(s), and promotes the destination topic to a regular, writable topic.

This action is not reversible. Once you change your mirror topic to a regular topic, you cannot change it back to a mirror topic. If you want it to be a mirror topic once again, you will need to delete it and recreate it as a mirror topic.

If consumer.offset.sync.enable is on, consumer offset syncing also stops for those topic(s).

The command does not affect ACL syncing. (See (Usually Optional) Use a config File.)

How to restart mirroring for a topic

To restart mirroring for that topic, you will need to delete the destination topic, and then recreate the destination topic as a mirror.

Migration Best Practices

If you are migrating data from source to destination, and you want to make sure no lagged data is lost, you may want to stop producers first and make sure any lag is mirrored before stopping the mirror topic:

  1. Stop your producers on your source cluster.

  2. Wait for any lag to be mirrored.

    Tip

    Look at the end offsets for both the source and mirror topic (high watermark) and make sure they are both at the same offset.

  3. Run the mirror promote command.
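The "wait for lag" check in step 2 can be sketched as a comparison of per-partition end offsets. This is only an illustration: it assumes you have already collected the end offsets into two files with one "<partition> <end-offset>" pair per line (a hypothetical format), and caught_up is a hypothetical helper name.

```shell
# caught_up SRC_FILE DST_FILE
# Each file holds one "<partition> <end-offset>" pair per line (assumed format).
# Prints every partition where the mirror is missing or behind the source,
# and returns non-zero if any partition is behind.
# Note: SRC_FILE must not be empty, because the NR == FNR test relies on it.
caught_up() {
  awk '
    NR == FNR { src[$1] = $2; next }   # first file: source end offsets
    { dst[$1] = $2 }                   # second file: mirror end offsets
    END {
      behind = 0
      for (p in src) {
        if (!(p in dst) || dst[p] + 0 < src[p] + 0) {
          printf "partition %s: source=%s mirror=%s\n", p, src[p], (p in dst) ? dst[p] : "missing"
          behind = 1
        }
      }
      exit behind
    }
  ' "$1" "$2"
}
```

Run the promote command only once caught_up returns zero, that is, when every mirror partition has reached the source end offset.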

Failover Considerations

If you’re failing over from source to destination because of a disaster event, please note these considerations.

Order of actions and promoting the Destination as post-failover active cluster

You should first stop mirror topics, and then move all of your producers and consumers over to the destination cluster. The destination cluster should become your new active cluster, at least for the duration of the disaster and the recovery. If it works for your use case, we suggest making the Destination cluster your new, permanent active cluster.

Recover lagged data

There may be lagged data that did not make it to the destination before the disaster occurred. When you move your consumers, any of that data they had not already read on the source will not be available to them on the destination. If and when your source cluster recovers from the disaster, that lagged data will still be there, so you are free to consume and handle it as fits your use case.

For example, if the Source was up to offset 105, but the Destination was only up to offset 100, then the source data from offsets 101-105 will not be present on the Destination. The Destination will get new, fresh data from the producers that will go into its offsets 101-105. When the disaster resolves, the Source will still have its data from offsets 101-105 available to consume manually.
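The arithmetic in this example can be sketched as a small helper. It follows the convention used in the text, where each number is the last offset present on that cluster; lagged_range is a hypothetical name used only for illustration.

```shell
# lagged_range SRC_LAST DST_LAST
# Prints the range of source offsets that never reached the destination,
# or "none" if the destination had caught up.
lagged_range() {
  src_last=$1
  dst_last=$2
  if [ "$src_last" -gt "$dst_last" ]; then
    echo "$((dst_last + 1))-$src_last"
  else
    echo "none"
  fi
}

lagged_range 105 100   # prints 101-105
```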

Lagged consumer offsets may result in duplicate reads

There may be lagged consumer offsets that did not make it to the destination before the disaster occurred. If this is the case, then when you move your consumers to the destination, they may read duplicate data.

For example, if at the time that you stop your mirroring:

  • Consumer A had read up to offset 100 on the Source
  • Cluster Linking had mirrored the data through offset 100 to the Destination
  • Cluster Linking had last mirrored consumer offsets that showed Consumer A was only at offset 95

Then when you move Consumer A to the Destination, it may read offsets 96-100 again, resulting in duplicate reads.
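To make the three conditions concrete, here is a minimal sketch that computes which offsets a consumer would re-read. The function name and argument order are assumptions for illustration, and the numbers use the same "last offset" convention as the example.

```shell
# duplicate_reads READ_UP_TO DEST_LAST MIRRORED_COMMIT
#   READ_UP_TO       last offset the consumer had read on the source
#   DEST_LAST        last offset mirrored to the destination
#   MIRRORED_COMMIT  consumer offset last synced to the destination
duplicate_reads() {
  read_up_to=$1
  dest_last=$2
  mirrored_commit=$3
  # The consumer can only re-read records that exist on the destination.
  upper=$read_up_to
  if [ "$dest_last" -lt "$upper" ]; then
    upper=$dest_last
  fi
  if [ "$upper" -gt "$mirrored_commit" ]; then
    echo "$((mirrored_commit + 1))-$upper ($((upper - mirrored_commit)) records)"
  else
    echo "none"
  fi
}

duplicate_reads 100 100 95   # prints 96-100 (5 records)
```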

Promoting (stopping) a mirror topic clamps consumer offsets

The promote command “clamps” consumer offsets.

This means that, when you run mirror promote, if:

  • Consumer A was on source offset 105 – and that was successfully mirrored to the Destination, and
  • the data on the Destination was lagging and was only up to offset 100 (so it did not have offsets 101-105)

then when you call promote, Consumer A’s offset on the Destination will be “clamped” down to offset 100, since that is the highest available offset on the Destination.

Note that this will cause Consumer A to “re-consume” offsets 101-105. If your producers send new, fresh data to the Destination, then Consumer A will not read duplicate data. (However, if you had custom-coded your producers to re-send offsets 101-105 with the same data, then your consumers could read the same data twice. This is a rare case, and is likely not how you have designed your system.)
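Clamping amounts to taking the smaller of two numbers. The sketch below only illustrates that rule as described above; it is not the service's implementation, and clamp_offset is a hypothetical name.

```shell
# clamp_offset CONSUMER_OFFSET DEST_LAST
# Prints the consumer offset after promotion: never higher than the
# highest offset available on the destination.
clamp_offset() {
  consumer_offset=$1
  dest_last=$2
  if [ "$consumer_offset" -gt "$dest_last" ]; then
    echo "$dest_last"
  else
    echo "$consumer_offset"
  fi
}

clamp_offset 105 100   # prints 100 (clamped)
clamp_offset 97 100    # prints 97 (unchanged)
```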

Use consumer.offset.sync.ms

Keep in mind that you can configure consumer.offset.sync.ms to suit your needs (default is 30 seconds). A more frequent sync might give you a better failover point for your consumer offsets, at the cost of bandwidth and throughput during normal operation.
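As a rough back-of-the-envelope illustration of that trade-off (an estimate under stated assumptions, not a documented formula), the number of already-consumed records whose offsets may not yet be synced is about the consume rate multiplied by the sync interval. This assumes a steady consume rate and ignores how often the consumer itself commits; max_unsynced_records is a hypothetical helper.

```shell
# max_unsynced_records RATE_PER_SEC SYNC_MS
# Rough upper estimate of records a consumer group may re-read after
# failover, given its consume rate and the offset sync interval in ms.
max_unsynced_records() {
  echo $(( $1 * $2 / 1000 ))
}

max_unsynced_records 200 30000   # prints 6000 (default 30-second interval)
max_unsynced_records 200 5000    # prints 1000 (5-second interval)
```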

Migrate a consumer group

To migrate a consumer group called <consumer-group-name> from one cluster to another, stop the consumers and update the cluster link to stop mirroring the consumer offsets:

ccloud kafka link update <link-name> --cluster <src-cluster-id> --config \
consumer.offset.group.filters="{\"groupFilters\": \
[{\"name\": \"*\",\"patternType\": \"LITERAL\",\"filterType\": \"INCLUDE\"},\
{\"name\":\"<consumer-group-name>\",\"patternType\":\"LITERAL\",\"filterType\":\"EXCLUDE\"}]}"

Then, point your consumers at the destination, and they will restart at the offsets where they left off.
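Hand-escaping the filter JSON is error-prone, so it can help to generate it. The following sketch prints a filter with the same structure as the one shown above (include all groups, exclude one). exclude_group_filter is a hypothetical helper, and it assumes the group name contains no double quotes or backslashes.

```shell
# exclude_group_filter GROUP
# Prints a consumer.offset.group.filters JSON value that includes every
# consumer group except GROUP.
exclude_group_filter() {
  printf '{"groupFilters": [{"name": "*","patternType": "LITERAL","filterType": "INCLUDE"},{"name":"%s","patternType":"LITERAL","filterType":"EXCLUDE"}]}\n' "$1"
}

exclude_group_filter my-consumer-group
```

The printed JSON is the value to supply for consumer.offset.group.filters when you update the link.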

Migrate a producer

To migrate a producer:

  1. Stop the producer.

  2. Make the destination topic writable:

    ccloud kafka mirror promote <mirror-topic-name> --link <link-name> --cluster <dst-cluster-id>
    
  3. Point your producer at the destination cluster.

(Usually Optional) Use a config File

You have the option to set up a .config file, which can be useful for any of the following scenarios. This file must have a .config extension. Use your favorite text editor to add the file to your working directory.

  • A configuration file is an alternate way to pass an API key and secret to a Confluent Cloud cluster. If you store your API key and secret in a .config file, you don’t have to enter your credentials each time on the command line. Instead, use the .config file to authenticate into the source cluster.
  • If you want to add optional configuration settings to the link (like consumer group sync) then you must pass in a .config file with those properties. Note that you might still want to use the command line to specify your API key and secret, and use the configuration file simply for these additional link properties. In that case, you need only specify the link properties in the file, not the key and secret.
  • If the other cluster is not Confluent Cloud, you must use the .config file to pass in the security credentials. This is the one case where a config file is not optional.

An example of using this file to sync consumer group offsets and ACLs is provided as the last step (3) under Create a cluster link.

To use a config file to store credentials, copy this starter text into source.config and replace <src-bootstrap-url>, <src-api-key>, and <src-api-secret> with the values for your source cluster.

security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<src-api-key>' password='<src-api-secret>';

Important

  • The last entry must be all on one line, from sasl.jaas.config all the way to password='<src-api-secret>';. Do not add returns, as this will cause the configs to break.
  • The configuration options are case-sensitive. Be sure to use upper and lower case as shown in the example.
  • Use punctuation marks such as single quotes and semicolon exactly as shown.
  • To protect your credentials, delete the config file after the link is created.
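The formatting rules above are easy to get wrong when editing by hand, so one option is to generate the file and verify it. This is a minimal sketch that writes only the three properties shown in the starter text; write_source_config and check_source_config are hypothetical helper names, and restricting the file permissions with umask is an added precaution, not a documented requirement.

```shell
# write_source_config FILE API_KEY API_SECRET
# Writes the starter properties to FILE, readable by the current user only.
write_source_config() {
  (
    umask 077
    cat > "$1" <<EOF
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='$2' password='$3';
EOF
  )
}

# check_source_config FILE
# Succeeds only if the sasl.jaas.config entry sits on a single line, with
# single-quoted username and password and a closing semicolon.
check_source_config() {
  grep -Eq "^sasl\.jaas\.config=.*username='[^']*' password='[^']*';\$" "$1"
}
```

For example, run write_source_config source.config "<src-api-key>" "<src-api-secret>" with your real values, confirm that check_source_config source.config succeeds, and delete the file once the link is created.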

Suggested Resources

  • This tutorial covered a basic use case for sharing data across topics in the same or disparate clusters, regions, and clouds. Disaster Recovery and Failover provides a tutorial on how to use Cluster Linking to save your data and recover from an outage.
  • Mirror Topics provides a concept overview of this feature of Cluster Linking.