Looking for Confluent Platform Cluster Linking docs? This page describes Cluster Linking on Confluent Cloud. If you are looking for Confluent Platform documentation, check out Cluster Linking on Confluent Platform.

Hybrid Cloud and Bridge-to-Cloud

Introduction

This tutorial is a hands-on look at how to use Cluster Linking for hybrid use cases that link Confluent Platform and Confluent Cloud clusters.

You will create a deployment with data flowing in both directions:

  • From Confluent Cloud to Confluent Platform
  • From Confluent Platform to Confluent Cloud
    • This direction will require a “source initiated” cluster link, a feature introduced in Confluent Platform 7.0.0.

In both cases, Confluent Platform brokers will initiate the connection to Confluent Cloud brokers. Therefore, you will not have to open up your firewall to let Confluent Cloud connect to your Confluent Platform brokers.

To see what clusters can use Cluster Linking, see Supported Cluster Types.

../../_images/cluster-link-hybrid.png

What the tutorial covers

By the end of this tutorial, you will have configured two clusters, one on Confluent Platform and one on Confluent Cloud, and successfully used Cluster Linking to share topic data bidirectionally across the clusters, all without opening up your firewall to Confluent Cloud.

Install Confluent Platform

Download and extract Confluent Platform version 7.0.0.

The rest of the tutorial expects these variables to be set:

export CONFLUENT_HOME=<CP installation directory>
export CONFLUENT_CONFIG=$CONFLUENT_HOME/etc/kafka

Add these two lines to your .bashrc or .bash_profile so that they are executed whenever you open a terminal window.
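
For example, assuming a bash shell and that you extracted Confluent Platform to ~/confluent-7.0.0 (adjust the path to match your installation), you can append the exports, plus the bin directory on your PATH so the kafka-* commands used below resolve:

echo 'export CONFLUENT_HOME=~/confluent-7.0.0' >> ~/.bashrc
echo 'export CONFLUENT_CONFIG=$CONFLUENT_HOME/etc/kafka' >> ~/.bashrc
echo 'export PATH=$CONFLUENT_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc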

About Prerequisites and Command Examples

Ports and Configuration Mapping

The example deployment in this tutorial uses the following port and feature configurations, and assumes that services will run on localhost.

Confluent Platform      Port
Kafka brokers           9092
ZooKeeper               2181

Tip

  • These are example ports used for the purposes of this tutorial. Cluster Linking does not require these exact ports, and you may change them if needed.
  • If you have other processes using these ports, either quit the other processes, or modify the tutorial steps to use different ports.
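
If you are not sure whether another process is already using one of these ports, a quick check on most Unix-like systems (assuming lsof is installed) is:

lsof -i :2181 -i :9092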

Configure Kafka and ZooKeeper files

In $CONFLUENT_CONFIG, configure the following files to set up the Confluent Platform cluster.

Copy $CONFLUENT_CONFIG/zookeeper.properties to use as a basis for zookeeper-clusterlinking.properties.

Copy $CONFLUENT_CONFIG/server.properties to use as a basis for server-clusterlinking.properties.

File Configurations

zookeeper-clusterlinking.properties

dataDir=/tmp/zookeeper-clusterlinking
clientPort=2181

server-clusterlinking.properties

listeners=SASL_PLAINTEXT://:9092
advertised.listeners=SASL_PLAINTEXT://localhost:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.enabled.mechanisms=SCRAM-SHA-512
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
listener.name.sasl_plaintext.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="kafka" \
  password="kafka-secret";
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181
offsets.topic.replication.factor=1
confluent.license.topic.replication.factor=1
confluent.reporters.telemetry.auto.enable=false
confluent.cluster.link.enable=true
password.encoder.secret=encoder-secret

Note

  • For simplicity, this example configures only one ZooKeeper node and one Confluent Server broker, using SASL/SCRAM authentication without SSL/TLS encryption. This is fine for testing on your local machine, but in a production setting you should have more ZooKeeper nodes and brokers, spread across different machines for fault tolerance and high availability, and secured with both authentication and encryption.
  • For this example, the replication factors for important internal topics are set to 1, because this is a testing setup with only one broker. For production deployments, do not set the replication factor of these topics to 1. Generally, replication factors should be set to 3 or more, depending on the number of brokers.
  • The parameter password.encoder.secret is needed to encrypt the credentials that will be stored in the cluster link.

Start the Confluent Platform cluster

Run the following commands in separate command windows.

ZooKeeper and Confluent Server commands do not “complete” until you stop them, so these windows need to stay open while the applications are running.

Use another command window to serve as your main terminal in which to run commands that you expect to complete. (Examples of these are kafka-configs, kafka-topics, kafka-cluster-links, and in certain cases kafka-console-producer and kafka-console-consumer, although sometimes you may want to leave these last two running as well.)

../../_images/cluster-link-hybrid-command-windows.png
  1. In a new command window, start the ZooKeeper server for the Confluent Platform cluster.

    zookeeper-server-start $CONFLUENT_CONFIG/zookeeper-clusterlinking.properties
    
  2. Run commands to create SASL SCRAM credentials on the cluster for two users: one to be used by the Kafka cluster, and the other for running commands against the cluster.

    • Run this command to create credentials on the cluster for a user called “kafka” that will be used by the Kafka cluster itself.

      kafka-configs --zookeeper localhost:2181 --alter --add-config \
        'SCRAM-SHA-512=[iterations=8192,password=kafka-secret]' \
        --entity-type users --entity-name kafka
      
    • Run this command to create credentials on the cluster for a user called “admin” that you will use to run commands against this cluster.

      kafka-configs --zookeeper localhost:2181 --alter --add-config \
        'SCRAM-SHA-512=[iterations=8192,password=admin-secret]' \
        --entity-type users --entity-name admin
      
  3. Create a file with the admin credentials to authenticate when you run commands against the Confluent Platform cluster.

    Open a text editor, create a file called $CONFLUENT_CONFIG/CP-command.config, and paste in the following content, which supplies the “admin” SCRAM credentials created in the previous step:

    sasl.mechanism=SCRAM-SHA-512
    security.protocol=SASL_PLAINTEXT
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
      username="admin" \
      password="admin-secret";
    
  4. In a new command window, start a Confluent Server broker for the Confluent Platform cluster. The broker picks up its SASL credentials from the JAAS configuration in the properties file, so you do not pass credentials on the command line.

    kafka-server-start $CONFLUENT_CONFIG/server-clusterlinking.properties
    
  5. Get the Confluent Platform cluster ID.

    kafka-cluster cluster-id --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    

    Your output should resemble:

    Cluster ID: G1pnOMOxSjWYIX8xuR2cfQ
    

    In this case, G1pnOMOxSjWYIX8xuR2cfQ is the Confluent Platform cluster ID, referred to in these examples as <CP_CLUSTER_ID>.
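
If you want to confirm that the SCRAM credentials from step 2 are registered before moving on, one quick check (against the same local ZooKeeper) is to describe the user entity:

kafka-configs --zookeeper localhost:2181 --describe --entity-type users --entity-name admin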

Start the Confluent Cloud cluster

You need a Dedicated Confluent Cloud cluster with public internet networking in order to run the rest of the commands. You can create one just for the purposes of this tutorial, and then delete it when you are done. You will incur charges for this cluster.

You can create a new Dedicated cluster from the Confluent Cloud web UI or directly from the Confluent CLI with this command:

confluent kafka cluster create CLOUD-DEMO --type dedicated --cloud aws --region us-east-1 --cku 1 --availability single-zone

You must wait until the new cluster is provisioned, which typically takes a few minutes, but can take longer. You can also use an existing cluster.

  1. Log in to Confluent Cloud using the Confluent CLI.

    confluent login
    
  2. View environments, and select the one you want to use by environment ID.

    confluent environment list
    

    An asterisk indicates the currently selected environment in the list. You can select a different environment as follows.

    confluent environment use <environment-ID>
    
  3. View your clusters.

    confluent kafka cluster list
    

    An asterisk indicates the currently selected cluster. You can select a different cluster as follows:

    confluent kafka cluster use <CC-CLUSTER-ID>
    

    Tip

    You can get information or take several types of actions on a cluster that is not currently selected by specifying its cluster ID. For example, confluent kafka cluster describe <cluster-ID>.

  4. Note the cluster ID for your Dedicated cluster, referred to as <CC-CLUSTER-ID> in this tutorial.
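
Later commands substitute this value wherever <CC-CLUSTER-ID> appears. If you prefer, you can keep it in a shell variable so it is easy to paste into commands; lkc-123abc below is a placeholder, so use your own cluster ID:

export CC_CLUSTER_ID=lkc-123abc

confluent kafka cluster describe $CC_CLUSTER_ID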

Populate the Confluent Platform cluster

These commands use the Confluent Platform CLI.

  1. Create a topic on the Confluent Platform cluster with a single partition so ordering is easier to see.

    kafka-topics --create --topic from-on-prem --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    

    You should get confirmation that the topic was successfully created.

    Created topic from-on-prem.
    

    You can get a list of existing topics as follows:

    kafka-topics --list --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    

    And get detailed information on a topic with the --describe option:

    kafka-topics --describe --topic from-on-prem --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    
  2. Send some messages to the from-on-prem topic on the source cluster, and fill it with data.

    seq 1 5 | kafka-console-producer --topic from-on-prem --bootstrap-server localhost:9092 --producer.config $CONFLUENT_CONFIG/CP-command.config
    

    The command should terminate without any output.

  3. Consume from the topic on the source cluster.

    In a new terminal, run a consumer to consume messages from the from-on-prem topic.

    kafka-console-consumer --topic from-on-prem --from-beginning --bootstrap-server localhost:9092 --consumer.config $CONFLUENT_CONFIG/CP-command.config
    

    If the consumer successfully reads the messages, your output will be:

    1
    2
    3
    4
    5
    

    Press Ctrl+C to stop the consumer and get the prompt back.
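
If you would rather have the consumer exit on its own instead of stopping it manually, you can bound the read with the --max-messages option, for example:

kafka-console-consumer --topic from-on-prem --from-beginning --max-messages 5 --bootstrap-server localhost:9092 --consumer.config $CONFLUENT_CONFIG/CP-command.config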

Set up privileges for the Confluent Cloud cluster

  1. Create a user API key for your Confluent Cloud cluster, which will act as the destination for Confluent Platform to Confluent Cloud topic data mirroring.

    confluent api-key create --resource <CC-CLUSTER-ID>
    
  2. Save the resulting API key and secret in a safe place. This tutorial refers to these as <CC-link-api-key> and <CC-link-api-secret>. This is the API key and secret associated with the Confluent Cloud cluster that you will use to create the Confluent Platform to Confluent Cloud link. You will add these to a configuration file in the next step.

    Important

    If you are setting this up in production, you should use a service account API key instead of a user-associated key. A guide on how to set up privileges to access Confluent Cloud clusters with a service account is provided in the topic data sharing tutorial.
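
For reference, a production-style sketch with the Confluent CLI might look like the following; the service account name is a placeholder, and the account still needs appropriate ACLs or role bindings on the cluster:

confluent iam service-account create cluster-link-demo --description "Owns the cluster link API key"

confluent api-key create --service-account <service-account-ID> --resource <CC-CLUSTER-ID>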

Teardown

Promote mirror topics

Promote the mirror topics to normal topics.

  1. On Confluent Cloud, promote the mirror topic called from-on-prem:

    confluent kafka mirror promote from-on-prem --link from-on-prem-link --cluster <CC-CLUSTER-ID>
    

    Your output will resemble:

     Mirror Topic Name | Partition | Partition Mirror Lag | Error Message | Error Code | Last Source Fetch Offset
    +-------------------+-----------+----------------------+---------------+------------+--------------------------+
     from-on-prem      |         0 |                    0 |               |            |                        9
    

    If you want to verify that the mirroring stopped, you can re-run the above command. You should get a message in the Error Message column that Topic 'from-on-prem' has already stopped its mirror from 'from-on-prem-link'.

  2. On Confluent Platform, promote the mirror topic called cloud-topic:

    kafka-mirrors --promote --topics cloud-topic --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    

    Your output should resemble:

    Calculating max offset and ms lag for mirror topics: [cloud-topic]
    Finished calculating max offset lag and max lag ms for mirror topics: [cloud-topic]
    Request for stopping topic cloud-topic's mirror was successfully scheduled. Please use the describe command with the --pending-stopped-only option to monitor progress.
    

    If you retry this command, you will get an error indicating that the Topic 'cloud-topic' has already stopped its mirror 'from-cloud-link'.
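
As that output suggests, you can monitor the promotion from the Confluent Platform side with the describe command; the --pending-stopped-only option limits the output to mirror topics whose stop is still in progress:

kafka-mirrors --describe --topics cloud-topic --pending-stopped-only --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config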

Delete the source and mirror topics

  1. Delete the topics on Confluent Cloud.

    confluent kafka topic delete cloud-topic
    
    confluent kafka topic delete from-on-prem
    
  2. Delete the topics from Confluent Platform.

    kafka-topics --delete --topic cloud-topic --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    
    kafka-topics --delete --topic from-on-prem --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config
    

Stop consumers, producers, and Confluent Platform

Run the following shutdown and cleanup tasks.

  1. Stop consumers and producers with Ctrl+C in their respective command windows.
  2. Stop all of the other components with Ctrl+C in their respective command windows, in the reverse order in which you started them. For example, stop the Kafka brokers first, and then stop the associated ZooKeeper servers.
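
Alternatively, Confluent Platform ships stop scripts that you can run from your main terminal, again stopping the broker before ZooKeeper:

kafka-server-stop

zookeeper-server-stop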