Cluster Linking on Confluent Cloud
What is Cluster Linking?
Cluster Linking on Confluent Cloud is a fully-managed service for moving data
from one Confluent cluster to another. Programmatically, it creates perfect
copies of your topics and keeps data in sync across clusters. Cluster Linking is
a powerful geo-replication technology for:
- Multi-cloud and global architectures powered by real-time data in motion
- Data sharing between different teams and lines of business
- High Availability (HA)/Disaster Recovery (DR) during a regional cloud provider outage
- Data and workload migration from an Apache Kafka® cluster to Confluent Cloud
Cluster Linking is fully-managed in Confluent Cloud, so you don’t need to manage
or tune data flows. Its usage-based pricing puts multi-cloud and multi-region
costs into your control. Cluster Linking reduces operational burden and cloud
egress fees, while improving the performance and reliability of your cloud data
How it Works
Cluster Linking allows one Confluent cluster to mirror data directly from
another. You can establish a cluster link between a source cluster and a
destination cluster in a different region, cloud, line of business, or
organization. You choose which topics to replicate from the source cluster to
the destination. You can even mirror consumer offsets and ACLs, making it
straightforward to move Kafka consumers from one cluster to another.
In one command or API call, you can create a cluster link from one cluster to
another. A cluster link acts as a persistent bridge between the two clusters.
confluent kafka link create tokyo-sydney
To mirror data across the cluster link, you create mirror topics on your destination cluster.
confluent kafka mirror create clickstream.tokyo
Mirror topics are a special kind of topic: they are read-only copies of their
source topic. Any messages produced to the source topic are mirrored to the
mirror topic “byte-for-byte,” meaning that the same messages go to the same
partition and same offset on the mirror topic. Mirror topics can be consumed
just like any other topic.
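As a rough sketch of how the two commands fit together, the script below creates the link and then a mirror topic on it. The cluster IDs and bootstrap address are placeholders, not real resources. The --cluster, --source-cluster-id, and --source-bootstrap-server flags are the ones used later on this page; --link on mirror create is assumed from the Confluent CLI, so check confluent kafka mirror create --help for your version. Source credentials, which a real link also needs, are left out. With DRY_RUN left at 1, the script only prints the commands.

```shell
#!/bin/sh
# Placeholder values (hypothetical): substitute your own cluster IDs and endpoint.
DESTINATION_ID="lkc-dest01"
ORIG_ID="lkc-src001"
ORIG_BOOT="SASL_SSL://source.example.com:9092"

# Print each command when DRY_RUN=1 (the default); execute it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Create the cluster link on the destination cluster.
run confluent kafka link create tokyo-sydney \
  --cluster "$DESTINATION_ID" \
  --source-cluster-id "$ORIG_ID" \
  --source-bootstrap-server "$ORIG_BOOT"

# 2. Create a mirror topic on the destination; it takes the source topic's name.
run confluent kafka mirror create clickstream.tokyo \
  --link tokyo-sydney \
  --cluster "$DESTINATION_ID"
```

Set DRY_RUN=0 only after reviewing the printed commands against your own environment.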
Cluster links and mirror topics are the building blocks you can use to create
scalable, consistent architectures across regions, clouds, teams, and organizations.
Cluster Linking replicates essential metadata.
- Cluster Linking applies the best practice of syncing topic configurations between the source and mirror topics. (Certain configurations are synced, others are not.)
- You can enable consumer offset sync, which will sync consumer offsets from the source topic to the mirror topic (only for mirror topics), and you can filter to specific consumer groups if desired.
- You can enable ACL sync, which will sync all ACLs on the cluster (not just for mirror topics). You can filter based on the topic name or the principal name, as needed.
These features are covered in the various Tutorials.
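The sync options above are set as properties on the cluster link. The following is a minimal sketch of a link configuration file that turns on consumer offset sync and ACL sync. The file name is hypothetical, and the property names are assumed from the Cluster Linking configuration reference, so verify them for your version. The filter properties that select consumer groups and ACLs are typically also needed and are omitted here.

```shell
#!/bin/sh
# Hypothetical file name; any path works.
LINK_CONFIG="${TMPDIR:-/tmp}/clink-sync-example.properties"

# Property names are assumptions to verify against the configuration reference.
cat > "$LINK_CONFIG" <<'EOF'
consumer.offset.sync.enable=true
acl.sync.enable=true
EOF

# Show the properties that would be supplied when the link is created.
cat "$LINK_CONFIG"
```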
Confluent provides multi-cloud, multi-region, and hybrid capabilities in Confluent Cloud.
Some of these are demo’ed in the Tutorials.
- Global and Multi-Cloud Replication: Move and aggregate real-time data across regions and clouds.
By making geo-local reads of real-time data possible, this can act like a content delivery network (CDN)
for your Kafka events throughout the public cloud, private cloud, and at the edge.
- Data Sharing - Share data in real-time with other teams, lines-of-business, or organizations.
- Data Migration - Migrate data and workloads from one cluster to another.
- Disaster Recovery and High Availability - Create a disaster recovery cluster, and fail over to it during an outage.
Cluster Linking mirroring throughput (the bandwidth used to read data or write data to your cluster) is counted against your Limits per CKU.
Supported Cluster Types
A cluster link sends data from a “source cluster” to a “destination cluster”. The supported cluster types are shown in the table below.
The source cluster can be Kafka or Confluent Server or Confluent Cloud running version 2.4 of the
the destination cluster must be either a Confluent Cloud dedicated cluster or Confluent Server,
which is bundled with Confluent Enterprise.
Unsupported cluster types and other limits are described in Limitations.
The source cluster and destination cluster can be in different regions, cloud providers, Confluent Cloud environments, or Confluent Cloud organizations.
How to Check the Cluster Type
To check a Confluent Cloud cluster’s type and endpoint type:
Log on to Confluent Cloud.
Select an environment.
Select a cluster.
The cluster type is shown on the summary card for the cluster.
Alternatively, click into the cluster, and select Cluster settings from the left menu.
The cluster type is shown on the summary card for “Cluster type”.
From the left menu under Cluster overview for a dedicated cluster, click the Networking menu item to view the endpoint type.
Only Dedicated clusters have the Networking tab; Basic and Standard
clusters always have Internet networking. Networking is defined when you first create the Dedicated cluster.
Confluent Cloud clusters that use Cluster Linking are charged based on the number of cluster links
and the volume of mirroring throughput to or from the cluster.
For a detailed breakdown of how Cluster Linking is billed, including guidelines for using metrics to track your costs,
see Cluster Linking in Confluent Cloud Billing.
More general information about Confluent Cloud prices is available on the Confluent Cloud pricing page.
About Preview Features
Cluster Linking on Confluent Cloud is now in general availability. However, the
following included features are being introduced in preview mode to gain early feedback
from developers. They can be used for evaluation and non-production
testing purposes or to provide feedback to Confluent. Comments, questions, and
suggestions related to preview features are encouraged and can be submitted to
Just getting started with Cluster Linking? Here are a few suggestions for next steps.
To get started, try one or more tutorials, each of which maps to a use case.
Read-only mirror topics that reflect the data in original (source) topics are the building blocks of Cluster Linking.
For a deep dive on this specialized type of topic and how it works, see Mirror Topics.
Commands and Prerequisites
The destination cluster can use the
confluent kafka link command to create a link from the source cluster.
The following prerequisite steps are needed to run the tutorials during the Preview.
To try out Cluster Linking on Confluent Cloud:
Install Confluent Cloud
if you do not already have it.
As a shortcut alternative to installing from the web, you can get Confluent Cloud
in two commands in your terminal window. (Replace ~/.local/bin in
both commands with a different directory, if you wish.)
curl -L --http1.1 https://cnfl.io/ccloud-cli | sh -s -- -b ~/.local/bin
To learn more about Confluent Cloud in general, see Quick Start for Apache Kafka using Confluent Cloud.
Log on to Confluent Cloud.
Update your Confluent CLI to be sure you have an up-to-date version
of the Cluster Linking commands. See Get the latest version of the Confluent CLI in the quick start for details.
The confluent kafka link command has the following subcommands or flags.
|create|Create a new cluster link.
|delete|Delete a previously created cluster link.
|describe|Describe an existing cluster link.
|list|List existing cluster links.
|update|Update a property for an existing cluster link.
The confluent kafka mirror command has the following subcommands or flags.
|describe|Describe a mirror topic.
|failover|Fail over the mirror topics.
|list|List all mirror topics in the cluster or under the given cluster link.
|pause|Pause the mirror topics.
|promote|Promote the mirror topics.
|Updates a property for an existing cluster link.
Follow the tutorials to try out Cluster Linking. The commands are demo’ed in the tutorials.
Pro Tips for the CLI
A list of Confluent Cloud CLI commands is available here.
Following are some generic strategies for saving time on command line workflows.
Save command output to a text file
To keep track of information, save the output of the Confluent Cloud commands to a text
file. If you do so, be sure to safeguard API keys and secrets afterwards by
deleting the file or moving only the security codes to safer storage. To
redirect command output to a file, you can use either of these methods and
manually add in headings for organization:
- To redirect output to a file, use Linux syntax such as
<command> > notes.txt to run the first command
and create the notes file, and then
<command> >> notes.txt to append further output.
- To send output to a file and also view it on-screen (recommended), use
<command> | tee notes.txt to
run the first command and create the file. Thereafter, use the
tee command with the -a flag
to append; for example,
<command> | tee -a notes.txt.
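The tee workflow above can be tried without touching a cluster. In this small sketch, echo stands in for a Confluent command, and the notes file path is a hypothetical choice.

```shell
#!/bin/sh
NOTES="${TMPDIR:-/tmp}/clink-notes.txt"

# First command: create (or overwrite) the notes file and still see the output.
echo "output of first command" | tee "$NOTES"

# Later commands: -a appends instead of overwriting.
echo "output of second command" | tee -a "$NOTES"
```

Remember that anything captured this way, including API keys and secrets, stays in the file until you delete it.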
Use configuration files to store data you will use in commands
Create configuration files to store API keys and secrets, detailed configurations on cluster links,
or security credentials for clusters external to Confluent Cloud. Examples of this are provided in
(Usually Optional) Use a config File in the topic data sharing tutorial and in Create the cluster link
for the disaster recovery tutorial.
Use environment variables to store resource information
You can streamline your command line workflows by saving permissions and cluster data in shell environment variables.
Save API keys and secrets, resources such as IDs for environments, clusters, or service accounts, and bootstrap servers,
then use the variables in Confluent commands.
For example, create variables for an environment and clusters:
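A minimal sketch of such assignments, assuming the environment and cluster IDs that appear in the command output below (replace them with your own):

```shell
# IDs taken from the sample output on this page; they are examples, not real resources.
export CLINK_ENV=env-200py
export USA_EAST=lkc-qxxw7
```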
Then use these in commands:
$ confluent environment use $CLINK_ENV
Now using "env-200py" as the default (active) environment.
$ confluent kafka cluster use $USA_EAST
Set Kafka cluster "lkc-qxxw7" as the active cluster for environment "env-200py".
Put it all together in commands
Assuming you’ve created environment variables for your clusters, API keys, and secrets,
and have cluster link configuration details in a file called
here is an example of creating a cluster link named “east-west-link” using variables and your configuration file.
confluent kafka link create east-west-link \
--cluster $DESTINATION_ID \
--source-cluster-id $ORIG_ID \
--source-bootstrap-server $ORIG_BOOT \
Scaling Cluster Linking
Because Cluster Linking fetches data from source topics, the first scaling
unit to inspect is the number of partitions in the source topics. Having enough
partitions lets Cluster Linking mirror data in parallel. Having too few
partitions can make Cluster Linking bottleneck on partitions that are more heavily used.
In Confluent Cloud, Cluster Linking scales with the ingress and egress quotas of
your cluster. Cluster Linking is able to use all remaining bandwidth in a
cluster’s throughput quota: 150 MB/s per CKU egress on a Confluent Cloud source
cluster or 50 MB/s per CKU ingress on a Confluent Cloud destination cluster,
whichever is hit first. Therefore, to scale Cluster Linking throughput, simply adjust
the number of CKUs on either the source, the destination, or both.
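To make the "whichever is hit first" rule concrete, here is a small worked calculation using the per-CKU figures quoted above. It is only an upper bound under the assumption that no other clients are using the quotas; the function name is illustrative.

```shell
#!/bin/sh
# Upper bound on mirroring throughput (MB/s) between two Confluent Cloud clusters,
# from the per-CKU quotas above: 150 MB/s egress (source), 50 MB/s ingress (destination).
link_ceiling_mbps() {
  egress=$(( $1 * 150 ))    # $1 = number of source CKUs
  ingress=$(( $2 * 50 ))    # $2 = number of destination CKUs
  if [ "$egress" -lt "$ingress" ]; then
    echo "$egress"
  else
    echo "$ingress"
  fi
}

link_ceiling_mbps 1 2   # destination ingress is the limit
link_ceiling_mbps 1 4   # source egress is the limit
```

With one source CKU, two destination CKUs cap the link at 100 MB/s (destination ingress), while four destination CKUs move the cap to 150 MB/s (source egress). Traffic from other producers and consumers reduces what is actually available to the link.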
On the destination cluster, Cluster Linking writes take lower priority
than Kafka clients producing to that cluster; Cluster Linking will be throttled first.
(See recommended guidelines for Confluent Cloud.)
Confluent proactively monitors all cluster links in Confluent Cloud and will
perform tuning when necessary. If you find that your cluster link is not hitting
these limits even after a full day of sustained traffic, contact Confluent Support.
In a Confluent Platform or Apache Kafka® cluster, you can scale Cluster Linking throughput as follows:
This section details support and known limitations in terms of cluster types,
cluster management, and performance.
Cluster Types and Networking
Currently supported cluster types are described in Supported Cluster Types.
Cluster Linking is not supported for Confluent Cloud clusters that have the Transit
Gateway, VPC Peering, PrivateLink, or VNet Peering networking types. If you wish
to use Cluster Linking with a privately networked Confluent Cloud cluster,
contact your Confluent account team or email firstname.lastname@example.org to find out more.
A given cluster can only be the destination for five cluster links. Cluster Linking does not
currently support aggregating data from more than five sources.
A key feature of Cluster Linking is the capability to sync ACLs between clusters.
This is useful when moving clients between clusters for a migration or failover.
However, in Confluent Cloud, ACLs on a cluster can only be created for service accounts
that are in the same Confluent Cloud organization as the cluster itself. Therefore, in practice,
ACL sync is only useful between two Confluent Cloud clusters that are in the same Confluent Cloud organization.
ACL sync is not useful between two Confluent Cloud clusters in different organizations,
between Confluent Platform and Confluent Cloud, or Apache Kafka® and Confluent Cloud.
As a general rule, do not include in the sync filter ACLs that are managed
independently on the destination cluster. This is to prevent cluster link
migration from deleting ACLs that were added specifically on the destination and
should not be deleted. See also, Configuring Cluster Link Behavior and Syncing ACLs from Source to Destination Cluster.
Temporary known limitation: When using ACL sync, you must manually verify that ACLs deleted on the source
cluster were also deleted on the destination cluster. When ACLs are created or
deleted on the source cluster, the ACL sync feature propagates these changes to
the destination cluster. However, in rare circumstances, the deletion of ACLs on
the source cluster is not synced to the destination cluster. Therefore, for the
time being, if you are using the ACL sync feature, whenever you delete ACLs on
the source cluster, you should verify that the ACLs were also deleted on the
destination cluster. If they have not been deleted by the cluster link, then you
should manually delete the ACLs on the destination cluster.
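One way to do this check is to save an ACL listing from each cluster to a text file and compare the two. The sketch below assumes one ACL per line in each file (for example, output captured from confluent kafka acl list and trimmed by hand). The line format in the sample data is illustrative only, not the CLI's actual output format.

```shell
#!/bin/sh
# Print lines that appear in the destination listing but not in the source listing.
# Usage: stale_acls SOURCE_FILE DEST_FILE
stale_acls() {
  src_sorted=$(mktemp)
  dst_sorted=$(mktemp)
  sort -u "$1" > "$src_sorted"
  sort -u "$2" > "$dst_sorted"
  comm -13 "$src_sorted" "$dst_sorted"   # keep only lines unique to the second file
  rm -f "$src_sorted" "$dst_sorted"
}

# Sample data standing in for saved ACL listings (format is illustrative only).
src_acls=$(mktemp)
dst_acls=$(mktemp)
printf '%s\n' 'User:sa-111 ALLOW READ TOPIC orders' > "$src_acls"
printf '%s\n' 'User:sa-111 ALLOW READ TOPIC orders' \
              'User:sa-222 ALLOW WRITE TOPIC orders' > "$dst_acls"

stale_acls "$src_acls" "$dst_acls"
```

Lines reported here are candidates for manual deletion, but they also include any ACLs that are managed independently on the destination, so review each one before removing it.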
- Cluster links must be created and managed on the destination cluster.
- Cluster links can only be created with destination clusters that are
Dedicated Confluent Cloud clusters with Internet networking.
- In Confluent Platform 7.0.0, source-initiated cluster links (recommended for hybrid cloud) cannot be created with
the REST API; they must be created with the
kafka-cluster-links CLI. Regular, destination-initiated
cluster links can be created with either the REST API or the CLI.
- In Confluent Platform 7.0.x, REST API calls to list and get source-initiated cluster links will have their destination cluster IDs
returned under the parameter
- Mirror topics count against a cluster’s topic limits, partition limits, and
storage limits, just like other topics.
- There is no limit to the number of topics or partitions a cluster link can have,
up to the destination cluster’s maximum number of topics and partitions.
- A cluster can have at most five cluster links targeting it as the destination;
that is, not more than five cluster links that are replicating data to it.
If you require more than five cluster links on one cluster, contact Confluent Support.
- By definition, a mirror topic can only have one cluster link and one source topic
replicating data to it. Conversely, a single topic can be the source topic for
an unlimited number of mirror topics.
- The frequency of the sync processes for consumer group offset sync, ACL sync, and
topic configuration sync is user-configurable. These syncs can run at most
once per second (that is, every 1000 ms, since the setting is
in milliseconds). You can set them to run less frequently, but not more
frequently than once every 1000 ms.
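The 1000 ms floor on sync intervals can be checked before a configuration file is used. The following is a sketch only: it assumes the interval properties end in .sync.ms (for example consumer.offset.sync.ms, acl.sync.ms, and topic.config.sync.ms, names to verify against the configuration reference), and the function name and sample values are made up for illustration.

```shell
#!/bin/sh
# Print every *.sync.ms setting below 1000 and return non-zero if any is found.
check_sync_intervals() {
  awk -F= '
    $1 ~ /\.sync\.ms *$/ {
      if ($2 + 0 < 1000) { print; bad = 1 }
    }
    END { exit bad + 0 }
  ' "$1"
}

# Sample link configuration (property names are assumptions; values are illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
consumer.offset.sync.ms=500
acl.sync.ms=5000
topic.config.sync.ms=1000
EOF

check_sync_intervals "$sample" || echo "raise the settings listed above to at least 1000"
```

In the sample, only the 500 ms consumer offset interval is flagged; 1000 ms and 5000 ms are both allowed.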
Kafka transactions such as “exactly once” semantics not supported on mirror topics
Cluster Linking is not integrated with Kafka transactions, including “exactly once”
semantics. Using Cluster Linking to mirror topics that contain transactions or
exactly once semantics is not supported and not recommended.
For Cluster Linking, throughput indicates bytes-per-second of data replication.
The following performance factors and limitations apply.
- Cluster Linking throughput (bytes-per-second of
data replication) counts towards the destination cluster’s produce limits (also known
as “ingress” or “write” limits). However, production from Kafka clients is prioritized
over Cluster Linking writes; therefore, these are exposed as separate metrics
in the Metrics API: Kafka client writes are
and Cluster Linking writes are
- Cluster Linking consumes from the source cluster in much the same way as Kafka consumers.
Throughput (bytes-per-second of data replication) is treated the same as consumer
throughput. Cluster Linking will contribute to any quotas and hard or soft
limits on your source cluster. The Kafka client reads and Cluster Linking reads
are therefore included in the same metric in the Metrics API:
- Cluster Linking is able to max out the throughput of your CKUs. The physical
distance between clusters is a factor in Cluster Linking performance.
Confluent monitors cluster links and optimizes their performance.
Unlike Replicator and Kafka MirrorMaker 2, Cluster Linking does
not have its own scaling unit (such as tasks). You do not need to scale up
or scale down your cluster links to increase performance.
- Cluster Linking connections count towards any connection limits on your clusters.
- Request rate
- Cluster Linking contributes requests which count towards your source cluster’s
request rate limits.
Frequently Asked Questions
Will adding a cluster link result in throttling consumers on the source cluster?
Possibly, yes; adding a cluster link is similar to adding a new consumer with
As such, the cluster link can cause other consumers to be throttled if it pushes
the total consumption above your cluster’s consume throughput quota. This depends
on how much throughput your cluster link can achieve, how much data you are trying
to mirror, and how much extra consume capacity you have.
Keep in mind, if you are mirroring existing topic data, the cluster link will
have a “burst” of consume at the beginning to get this historical data. After it
catches up, the consume rates should go down to match the production values into
your source topics (assuming your cluster link can handle the production throughput).
Will adding a cluster link cause throttling of existing producers on the destination cluster?
No, it shouldn’t. Kafka client producers are prioritized over Cluster Linking destination writes.
Considerations for deleting source topics
Do not delete a source topic for an active mirror topic, as it can cause issues with
Cluster Linking. Instead, follow these steps as a best practice:
- Use the
failover commands to stop or delete any active mirror
topics that read from the source topic you want to delete.
- Then, you can safely delete the source topic.
To learn more, see Source Topic Deletion
in Mirror Topics.
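The order of operations above can be written down as a small helper that prints the commands for review rather than running them. This is a sketch under assumptions: the topic, link, and cluster IDs are hypothetical, and the --link and --cluster flags should be checked against confluent kafka mirror failover --help and confluent kafka topic delete --help for your CLI version.

```shell
#!/bin/sh
# Print, in order, the commands for retiring a source topic that has one mirror.
# Usage: source_topic_deletion_plan TOPIC LINK DEST_CLUSTER SOURCE_CLUSTER
source_topic_deletion_plan() {
  # Step 1: stop mirroring by failing over the mirror topic on the destination.
  echo "confluent kafka mirror failover $1 --link $2 --cluster $3"
  # Step 2: only then delete the topic on the source cluster.
  echo "confluent kafka topic delete $1 --cluster $4"
}

# Hypothetical names and IDs.
source_topic_deletion_plan clickstream.tokyo tokyo-sydney lkc-dest01 lkc-src001
```

If a source topic feeds several mirror topics, repeat the first step for each mirror before deleting the source.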