Cluster Linking is fundamentally a networking feature: it copies data over the
network. As such, Cluster Linking requires that at least one of the clusters
involved has connectivity to the other cluster. Therefore, the networking
situation of each cluster determines whether the two clusters can be linked, and
whether the destination cluster or the source cluster must initiate the
connection. By default, the destination cluster will initiate the connection. A
special mode called “source-initiated links” allows the source cluster to
initiate the connection of the cluster link.
The following tables show which networking combinations are possible, and
whether a source-initiated link is required.
To learn more about all available Confluent Cloud cluster types, see Kafka Cluster Types in Confluent Cloud.
The above table shows supported cluster types for this particular Cluster Linking scenario (private networking). For a more general overview of
supported cluster types for Cluster Linking, see Supported cluster types.
Using the Confluent Cloud Console for Cluster Linking with private networking¶
To view, create, and modify cluster links and mirror topics in the Confluent Cloud Console,
the cluster link’s destination cluster must be accessible from
your browser. If your destination cluster is a Confluent Cloud cluster with
private networking, you must have completed the setup described in Use Confluent Cloud Console with Private Networking.
If your browser cannot access the destination cluster, you will not see any
cluster links on that cluster, nor any indication of which topics are mirror
topics. You will not be able to create a cluster link with that cluster as its destination.
If your source cluster is also a Confluent Cloud cluster with private
networking, then some features of the Confluent Cloud Console require your
browser to be able to reach the source cluster as well:
Creating a “Confluent Cloud to Confluent Cloud” cluster link as an OrganizationAdmin, which creates ACLs and an API key on the source cluster.
Viewing the source cluster topics’ throughput metric on a cluster link
Using the drop-down menu of source cluster topic names to create a mirror topic. (Without access, topic names must be entered manually.)
When creating a cluster link in the Confluent Cloud Console, the
drop-down menus automatically filter for source clusters and destination
clusters that can be linked. If a cluster appears as a drop-down option,
then a cluster link is possible and generally available. Preview cases are excluded
from the drop-downs, and require using the Confluent CLI or
the Confluent Cloud REST API to create a cluster link.
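For those preview cases, the cluster link is created against the destination cluster's Kafka REST endpoint. The following is a minimal sketch using hypothetical cluster IDs, link name, and REST endpoint; check the Confluent Cloud API reference for the exact request shape:

```shell
# Hypothetical values; replace with your own.
DEST_REST="https://pkc-abc12.us-east-1.aws.confluent.cloud:443"
DEST_ID="lkc-dest1"; SOURCE_ID="lkc-src1"; LINK="my-link"

# Request body: the source cluster ID plus any link configs.
BODY=$(cat <<EOF
{"source_cluster_id": "${SOURCE_ID}", "configs": []}
EOF
)

# Run against the destination cluster's REST endpoint with a cluster API key:
# curl -s -u "$API_KEY:$API_SECRET" -X POST \
#   -H "Content-Type: application/json" -d "$BODY" \
#   "${DEST_REST}/kafka/v3/clusters/${DEST_ID}/links?link_name=${LINK}"
echo "$BODY"
```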
Running Cluster Linking API and CLI commands with private networking¶
Cluster Linking commands require that the location where you are running them has access to the destination cluster of the cluster link:
Creating, updating, listing, describing, and deleting cluster links
Creating, listing, and describing mirror topics
For example, when running the Confluent CLI command, the shell must have access to the
destination cluster. If your destination cluster has private networking, one way
to achieve this is to SSH into a virtual machine that has network connectivity
(by means of PrivateLink, Transit Gateway, or Peering) to the destination cluster.
Aside from the requirements mentioned above, the user experience when using a
cluster link with private networking is the same as when using a cluster link
with public networking. The same commands work in the same ways. That means you
can follow the same Cluster Linking tutorials using the same steps, regardless of
whether your clusters use public or private networking.
Exception: When Cluster Linking from a cluster with private networking to a cluster with public networking,
a source-initiated link is required. This introduces different configurations and an extra step.
Cluster Linking between AWS PrivateLink Confluent Cloud clusters¶
You can use Cluster Linking on Confluent Cloud to create a cluster link that
perpetually mirrors data between two Confluent Cloud clusters in different AWS regions,
each with the “AWS PrivateLink” networking type. Both clusters must be in the
same Confluent Cloud organization.
When AWS PrivateLink is used to connect to a Confluent Cloud cluster, the cluster is
isolated in its own Confluent Cloud network for network-level security. This prevents
external resources, including Cluster Linking, from connecting to the Kafka
cluster. Thus, in order to use Cluster Linking between two Confluent Cloud clusters
in different Confluent Cloud networks, a secure networking path is required between the
two Confluent Cloud networks. You can create this secure network path using resources
called a Network Link Service and a Network Link Endpoint, as described in the
networking document Cluster Linking using AWS PrivateLink.
Network Link Services allow secure inbound connectivity from specific networks that you specify.
Network Link Endpoints allow secure outbound connectivity to specific networks that you specify.
To use Cluster Linking between two AWS PrivateLink clusters in different Confluent Cloud networks,
you first must establish secure connectivity between the clusters as follows:
Create a Network Link Service in the source cluster Confluent Cloud network to allow incoming connectivity from the destination cluster Confluent Cloud network.
Then, create a Network Link Endpoint in the destination cluster Confluent Cloud network. This allows secure outbound connectivity to the source cluster.
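As a sketch, these resources can be created through the Confluent Cloud Networking API. The IDs below are hypothetical and the exact payload fields should be verified against the current API reference:

```shell
# Hypothetical IDs; replace with your own environment and network IDs.
ENV="env-abc123"; SRC_NET="n-src123"; DEST_NET="n-dst456"

# 1. Network Link Service in the source cluster's network, allowing the
#    destination cluster's network to connect:
NLS_BODY=$(cat <<EOF
{"spec": {"display_name": "src-nls",
          "environment": {"id": "${ENV}"},
          "network": {"id": "${SRC_NET}"},
          "accept": {"networks": ["${DEST_NET}"]}}}
EOF
)
# curl -s -u "$CLOUD_KEY:$CLOUD_SECRET" -X POST -H "Content-Type: application/json" \
#   -d "$NLS_BODY" "https://api.confluent.cloud/networking/v1/network-link-services"

# 2. Network Link Endpoint in the destination cluster's network, referencing
#    the Network Link Service (similar POST to /networking/v1/network-link-endpoints).
echo "$NLS_BODY" | tr -d '\n'
```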
The Network Link Service and Network Link endpoint must be in the same Confluent Cloud Organization. Cross-organization cluster linking between AWS PrivateLink clusters is not supported.
Step 3: Wait until the Network Link Endpoint is READY¶
It takes several minutes to establish secure connectivity. When ready to be used
by Cluster Linking, the network link endpoint’s status will change from PROVISIONING to READY, which you can check with a
GET REST API call:
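For example, assuming a hypothetical endpoint ID and environment ID, the status can be polled like this (the parsing below is shown against a sample response payload):

```shell
ENV="env-abc123"; NLE="nle-xyz789"   # hypothetical IDs; replace with your own

# Poll the endpoint with a Cloud API key until the phase is READY:
# curl -s -u "$CLOUD_KEY:$CLOUD_SECRET" \
#   "https://api.confluent.cloud/networking/v1/network-link-endpoints/${NLE}?environment=${ENV}"

# A ready endpoint reports a status phase of READY, for example:
RESPONSE='{"id":"nle-xyz789","status":{"phase":"READY"}}'
PHASE=$(echo "$RESPONSE" | sed -n 's/.*"phase":"\([A-Z]*\)".*/\1/p')
echo "$PHASE"
```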
Deleting any one of these resources (the cluster link, Network Link Service, or Network Link
Endpoint) will stop data from being mirrored from the source to the destination cluster.
Removing the destination cluster's network or environment ID from the allowed
list of IDs in the Network Link Service will also stop the data flow.
Cluster Linking between AWS Transit Gateway attached Confluent Cloud clusters¶
Confluent provides Cluster Linking between AWS Transit Gateway Confluent Cloud
clusters as a fully-managed solution for geo-replication, multi-region, high
availability and disaster recovery, data sharing, or aggregation.
This section describes how to use Cluster Linking to sync data between two
private Confluent Cloud clusters in different AWS regions that are each attached to an
AWS Transit Gateway. You can provision new Confluent Cloud clusters or use existing
AWS Transit Gateway attached or AWS virtual private cloud (VPC) Peered Confluent Cloud clusters.
This is limited to Confluent Cloud clusters that use AWS Transit Gateway as their networking type.
AWS VPC Peered clusters can be seamlessly converted to AWS Transit Gateway clusters with a Confluent support ticket.
The Confluent Cloud clusters can be in the same or in different Confluent Cloud Environments or Organizations. The Transit Gateways
can be in the same or different AWS Accounts. Connecting clusters from different organizations is useful for data sharing between organizations.
The clusters must be provisioned with different CIDRs. The address ranges cannot overlap.
The CIDRs for both clusters must be within RFC 1918:
Neither cluster's CIDR can be 198.18.0.0/15, even though it is a valid Confluent Cloud CIDR.
This configuration does not support combinations with other networking types, such as PrivateLink,
or with other cloud providers, such as Google Cloud Platform or Microsoft Azure.
You will need to specify the network ID and the ARN of the resource share containing that region’s Transit Gateway.
It is possible to seamlessly convert an existing AWS VPC Peered Confluent Cloud cluster in that region to a Transit Gateway attached cluster. If you choose that option, let
Confluent know the name and cluster ID of the existing cluster in the support ticket.
Turnaround time is generally three to five business days.
Connect a “Command VPC” from which to issue commands, and create topics on the Confluent Cloud cluster in “Region 1”.
Create a new VPC in Region 1, from which to run commands against your Confluent Cloud cluster. For purposes of this example, call this the “Command VPC”.
Attach the Command VPC to your Transit Gateway. (Make sure you have a route in the Transit Gateway’s route table that points to the Command VPC for the Command VPC’s CIDR range.)
In the Command VPC’s route table, create the following routes if they do not already exist:
List the topics in your Confluent Cloud cluster with confluent kafka topic list.
If this command fails, your Command EC2 instance may not be able to reach your Confluent Cloud cluster. Your networking may not be correctly set up. Make sure you followed the steps above.
See the Troubleshooting section if needed.
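Before debugging the CLI itself, it can help to test raw TCP reachability from the Command EC2 instance to the cluster's bootstrap endpoint on port 9092. A minimal sketch, using a hypothetical bootstrap hostname:

```shell
# Probe TCP reachability of a host:port (e.g., the cluster's bootstrap endpoint).
probe() {
  local host=$1 port=$2
  if timeout 3 bash -c "</dev/tcp/${host}/${port}" 2>/dev/null; then
    echo reachable
  else
    echo unreachable
  fi
}

# Hypothetical hostname; substitute your cluster's bootstrap host (from
# `confluent kafka cluster describe`). "unreachable" means a routing problem.
probe "bootstrap.lkc-12345.example.invalid" 9092
```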
You will use the bootstrap server of the cluster in Region 1. Because you have peered the Transit Gateways and set up routing between the two clusters,
the cluster in Region 2 will be able to resolve the private CIDR of the cluster in Region 1.
As long as your Command VPC can reach the Destination cluster (Region 2 in this example), it does not matter which region the Command VPC is in. (It is okay to run the command from Region 1.)
Create mirror topics using the cluster link, and produce and consume the data. This will bring data from Region 1 to Region 2.
It is okay to run these commands from Region 1, even if they are against Region 2’s cluster, as long as your Command VPC has connectivity to Region 2’s cluster.
Successfully consuming messages from Region 2 proves the inter-region replication is working, no matter where your Command VPC itself lives.
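The steps above can be sketched with the Confluent CLI as follows. The topic, link, and cluster names here are hypothetical placeholders:

```shell
# Sketch (hypothetical names and IDs); run with the Confluent CLI, logged in:
#   confluent kafka mirror create my-topic --link my-link --cluster <region2-cluster-id>
#   confluent kafka topic produce my-topic --cluster <region1-cluster-id>
#   confluent kafka topic consume my-topic --from-beginning --cluster <region2-cluster-id>
MSG="mirror topics: Region 1 -> cluster link -> Region 2"
echo "$MSG"
```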
Now, you can spin up more Kafka producers and consumers in Region 1, and Kafka consumers in Region 2.
You can also create a Command VPC in Region 2, so that you can issue confluent commands should Region 1 experience an outage.
You can also set up a cluster link in the opposite direction; to do so, repeat steps 4 through 6 above.
You do not need to set up additional networking to create additional cluster links. You only need to set up the networking once.
Cluster Linking between two Confluent Cloud clusters in the same region¶
If you want to create a cluster link between two AWS Transit Gateway Confluent Cloud clusters in the same AWS region,
this is a special case in which the requirements may be different, depending on the networking setup:
Here are some scenarios and factors to consider:
If both Confluent Cloud clusters are in the same Confluent Cloud network, the additional configuration described in the previous sections
is not necessary. Only one Transit Gateway is required, with no Transit Gateway peering or changes to its route table.
If the Confluent Cloud clusters are in different Confluent Cloud networks, but both are attached to the same Transit Gateway,
then the only requirement is that they use the same Transit Gateway Route Table. The routes from each CIDR to each
Confluent VPC must be in the same route table. No Transit Gateway Peering is required.
If the Confluent Cloud clusters are attached to different Transit Gateways, then the configuration described above is required.
The steps are the same whether the two Transit Gateways are in the same region or in different regions.
CLI commands often fail because they are not being run from a location that has
connectivity to your Confluent Cloud cluster. Verify that the machine running your CLI (for example,
a Command EC2 instance) has connectivity to the Confluent Cloud cluster.
If this check fails, you are not running the CLI commands from a location that has connectivity to your Confluent Cloud cluster.
Work with your networking team to ensure your instance is attached to your Transit Gateway and routed through it to the Confluent Cloud cluster.
Ensure the Transit Gateways, Peering, and Route Tables are properly configured.
Assuming a Transit Gateway in Region A and one in Region B, with CIDRs CIDR-A and CIDR-B:

Transit Gateway Region A route table:
  CIDR-A → Confluent VPC A (Confluent is responsible for setting this)
  CIDR-B → Transit Gateway B via Peering connection

Transit Gateway Region B route table:
  CIDR-A → Transit Gateway A via Peering connection
  CIDR-B → Confluent VPC B (Confluent is responsible for setting this)
Note that each Transit Gateway has routes set for both CIDRs. If the
Transit Gateways are not both set up with both CIDRs, the clusters will not
have connectivity to each other; this is a common cause of failure. The routes between the two Confluent Cloud clusters and
Transit Gateways must allow traffic over port 9092.
You can test if the cross-region connectivity works like this:
Attach a Test VPC in Region A to A’s transit gateway.
Launch an EC2 instance in the Test VPC (or you can use the “Command VPC” if you set one up per the previous steps).
Route the EC2 instance to the Transit Gateway in Region A:

Test VPC route table:
  CIDR-A → Transit Gateway A
  CIDR-B → Transit Gateway A (also)

Transit Gateway Region A route table: add CIDR-Test-VPC → Test VPC attachment
Transit Gateway Region B route table: add CIDR-Test-VPC → Transit Gateway A via Peering connection (needed for bidirectional connectivity)
SSH into the EC2 instance. For example, you can attach an Elastic IP to the EC2 instance so that you can reach it.
The cluster link on the Source cluster needs these configurations:
Bootstrap server set to the Destination cluster’s bootstrap server
Security credentials to authenticate into the Destination cluster. (These can be either an API key or OAuth.) The service account or user with these security credentials
must have the CLUSTER:ALTER ACL on the destination (public) cluster, or alternatively the CloudClusterAdmin RBAC role.
Security credentials to authenticate into the Source cluster. (These can be either API key or OAuth.)
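For reference, the destination half of a source-initiated link is created first, configured to accept an inbound connection. A minimal sketch of that half's configuration, assuming the property names used by the source-initiated link feature:

```properties
# On the destination (public) cluster's link:
link.mode=DESTINATION
connection.mode=INBOUND
```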
In this walkthrough, information about your private cluster is referred to as <private>.
You will need the following information about your private cluster:
Cluster ID. This will have the format lkc-XXXXX. You can get this using the command confluent kafka cluster list.
This is referred to as <private-id>.
Cluster bootstrap server. You can get this by describing the cluster with confluent kafka cluster describe <private-id>.
The bootstrap server is in the Endpoint row. This is referred to as <private-bootstrap>.
In this walkthrough, information about your public cluster is referred to as <public>.
You will need the following information about the public cluster:
Cluster ID. This will have the format lkc-XXXXX. You can get this using the command confluent kafka cluster list.
This is referred to as <public-id>.
Cluster bootstrap server. You can get this by describing the cluster with confluent kafka cluster describe <public-id>.
The bootstrap server is in the Endpoint row. This is referred to as <public-bootstrap>.
Create security credentials for the private half of the link with two sets of API keys.
To create the Private half, you’ll need two sets of security credentials: one for each cluster.
This is because the Private cluster must authenticate with the Public cluster to establish
the cluster link, and then the cluster link must authenticate with the Private cluster to read data.
To get security credentials, you need API keys on both clusters. Choose one of the following methods, depending on whether you are setting up for development and testing or for production.
For development and testing, a quick way is with the following commands:
This creates an API key with Admin privileges. This is not recommended for production or for clusters with sensitive data.
If the clusters belong to two different owners or two different Confluent Cloud organizations, then two different service accounts will be needed:
one for the source cluster and one for the destination cluster.
Using this service account or these service accounts, create two Kafka API keys, one on each Kafka cluster:
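As a sketch (hypothetical IDs and key values), the API keys can be created with the Confluent CLI, and each key/secret pair is then embedded in a JAAS configuration line in the link configuration:

```shell
# Run with the Confluent CLI, logged in (hypothetical cluster IDs):
#   confluent api-key create --resource <private-id> --service-account <sa-id>
#   confluent api-key create --resource <public-id> --service-account <sa-id>

# Each key/secret pair becomes a JAAS line in the link config, shaped like this:
KEY="ABC123"; SECRET="s3cr3t"   # hypothetical values
JAAS=$(printf 'org.apache.kafka.common.security.plain.PlainLoginModule required username="%s" password="%s";' "$KEY" "$SECRET")
echo "$JAAS"
```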
Create the Private link configuration file.
Create a file called private-source-link.config that contains all necessary link configurations with the following content:
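A sketch of what this file typically contains for a source-initiated link. The property names are assumed from the source-initiated link feature and should be checked against the current Cluster Linking configuration reference; substitute your own bootstrap server and credentials:

```properties
# private-source-link.config (sketch)
link.mode=SOURCE
connection.mode=OUTBOUND

# Outbound connection to the public (destination) cluster:
bootstrap.servers=<public-bootstrap>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<public-api-key>" password="<public-api-secret>";

# Credentials the link uses to read data from the private (source) cluster:
local.security.protocol=SASL_SSL
local.sasl.mechanism=PLAIN
local.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<private-api-key>" password="<private-api-secret>";
```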
You should see a success message. If so, you now have a link between your clusters that you can use to create mirror topics.
If the command times out after 30 seconds, then the private cluster was unable to reach and authenticate into your destination cluster
using the provided information. Make sure you can consume from the source cluster using the public-link.config file.
Both REST API calls should return a status of 201: CREATED and a blank response if successful.
Private-Public-Private Pattern with cluster link chaining and a jump cluster¶
You can configure Cluster Linking between any two dedicated Confluent Cloud clusters with any private networking configuration, as a fully supported pattern.
Private networking combinations that cannot be directly linked¶
While Cluster Linking can directly connect certain networking types of
Dedicated clusters on Confluent Cloud (such as AWS PrivateLink and AWS Transit Gateway),
there are some combinations of clusters that cannot be directly connected by Cluster Linking on Confluent Cloud, such as:
Two Azure PrivateLink-connected clusters in different regions
Two GCP Private Services Connect-connected clusters in different regions
Two VPC Peered or VNET Peered clusters in different regions
Two clusters with different private networking types (such as VPC Peered to PrivateLink) or
with private networking on different cloud providers (such as AWS to Azure or GCP)
Any two clusters not explicitly listed in the supported clusters table.
A chain of multiple cluster links (or simply “chaining”), which keeps data and metadata replicated from start to finish.
Cluster Links and mirror topics are designed to be composable and to chain together intuitively.
A dedicated Confluent Cloud cluster with Internet networking, which can be securely reached by both private clusters.
This is colloquially known as a “jump cluster”, as it is the middle point between the two disparate private networks.
To learn more about the security implications of using a cluster with Internet networking, and why this is a secure choice,
see Security considerations for the Private-Public-Private pattern.
The solution consists of an origin cluster, a “jump” cluster, and a target cluster, as described below.
Origin cluster
a Dedicated Confluent Cloud cluster with any private networking type where the data is coming from. The original source of the data and metadata.
Jump cluster
a Dedicated Confluent Cloud cluster with Internet networking.
A cluster link replicates topics and desired metadata from the origin cluster (its source cluster) to the jump cluster (its destination cluster).
This cluster link uses the “private to public” feature, where the private source cluster always initiates an outbound connection to the internet
destination cluster. The internet cluster never has access to the private network.
Target cluster
a Dedicated Confluent Cloud cluster with any private networking type where the data is going to. The final destination of the data and metadata.
A cluster link replicates data and metadata from the jump cluster (the link’s source cluster) to the target cluster (the link’s destination cluster).
Data and metadata are kept consistent with the origin cluster thanks to cluster link chaining and the byte-for-byte replication offered by mirror topics.
In both links, the private cluster always initiates an outbound connection to the internet cluster; the internet cluster never has access to the private network.
Security considerations for the Private-Public-Private pattern¶
There are three security goals that are achieved by choosing private networking
over internet networking, and all three can be achieved with the
private-public-private Cluster Linking pattern:
Prevent data from being intercepted and sniffed in transit over the internet¶
Private networking is appealing because it ensures that no third party can be
involved in the transfer of data and potentially “sniff” or otherwise reveal the
contents of that data. However, AWS and Azure both guarantee that data between
two of their networks always stays on their own, private backbone, even if a
public IP is used in the process. Therefore, if all three Confluent Cloud clusters in
the private-public-private pattern are on AWS or all are on Azure, the data
never leaves the cloud service provider’s backbone and cannot be intercepted by
a third party, satisfying this goal.
Prevent Distributed Denial of Service (DDoS) attacks from interrupting cluster operations¶
Private IP addresses cannot be addressed or resolved from outside the private
network, giving an extra layer of protection against a potential DDoS attack
from a malicious actor who learned the cluster’s addresses. The
private-public-private pattern does use internet-accessible addresses on the
jump cluster; these addresses are only used for replication. The private
cluster(s) which serve business-critical applications remain protected behind
private IPs. A theoretical DDoS attack on the jump cluster could temporarily
interrupt replication, but if replication is used for Disaster Recovery,
migration, or another similar goal, this would not impact the business-critical
workloads which are isolated to the private clusters, and the private clusters
themselves cannot be DDoSed.
Prevent data exfiltration¶
Private networking gives an extra layer of protection against data exfiltration,
by requiring access to the private network in order to access the data. Since
the private-public-private pattern puts the data on an Internet cluster, the
data on the jump cluster is not protected by network isolation, and therefore
access to this cluster must be carefully guarded. Because accessing a Confluent Cloud
cluster requires an API key specifically created for that cluster, the jump
cluster can be protected by only creating one API key for it (the API key given
to the cluster link) and then immediately destroying that API key. There is no
need to store the cluster link’s API key, as it is securely and permanently
stored on the cluster link. It is impossible to retrieve the API key from the
cluster link, even for Confluent Cloud engineers, which ensures that this API key will
never be used for anything other than the Cluster Linking operations. Since no
other API keys exist for this cluster, nothing can access the cluster other than
the cluster links.
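The key lifecycle described above can be sketched with the Confluent CLI as follows (hypothetical cluster ID and key name):

```shell
# Sketch of the jump-cluster API-key hygiene; run with the Confluent CLI:
#   confluent api-key create --resource lkc-jump1   # the only key for the jump cluster
#   ...create both cluster links using this key/secret...
#   confluent api-key delete ABC123XYZ              # destroy the local copy
# The link keeps its own securely stored copy, so nothing else can authenticate.
LIFECYCLE="create -> use in cluster links -> delete"
echo "$LIFECYCLE"
```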
There are cost differences associated with private vs. public networking.
These are detailed under Cluster Linking
in the Billing documentation. Examples are provided there for public networking, with
more details about private networking to follow soon.