Cluster Linking¶
Important
This feature is available as a preview feature. A preview feature is a component of Confluent Platform that is being introduced to gain early feedback from developers. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent.
What is Cluster Linking?¶
Cluster Linking allows you to directly connect clusters together and mirror topics from one cluster to another without the need for Connect. Cluster Linking makes it much easier to build multi-datacenter, multi-cluster, and hybrid cloud deployments.
Unlike Replicator and MirrorMaker 2, Cluster Linking does not require running Connect to move messages from one cluster to another, and it preserves offsets from the source cluster to the destination cluster. We call this “byte-for-byte” replication: whatever is on the source cluster is mirrored precisely on the destination cluster.
Cluster Linking and Built-in Multi-Region Replication can be combined to create a highly-available, durable, and distributed global eventing fabric. Use Built-in Multi-Region Replication when auto-client failover (low RTO) or RPO=0 is required on some topics. Cluster Linking should be used when network quality is questionable, data centers are very far apart, or RTO goals can tolerate client reconfiguration.
Note
The destination cluster must be running Confluent Server and the source cluster can either be Confluent Server or Apache Kafka® 2.4+. Cluster Linking is being introduced in Confluent Platform 6.0.0 as a preview feature.
Capabilities and Comparisons¶
Cluster Linking provides the following capabilities:
- Preserves offsets between clusters, making failover and reasoning about your system easier
- Increases replication speed by avoiding recompression
- Provides a cost-effective complement to an exclusive Built-in Multi-Region Replication solution because less data moves across data centers, thereby saving on replication costs and making inter-continental clusters viable
- Simplifies and streamlines Confluent Cloud operations
- Supports hybrid cloud architectures (write once, read many). Using Kafka and Cluster Linking to bridge infrastructure environments gives you a single system to monitor and secure, and a cost-efficient transport layer across on-premises deployments, Confluent Cloud, and your cloud providers
- Cluster Linking combined with standard or Built-in Multi-Region Replication clusters provides a global mesh of Kafka with very low Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) over great distances and across different network tolerances
- Lays the groundwork for aggregate clusters
What’s Supported¶
| Source | Destination |
|---|---|
| Confluent Platform 5.4 or later | Confluent Platform 6.0.0 or later |
| Confluent Platform 5.4 or later | Confluent Cloud |
| Confluent Cloud | Confluent Platform 6.0.0 or later |
| Confluent Cloud | Confluent Cloud |
- Requires Confluent Platform 5.4 or later (Apache Kafka® 2.4) on the source cluster and Confluent Server (Confluent Platform 6.0.0 or later) on the destination cluster
- Confluent Platform 7.0.0 requires an inter-broker protocol (IBP) of 2.7 or higher on both the source cluster and the destination cluster. An IBP of 2.7 is recommended, but not required, for Confluent Platform 6.1.x and 6.2.x; these earlier versions can use an IBP of 2.4 or higher. If you need to upgrade from a lower IBP, set this as the value of inter.broker.protocol.version on your brokers (see the example after this list).
- Works with all clients
- Compatible with Ansible and Confluent for Kubernetes. To learn more, see Using Cluster Linking with Ansible and Using Cluster Linking with Confluent for Kubernetes
- Provides support for authentication and authorization, as described in Cluster Linking Security, with some Known Limitations.
- The source cluster can be either Kafka or Confluent Server; the destination cluster must be Confluent Server, which is bundled with Confluent Enterprise.
- If the source cluster is Confluent Server, it must be Confluent Platform 5.4 or later. If the source cluster is Kafka, it must be Apache Kafka® 2.4 or later. (In either case, an IBP of 2.4 or later is required for deployments earlier than Confluent Platform 7.0.0, and an IBP of 2.7 or later is required for Confluent Platform 7.0.0 and later, as previously stated.)
- In addition to self-managed deployments on Confluent Platform, Cluster Linking is also available as a managed service on Confluent Cloud and in Hybrid Cloud Architectures.
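As a minimal sketch (assuming brokers configured through a standard server.properties file), raising the IBP looks like the following; inter.broker.protocol.version is a standard Kafka broker setting and takes effect after a rolling restart of the brokers.

```
# server.properties on each broker -- illustrative value only.
# Confluent Platform 7.0.0 requires an IBP of 2.7 or higher for Cluster Linking;
# Confluent Platform 6.1.x and 6.2.x can use 2.4 or higher.
inter.broker.protocol.version=2.7
```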
Use Cases and Architectures¶
The following use cases can be achieved with the configurations and architectures shown below. These deployments are demonstrated in Cluster Linking Demo (Docker) and in Tutorial: Using Cluster Linking for Topic Data Sharing.
Topic Data Sharing¶
Use Case: Share the data in a handful of topics across two Kafka clusters.
[Diagram: a cluster link mirrors topics from the source cluster to the destination cluster]
For topic sharing, data moves from the source cluster to the destination cluster by means of a cluster link. Mirror topics are associated with a cluster link. Consumers on the destination cluster can read from local, read-only mirror topics to consume messages produced on the source cluster. If the original topic on the source cluster is removed for any reason, you can stop mirroring that topic and convert it to a read/write topic on the destination.
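The sketch below shows the general shape of this flow. The host names, link name, and topic name are hypothetical, and the command and flag names follow later Confluent Platform releases and may differ in the 6.0.0 preview; see Commands for Cluster Linking for the authoritative syntax.

```
# link.properties -- the link's view of the source cluster (hypothetical hosts)
#   bootstrap.servers=source-broker-1:9092,source-broker-2:9092

# Create the cluster link on the destination cluster.
kafka-cluster-links --bootstrap-server dest-broker-1:9092 \
  --create --link demo-link --config-file link.properties

# Create a read-only mirror of the source topic "orders" over that link.
kafka-mirrors --bootstrap-server dest-broker-1:9092 \
  --create --mirror-topic orders --link demo-link
```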
Cluster Migration¶
Use Case: Move from an on-premises Kafka cluster to a Confluent Cloud Kafka cluster, or from an older cluster to a newer version. The native offset preservation provided by Confluent Server on the brokers makes this much easier to do with Cluster Linking than with Connect-based methods.
Hybrid Cloud Architectures¶
Use Case: Deploy an ongoing data funnel for a few topics from an on-premises environment to Confluent Cloud. Cluster Linking provides a network-partition-tolerant architecture that supports this nicely (momentarily losing the network connection does not materially affect the data on either cluster), whereas attempting this with stretch clusters requires a highly reliable and robust network.
Understanding Listeners in Cluster Linking¶
For a forward connection, the target server knows which listener the connection came in on and associates the listener with that connection. When a metadata request arrives on that connection, the server returns metadata corresponding to the listener.
For example, in Confluent Cloud, when a client on the external listener asks for the leader of topicA, it always gets the external endpoint of the leader and never the internal one, because the system knows the listener name from the connection.
For reverse connections, the target server (that is, the source cluster) is the one that established the connection. When the connection is reversed, this target server needs to know which listener to associate the reverse connection with; for example, which endpoint it should return to the destination for its leader requests.
By default, the listener is associated based on the source cluster where the link was created. In most cases this is sufficient because typically a single external listener is used. On Confluent Cloud, this default is used and you cannot override it.
On self-managed Confluent Platform, you have the option to override the default listener/connection association. This provides the flexibility to create the source link on an internal listener but associate the external listener with the reverse connection.
The configuration local.listener.name refers to the source cluster's listener name. By default, this is the listener that was used to create the source link. If you want to use a different listener, you must explicitly configure it. If Confluent Cloud is the source, the external listener is used (the default) and cannot be overridden.
For the destination, the listener is determined by bootstrap.servers and cannot be overridden.
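As a minimal, hypothetical sketch (assuming a self-managed source cluster with listeners named INTERNAL and EXTERNAL), the override described above would be expressed through the link configuration roughly as follows:

```
# Link configuration (source cluster side) -- illustrative only.
# By default, reverse connections are associated with the listener the link
# was created on (here, INTERNAL). This setting associates them with EXTERNAL
# instead, so leader requests from the destination return external endpoints.
local.listener.name=EXTERNAL
```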
Known Limitations¶
- Use of the Cluster Linking preview is not recommended for production use cases. Some Kafka features are not supported in this preview.
- The source cluster must be Confluent Platform 5.4 or later (Apache Kafka® 2.4) and the destination cluster must be Confluent Platform 6.0.0 or later.
- The source topic and destination mirrored topic must have the same name.
- Links are unidirectional. You cannot configure a single link to accomplish bidirectional replication; you must use two links for this.
- Connections are established from the destination cluster only. Source clusters must be configured to accept inbound connection requests from destination brokers. This will satisfy some but not all security models.
- All SSL keystores, truststores, and Kerberos keytab files must be stored at the same location on each broker in a given cluster; if they are not, cluster links may fail. (See the example after this list.)
- ACL migration (ACL sync), previously available in Confluent Platform 6.0.0 through 6.2.x, was removed due to a security vulnerability. If you are using ACL migration in your deployments, you should disable it. To learn more, see Authorization (ACLs).
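For example, a minimal sketch of the relevant broker settings (paths are hypothetical); the same paths must resolve on every broker in the cluster:

```
# server.properties on every broker in the cluster -- paths are illustrative.
ssl.keystore.location=/var/private/ssl/kafka.broker.keystore.jks
ssl.truststore.location=/var/private/ssl/kafka.broker.truststore.jks
# A Kerberos keytab referenced from the broker's JAAS configuration must
# likewise live at the same path on every broker.
```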
Suggested Resources¶
- Podcast: Multi-Cluster Apache Kafka with Cluster Linking ft. Nikhil Bhatia
- Blog post: Project Metamorphosis Month 5: Global Event Streaming in Confluent Cloud
- Blog post: How Krake Makes Floating Workloads on Confluent Platform
- Cluster Linking Demo (Docker)
- Tutorial: Using Cluster Linking for Topic Data Sharing
- Commands for Cluster Linking
- Configuration Options for Cluster Linking
- Monitoring Cluster Metrics and Optimizing Links
- Cluster Linking Security
- Cluster Linking on Confluent Cloud