This feature is available as a preview feature. A preview feature is a component of Confluent Platform that is being introduced to gain early feedback from developers. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent.
## What is Cluster Linking?
Cluster Linking allows you to directly connect clusters together and mirror topics from one cluster to another without the need for Connect. Cluster Linking makes it much easier to build multi-datacenter, multi-cluster, and hybrid cloud deployments.
Unlike Replicator and MirrorMaker 2, Cluster Linking does not require running Connect to move messages from one cluster to another, and it preserves offsets from one cluster to the other. We call this "byte-for-byte" replication: whatever is on the source is mirrored precisely on the destination cluster.
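As a toy illustration (not actual Cluster Linking code), offset preservation means a consumer offset committed against the source remains valid on the mirror:

```python
# Toy model of byte-for-byte, offset-preserving mirroring.
# A topic partition is modeled as a list; the list index is the offset.
source_partition = [b"order-1", b"order-2", b"order-3"]  # offsets 0, 1, 2

# Cluster Linking copies records verbatim, so the mirror is identical
# and every record keeps the same offset it had on the source.
mirror_partition = list(source_partition)

# A consumer group that committed offset 1 against the source can resume
# from offset 1 on the destination and read the exact same record.
committed_offset = 1
assert mirror_partition[committed_offset] == source_partition[committed_offset]

# By contrast, a Connect-based replicator re-produces records on the
# destination, which may assign them different offsets, so committed
# offsets cannot simply be reused after a failover.
```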
Cluster Linking and Built-in Multi-Region Replication can be combined to create a highly available, durable, and distributed global eventing fabric. Use Built-in Multi-Region Replication when automatic client failover (low RTO) or RPO=0 is required for some topics. Use Cluster Linking when network quality is questionable, data centers are very far apart, or RTO goals can tolerate client reconfiguration.
The destination cluster must be running Confluent Server and the source cluster can either be Confluent Server or Apache Kafka® 2.4+. Cluster Linking is being introduced in Confluent Platform 6.0.0 as a preview feature.
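As a sketch, a cluster link is defined on the destination cluster by a small configuration file that points at the source cluster's bootstrap servers. The file name, hostnames, and security values below are illustrative placeholders; see Configuration Options for Cluster Linking for the supported properties.

```properties
# example-link.properties (illustrative): consumed when creating a
# cluster link on the destination cluster.
bootstrap.servers=source-broker-1:9092,source-broker-2:9092

# If the source cluster requires authentication, client-style security
# settings go in the same file (values here are placeholders).
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
```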
## Capabilities and Comparisons
Cluster Linking provides the following capabilities:
- Preserves offsets between clusters, making failover and reasoning about your system easier
- Increases replication speed by avoiding recompression
- Provides a cost-effective complement to an exclusive Built-in Multi-Region Replication solution because less data moves across data centers, thereby saving on replication costs and making inter-continental clusters viable
- Simplifies and streamlines Confluent Cloud operations
- Supports hybrid cloud architectures (write once, read many). Using Kafka and Cluster Linking to bridge infrastructure environments gives you a single system to monitor and secure, and a cost-efficient transport layer across on-premises deployments, Confluent Cloud, and your cloud providers
- Cluster Linking combined with standard or Built-in Multi-Region Replication clusters provides a global mesh of Kafka with very low Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) over great distances and across different network tolerances
- Lays the groundwork for aggregate clusters
- Requires Confluent Server, specifically Confluent Platform 5.4 or later (Apache Kafka® 2.4) on the source cluster and Confluent Platform 6.0.0 or later on the destination cluster
- Confluent Platform 7.0.0 requires an inter-broker protocol (IBP) of 2.7 or higher on both the source cluster and the destination cluster. An IBP of 2.7 is recommended, but not required, for Confluent Platform 6.1.x and 6.2.x; these earlier versions can use an IBP of 2.4 or higher. If you need to upgrade from a lower IBP, set the new version as the value of `inter.broker.protocol.version` on your brokers.
- Works with all clients
- Compatible with Ansible and Confluent for Kubernetes. To learn more, see Using Cluster Linking with Ansible and Using Cluster Linking with Confluent for Kubernetes.
- Provides support for authentication and authorization, as described in Cluster Linking Security, with some Known Limitations.
- The source cluster can be either Kafka or Confluent Server; the destination cluster must be Confluent Server, which is bundled with Confluent Enterprise.
- If the source cluster is Confluent Server, it must be Confluent Platform 5.4 or later. If the source cluster is Kafka, it must be Apache Kafka® 2.4 or later. (In either case, an IBP of 2.4 or later is required for deployments earlier than Confluent Platform 7.0.0, and an IBP of 2.7 or later is required for Confluent Platform 7.0.0 and later, as previously stated.)
- In addition to self-managed deployments on Confluent Platform, Cluster Linking is also available as a managed service on Confluent Cloud and in Hybrid Cloud Architectures.
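For example, the IBP requirement above translates to a single line in each broker's configuration file:

```properties
# server.properties on each broker (source and destination clusters):
# Confluent Platform 7.0.0 and later require 2.7 or higher;
# 6.1.x and 6.2.x can use 2.4 or higher.
inter.broker.protocol.version=2.7
```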
## Use Cases and Architectures
The following use cases can be achieved with the configurations and architectures shown. These deployments are demonstrated in Cluster Linking Demo (Docker) and Cluster Linking Tutorial.
### Topic Data Sharing
Use Case: Share the data in a handful of topics across two Kafka clusters.
(Diagram: topics mirrored over a cluster link from the source cluster to the destination cluster.)
For topic sharing, data moves from the source to the destination cluster by means of a cluster link. Mirror topics are associated with a cluster link. Consumers on the destination cluster can read messages produced on the source cluster from local, read-only mirror topics. If the original topic on the source cluster is removed for any reason, you can stop mirroring that topic and convert it to a read/write topic on the destination.
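The flow above can be sketched from the destination side with the Cluster Linking CLI tools. Hostnames, link names, and topic names are placeholders, and exact flag names vary across Confluent Platform releases; see Commands for Cluster Linking for the authoritative syntax.

```shell
# 1. Create the link on the destination cluster, pointing at the source
#    cluster via a config file containing its bootstrap.servers.
kafka-cluster-links --bootstrap-server dest-broker:9092 \
  --create --link example-link \
  --config-file example-link.properties

# 2. Create a read-only mirror topic fed by the link (same name as the
#    source topic, since topic renaming is not supported).
kafka-mirrors --create --mirror-topic orders \
  --link example-link \
  --bootstrap-server dest-broker:9092

# 3. If the source topic goes away, stop mirroring and convert the
#    mirror into a normal read/write topic on the destination.
kafka-mirrors --promote --topics orders \
  --bootstrap-server dest-broker:9092
```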
### Cluster Migration

Use Case: Move from an on-premises Kafka cluster to a Confluent Cloud Kafka cluster, or from an older cluster version to a newer one. The native offset preservation you get by leveraging Confluent Server on the brokers makes this much easier to do with Cluster Linking than with Connect-based methods.
### Hybrid Cloud Architectures
Use Case: Deploy an ongoing data funnel for a few topics from an on-premises environment to Confluent Cloud. Cluster Linking provides a network-partition-tolerant architecture that supports this nicely (momentarily losing a network connection does not materially affect the data on any particular cluster), whereas trying this with stretch clusters requires a highly reliable and robust network.
## Known Limitations

- Use of the Cluster Linking preview is not recommended for production use cases. Some Kafka features are not supported in this preview.
- The source cluster must be Confluent Platform 5.4 or later (Apache Kafka® 2.4) and the destination cluster must be Confluent Platform 6.0.0 or later.
- Topic renaming is currently not supported, so the source topic and destination mirrored topic must have the same name.
- Links are unidirectional. You cannot configure a single link to accomplish bidirectional replication; you must use two links for this.
- Connections are established from the destination cluster only. Source clusters must be configured to accept inbound connection requests from destination brokers. This will satisfy some but not all security models.
- Disaster recovery with Cluster Linking is currently not recommended. The replication mechanism is offset-preserving (byte-for-byte), but support for reliable failback is not yet built out. "Failing forward" to another cluster is the best option with Cluster Linking at this time.
- Aggregate topics are currently not supported. Topics cannot be aggregated from multiple clusters into one cluster. For now, continue to use Confluent Replicator for this.
- All SSL keystores, truststores, and Kerberos keytab files must be stored at the same location on every broker in a given cluster; otherwise, cluster links may fail.
- ACL migration (ACL sync), previously available in Confluent Platform 6.0.0 through 6.2.x, was removed due to a security vulnerability. If you are using ACL migration in your deployments, you should disable it. To learn more, see Authorization (ACLs).
## Suggested Reading

- Blog post: Project Metamorphosis Month 5: Global Event Streaming in Confluent Cloud
- Blog post: How Krake Makes Floating Workloads on Confluent Platform
- Cluster Linking Demo (Docker)
- Cluster Linking Tutorial
- Commands for Cluster Linking
- Configuration Options for Cluster Linking
- Monitoring Cluster Metrics and Optimizing Links
- Cluster Linking Security
- Cluster Linking on Confluent Cloud