Cluster Linking Preview

Important

This feature is available as a preview feature. A preview feature is a component of Confluent Platform that is being introduced to gain early feedback from developers. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent.

What is Cluster Linking?

Cluster Linking allows you to directly connect clusters together and mirror topics from one cluster to another without the need for Connect. Cluster Linking makes it much easier to build multi-datacenter, multi-cluster, and hybrid cloud deployments.

Unlike Replicator and MirrorMaker 2, Cluster Linking does not require running Connect to move messages from one cluster to another, and it preserves offsets from the source cluster on the destination cluster. We call this “byte-for-byte” replication: whatever is on the source is mirrored precisely on the destination cluster.
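To make the value of offset preservation concrete, here is a minimal toy model in Python. It is not Kafka code and does not use any Confluent API; the `Log` class and both copy functions are hypothetical names invented for illustration. It contrasts a copy that keeps the source offsets with a copy that re-produces each record, which lets the destination assign fresh offsets.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Toy partition log: maps offset -> message value."""
    records: dict = field(default_factory=dict)
    next_offset: int = 0

    def append(self, value):
        """Producer-style write: the log assigns the next free offset."""
        offset = self.next_offset
        self.records[offset] = value
        self.next_offset += 1
        return offset

    def truncate_before(self, offset):
        """Simulate retention removing records below the given offset."""
        self.records = {o: v for o, v in self.records.items() if o >= offset}


def mirror_preserving_offsets(source):
    """Copy records together with their original offsets."""
    dest = Log()
    dest.records = dict(source.records)
    dest.next_offset = source.next_offset
    return dest


def copy_by_reproducing(source):
    """Re-produce each record, so the destination assigns new offsets."""
    dest = Log()
    for offset in sorted(source.records):
        dest.append(source.records[offset])
    return dest


source = Log()
for i in range(5):
    source.append(f"event-{i}")
source.truncate_before(2)  # offsets 0 and 1 have aged out on the source

committed = 3  # a consumer on the source will read offset 3 next

mirrored = mirror_preserving_offsets(source)
reproduced = copy_by_reproducing(source)

print(source.records[committed])          # event-3
print(mirrored.records[committed])        # event-3: same offset, same record
print(reproduced.records.get(committed))  # None: offsets were renumbered 0..2
```

In the offset-preserving copy, a consumer position recorded against the source points at the same record on the destination. In the re-produced copy, the same position no longer refers to the same record, so some form of offset translation would be needed.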

Cluster Linking and Built-in Multi-Region Replication can be combined to create a highly available, durable, and distributed global eventing fabric. Use Built-in Multi-Region Replication when automatic client failover (low RTO) or RPO=0 is required on some topics. Use Cluster Linking when network quality is questionable, data centers are very far apart, or RTO goals can tolerate client reconfiguration.

Note

The destination cluster must be running Confluent Server; the source cluster can be either Confluent Server or Kafka 2.4+. Cluster Linking is being introduced in Confluent Platform 6.0.0 as a preview feature.

Capabilities and Comparisons

Cluster Linking provides the following capabilities:

  • Preserves offsets between clusters, making failover and reasoning about your system easier
  • Increases replication speed by avoiding recompression
  • Provides a cost-effective complement to an exclusive Built-in Multi-Region Replication solution because less data moves across data centers, thereby saving on replication costs and making inter-continental clusters viable
  • Simplifies and streamlines Confluent Cloud operations
  • Supports hybrid cloud architectures (write once, read many). Using Kafka and Cluster Linking to bridge infrastructure environments gives you a single system to monitor and secure, and a cost-efficient transport layer across on-premises deployments, Confluent Cloud, and your cloud providers
  • Provides, in combination with standard or Built-in Multi-Region Replication clusters, a global mesh of Kafka with very low Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) over great distances and across different network tolerances
  • Lays the groundwork for aggregate clusters

What’s Supported

  • Requires Confluent Platform 5.4 or later, or Apache Kafka® 2.4 or later, on the source cluster, and Confluent Server (Confluent Platform 6.0.0 or later) on the destination cluster
  • Works with all clients
  • Compatible with Ansible and Confluent Operator
  • Provides support for authentication and authorization, as described in Cluster Linking Security, with some Known Limitations
  • Source cluster can be either Kafka or Confluent Server; destination cluster must be Confluent Server, which is bundled with Confluent Enterprise

Use Cases and Architectures

The following use cases can be achieved with the configurations and architectures shown. These deployments are demonstrated in Cluster Linking Demo (Docker) and Cluster Linking Tutorial.

Topic Data Sharing

Use Case: Share the data in a handful of topics across two Kafka clusters.

  • source cluster
  • destination cluster

For topic sharing, data moves from the source to the destination cluster by means of a cluster link. Mirror topics are associated with a cluster link. Consumers on the destination cluster can read messages produced on the source cluster from local, read-only mirror topics. If an original topic on the source cluster is removed for any reason, you can stop mirroring that topic and convert it to a read/write topic on the destination.
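The lifecycle described above (read-only while mirroring, read/write after mirroring stops) can be sketched as a small state model. This is an illustrative Python sketch only, not a Kafka or Confluent API: the class `DestinationTopic`, its methods, and the link name `example-link` are all hypothetical. It also assumes that records already mirrored remain on the topic after mirroring stops.

```python
class ReadOnlyTopicError(Exception):
    """Raised when a local producer writes to a topic that is still a mirror."""


class DestinationTopic:
    """Toy model of a topic on the destination cluster (not a Kafka API)."""

    def __init__(self, name, link_name):
        self.name = name
        self.link_name = link_name   # the cluster link this mirror is associated with
        self.mirroring = True        # mirror topics start out read-only
        self.records = []

    def apply_from_source(self, value):
        """Called on behalf of the link: append a record fetched from the source."""
        if not self.mirroring:
            raise RuntimeError("mirroring has been stopped for this topic")
        self.records.append(value)

    def produce(self, value):
        """Local producer write; rejected while the topic is a mirror."""
        if self.mirroring:
            raise ReadOnlyTopicError(f"{self.name} is a read-only mirror topic")
        self.records.append(value)

    def stop_mirroring(self):
        """Convert the mirror into an ordinary read/write topic."""
        self.mirroring = False


topic = DestinationTopic("orders", link_name="example-link")
topic.apply_from_source("order-1")       # data arriving over the link

try:
    topic.produce("local-order")         # rejected: still mirroring
except ReadOnlyTopicError as err:
    print(err)                           # orders is a read-only mirror topic

topic.stop_mirroring()
topic.produce("local-order")             # accepted: now read/write
print(topic.records)                     # ['order-1', 'local-order']
```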

Cluster Migration

Use Case: Move from an on-premises Kafka cluster to a Confluent Cloud Kafka cluster, or from an older version to a newer version. The native offset preservation you get by leveraging Confluent Server on the brokers makes this much easier to do with Cluster Linking than with other Connect-based methods.

Hybrid Cloud Architectures

Use Case: Deploy an ongoing data funnel for a few topics from an on-premises environment to Confluent Cloud. Cluster Linking provides a network-partition-tolerant architecture that supports this well (losing a network connection momentarily does not materially affect the data on any particular cluster), whereas trying this with stretch clusters requires a highly reliable and robust network.

Known Limitations

  • Use of the Cluster Linking preview is not recommended for production use cases. Some Kafka features are not supported in this preview.
  • The source cluster must be Confluent Platform 5.4 or later (Apache Kafka® 2.4) and the destination cluster must be Confluent Platform 6.0.0 or later.
  • Topic renaming is currently not supported, so the source topic and destination mirrored topic must have the same name.
  • Links are unidirectional. You cannot configure a single link to accomplish bidirectional replication; you must use two links for this.
  • Connections are established from the destination cluster only. Source clusters must be configured to accept inbound connection requests from destination brokers. This will satisfy some but not all security models.
  • Disaster Recovery is currently not recommended with Cluster Linking. The replication mechanism is offset preserving (byte-for-byte), but support for reliable failback is not built out yet. “Failing forward” to another cluster is the best option for Cluster Linking at this time.
  • Aggregate topics are currently not supported. Topics cannot be aggregated from multiple clusters into one cluster. For now, continue to use Confluent Replicator for this.
  • All SSL key stores, trust stores and Kerberos keytab files must be stored at the same location on each broker in a given cluster. If not, Cluster Links may fail.
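Several of the limitations above are mechanical enough to check before setting up a link. The following Python sketch shows one way such a pre-flight check might look. It is not part of Confluent Platform: `Cluster`, `LinkPlan`, and `check_plan` are hypothetical names, the version tuples are illustrative, and only the version, Confluent Server, and topic-naming rules from the list are covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    name: str
    kafka_version: tuple               # always three parts, e.g. (2, 4, 0)
    platform_version: Optional[tuple]  # Confluent Platform version, None for plain Apache Kafka
    confluent_server: bool


@dataclass
class LinkPlan:
    source: Cluster
    destination: Cluster
    topic_map: dict                    # source topic name -> mirror topic name


def check_plan(plan):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if plan.source.kafka_version < (2, 4, 0):
        problems.append("source must be Kafka 2.4 (Confluent Platform 5.4) or later")
    dest = plan.destination
    if (not dest.confluent_server
            or dest.platform_version is None
            or dest.platform_version < (6, 0, 0)):
        problems.append(
            "destination must be Confluent Server on Confluent Platform 6.0.0 or later")
    for src_topic, mirror_topic in plan.topic_map.items():
        if src_topic != mirror_topic:
            problems.append(
                f"topic renaming is not supported: {src_topic} -> {mirror_topic}")
    return problems


# Illustrative clusters; names and versions are made up for the example.
onprem = Cluster("onprem", (2, 4, 0), None, False)
cloud = Cluster("cloud", (2, 6, 0), (6, 0, 0), True)
legacy = Cluster("legacy", (2, 3, 0), None, False)

good = LinkPlan(onprem, cloud, {"orders": "orders"})
bad = LinkPlan(legacy, onprem, {"orders": "orders-copy"})

print(check_plan(good))  # []
for problem in check_plan(bad):
    print(problem)
```

Because each link is unidirectional, a bidirectional setup would be modeled as two `LinkPlan` objects, one per direction, each checked separately.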