Configure Cluster Linking on Confluent Platform

This page describes how to configure Cluster Linking with various Confluent tools, products, and security options.

Using Cluster Linking with Confluent for Kubernetes

You can use Cluster Linking with Confluent Platform deployed with Confluent for Kubernetes.

Confluent for Kubernetes 2.2 introduced built-in Cluster Linking support, as described in this section of the CFK documentation: Cluster Linking using Confluent for Kubernetes.

To configure Cluster Linking on earlier versions of CFK, use configOverrides in the Kafka custom resource. See Configuration Overrides in the CFK documentation for more information about using configOverrides.

Also, releases earlier than Confluent Platform 7.0.0 required a configOverrides section on the server to specify confluent.cluster.link.enable: "true". In Confluent Platform 7.0.0 and later releases, Cluster Linking is enabled by default, so this element of the configuration is no longer needed, regardless of the Confluent for Kubernetes version.

For example:

apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
spec:
  replicas: 3
  image:
    application: confluentinc/cp-server:7.6.0
    init: confluentinc/confluent-init-container:2.0.1
  configOverrides:
    server:
      - confluent.cluster.link.enable=true # Enable Cluster Linking

Using Cluster Linking with Ansible

You can use Cluster Linking with Confluent Platform deployed with Ansible.

Starting in Confluent Platform 7.0.0, Cluster Linking is enabled by default, so no changes are needed to the configuration file.

Tip

Releases earlier than Confluent Platform 7.0.0 required that you add a broker configuration property to the kafka_broker_custom_properties section of the inventory, as described in Configure Confluent Platform with Ansible, to set confluent.cluster.link.enable: "true". If you are upgrading from an earlier release, you can delete this configuration because it is no longer needed.

Common Configuration Options

The following subset of common properties, although not specific to Cluster Linking, may be particularly relevant to setting up and managing cluster links. These are common across Confluent Platform for clients, brokers, and security configurations, and are described in their respective sections per the links provided.

Client Configurations

For a full list of AdminClient configurations, see Kafka AdminClient Configurations for Confluent Platform.

  • bootstrap.servers
  • client.dns.lookup
  • metadata.max.age.ms
  • retry.backoff.ms
  • request.timeout.ms

Client SASL and TLS/SSL Configurations

  • sasl.client.callback.handler.class
  • sasl.jaas.config
  • sasl.kerberos.kinit.cmd
  • sasl.kerberos.min.time.before.relogin
  • sasl.kerberos.service.name
  • sasl.kerberos.ticket.renew.jitter
  • sasl.kerberos.ticket.renew.window.factor
  • sasl.login.callback.handler.class
  • sasl.login.class
  • sasl.login.refresh.buffer.seconds
  • sasl.login.refresh.min.period.seconds
  • sasl.login.refresh.window.factor
  • sasl.login.refresh.window.jitter
  • sasl.mechanism
  • security.protocol
  • ssl.cipher.suites
  • ssl.enabled.protocols
  • ssl.endpoint.identification.algorithm
  • ssl.engine.factory.class
  • ssl.key.password
  • ssl.keymanager.algorithm
  • ssl.keystore.location
  • ssl.keystore.password
  • ssl.keystore.type
  • ssl.protocol
  • ssl.provider
  • ssl.secure.random.implementation
  • ssl.trustmanager.algorithm
  • ssl.truststore.location
  • ssl.truststore.password
  • ssl.truststore.type
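
As a rough illustration of how a few of the client and security properties listed above might be combined in a cluster link configuration file, consider the sketch below. It assumes a remote cluster that uses SASL_SSL with the PLAIN mechanism. The host names, credentials, truststore path, and numeric values are placeholders chosen for the example, not recommended settings.

```properties
# Illustrative cluster link configuration (placeholders throughout)

# Client settings: where the remote cluster is and how requests behave
bootstrap.servers=<remote-host-1:port>,<remote-host-2:port>
request.timeout.ms=30000
retry.backoff.ms=100

# Security settings: assumes the remote listener uses SASL_SSL with PLAIN
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<username>" password="<password>";

# TLS trust material for verifying the remote brokers (hypothetical path)
ssl.truststore.location=<path-to-truststore.jks>
ssl.truststore.password=<truststore-password>
```

Only the properties that differ from their defaults need to appear in such a file; the rest can be omitted.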

Configuring Reconnection Speed and Behavior

A cluster link has two sets of configuration options that control connections. Each set defines an initial value and a maximum for an exponential backoff. These are the same options that Apache Kafka® clients have.

  • reconnect.backoff.ms and reconnect.backoff.max.ms - These options determine how soon the cluster link retries after a connection failure. For cluster links, the defaults are 50ms and 10s, respectively.
  • socket.connection.setup.timeout.ms and socket.connection.setup.timeout.max.ms - These options determine how long the cluster link waits for a connection attempt to succeed before giving up and retrying after a “reconnect backoff”. The defaults are 10s and 30s, respectively.

On Confluent Platform clusters, reducing the values for these options may give faster reconnection speeds, at the expense of CPU and network usage.

These options cannot be updated on cluster links that have a Confluent Cloud destination cluster.
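
The following sketch shows how these four options might appear in a cluster link configuration file on Confluent Platform. The reduced values are arbitrary examples that only illustrate the trade-off described above. The defaults are noted in the comments.

```properties
# Illustrative tuning for faster reconnection (values are examples only)

# Retry sooner after a connection failure
reconnect.backoff.ms=25                      # default for cluster links: 50
reconnect.backoff.max.ms=5000                # default for cluster links: 10000

# Give up on a stalled connection attempt sooner
socket.connection.setup.timeout.ms=5000      # default: 10000
socket.connection.setup.timeout.max.ms=15000 # default: 30000
```

Lower values mean more frequent connection attempts, so it is worth watching broker CPU and network usage after a change like this.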

Bidirectional Cluster Linking

Cluster Linking bidirectional mode (a bidirectional cluster link) enables better Disaster Recovery and active/active architectures, with data and metadata flowing bidirectionally between two or more clusters.

Mental model

A useful analogy is to consider a cluster link as a bridge between two clusters.

  • By default, a cluster link is a one-way bridge: topics go from a source cluster to a destination cluster, with data and metadata always flowing from source to destination.
  • In contrast, a bidirectional cluster link is a two-way bridge: topics on either side can go to the other cluster, with data and metadata flowing in both directions.

In the case of a “bidirectional” cluster link, there is no “source” or “destination” cluster. Both clusters are equal, and can function as a source or destination for the other cluster. Each cluster sees itself as the “local” cluster and the other cluster as the “remote” cluster.

Benefits

Bidirectional cluster links are advantageous in disaster recovery (DR) architectures, and certain types of migrations, as described below.

Disaster recovery

Bidirectional cluster links are useful for Disaster Recovery, both active/passive and active/active.

  • In a disaster recovery setup, two clusters in different regions are deployed so that at least one cluster is available at all times, even if a region experiences an outage. A bidirectional cluster link ensures both regions have the latest data and metadata from the other region, should one of them fail, or should applications need to be rotated from region to region for DR testing.
  • It is easier to test and practice DR by moving producers and consumers to the DR cluster and reversing the direction of data and metadata, with fewer commands and moving pieces.
  • In an active/passive setup, a bidirectional cluster link ensures consumer offsets from the DR region get synced to the primary region, so that consumer applications from the DR region can be moved or failed over to the primary region.
  • In an active/active setup, a bidirectional cluster link ensures that consumer offsets are synced to both clusters, so that consumers and producers can easily fail over to the other cluster and resume from the right place.

Consumer-last migrations

Bidirectional cluster links are useful for certain types of migrations, where consumers are migrated after producers.

  • In most migrations from an old cluster to a new cluster, a default cluster link suffices because consumers are migrated before or at the same time as producers.
  • If there are straggling consumers on the old cluster, a bidirectional cluster link can help by ensuring their consumer offsets flow to the new cluster and are available when these consumers need to migrate. A default cluster link does not do this.

Tip

Bidirectional cluster links can only be used for this use case if the clusters fit the supported combinations described below.

Restrictions and limitations

Bidirectional mode for Cluster Linking requires both clusters to be of a supported type. It is not supported if either cluster is a Basic or Standard Confluent Cloud cluster, runs Confluent Platform 7.4 or earlier, or runs open source Apache Kafka®.

Consumer group prefixing cannot be enabled for bidirectional links. Setting consumer.group.prefix.enable to true on a bidirectional cluster link will result in an “invalid configuration” error stating that the cluster link cannot be validated due to this limitation.

Security

The cluster link needs one or more principals to represent it on each cluster, and those principals are given cluster permissions through ACLs or RBAC, consistent with how authentication and authorization work for Cluster Linking. To learn more about authentication and authorization for Cluster Linking, see Manage Security for Cluster Linking on Confluent Cloud and Manage Security for Cluster Linking on Confluent Platform.

On Confluent Cloud, the same service account or identity pool can be used for both clusters, or two separate service accounts and identity pools can be used.

Default security config for bidirectional connectivity

By default, a cluster link in bidirectional mode is configured similarly to the default configuration for two cluster links.

Each cluster requires:

  • The ability to connect (outbound) to the other cluster. (If this is not possible, see Advanced options for bidirectional Cluster Linking.)
  • A user to create a cluster link object on it with:
    • An authentication configuration (such as API key or OAuth) for a principal on its remote cluster with ACLs or RBAC role bindings giving permission to read topic data and metadata.
      • The Describe:Cluster ACL
      • The DescribeConfigs:Cluster ACL if consumer offset sync is enabled (which is recommended)
      • The required ACLs or RBAC role bindings for a cluster link, as described in Authorization (ACLs) (the rows for a cluster link on a source cluster).
    • link.mode=BIDIRECTIONAL
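
To make the requirements above concrete, here is a minimal sketch of the link configuration that might be supplied when creating the cluster link object on one of the two clusters (call it cluster A). It assumes the remote cluster (cluster B) authenticates clients with SASL_SSL and PLAIN, for example with an API key. All host names and credentials are placeholders.

```properties
# Link configuration supplied on cluster A; the "remote" cluster is cluster B
link.mode=BIDIRECTIONAL

# How cluster A reaches cluster B (outbound connection)
bootstrap.servers=<cluster-B-host:port>

# Credentials for a principal on cluster B that holds the ACLs or
# RBAC role bindings listed above (assumed mechanism: SASL_SSL + PLAIN)
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<cluster-B-api-key>" password="<cluster-B-api-secret>";

# Consumer offset sync; requires the DescribeConfigs:Cluster ACL noted above
consumer.offset.sync.enable=true
```

The link object created on cluster B would mirror this file, pointing bootstrap.servers at cluster A and using credentials for a principal on cluster A.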

Note

In some cases, only one cluster can reach the other. For example, one of the clusters might be in a private network or private datacenter while the other is not. For details on how to configure a bidirectional link in this scenario, see Advanced options for bidirectional Cluster Linking.

Advanced options for bidirectional Cluster Linking

In advanced situations, security requirements may dictate that only one cluster can reach the other, that security credentials be stored on only one cluster, or both. For example, one of the clusters might have private networking or be located in a datacenter while the other cluster is configured with Internet networking.

An advanced option for bidirectional Cluster Linking is a “unidirectional” security configuration for private-to-public or Confluent Platform to Confluent Cloud with a source-initiated link. In this case, a bidirectional cluster link can be configured such that only the more privileged (private) cluster connects to the remote cluster, and not the other way around. This is similar to a source-initiated cluster link.

Required Configurations for Control Center

Cluster Linking requires embedded v3 Confluent REST Proxy to communicate with Confluent Control Center and properly display mirror topics on the Control Center UI. If the REST configurations are not implemented, mirror topics will display in Control Center as regular topics, showing inaccurate information. (To learn more, see Known limitations and best practices.)

Configure REST Endpoints in the Control Center properties file

If you want to use Control Center with Cluster Linking, you must configure the Control Center cluster with REST endpoints to enable HTTP servers on the brokers. If this is not configured properly for all brokers, Cluster Linking will not be accessible from Confluent Control Center.

In the appropriate Control Center properties file (for example $CONFLUENT_HOME/etc/confluent-control-center/control-center-dev.properties or control-center.properties), use confluent.controlcenter.streams.cprest.url to define the REST endpoints for controlcenter.cluster. The default is http://localhost:8090, as shown below.

# Kafka REST endpoint URL
confluent.controlcenter.streams.cprest.url=http://localhost:8090

Identify the associated URL for each broker. If you have multiple brokers in the cluster, use a comma-separated list.
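
For a multi-broker cluster, the comma-separated form might look like the sketch below. The three host names are hypothetical, and port 8090 is assumed only because it matches the default shown earlier.

```properties
# Kafka REST endpoint URLs, one per broker (hypothetical host names)
confluent.controlcenter.streams.cprest.url=http://broker-1.example.com:8090,http://broker-2.example.com:8090,http://broker-3.example.com:8090
```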

See also

confluent.controlcenter.streams.cprest.url in the Control Center Configuration Reference

Configure authentication for REST endpoints on Kafka brokers (Secure Setup)

Tip

  • Cluster Linking does not require the Metadata Service (MDS) or security to run, but if you want to configure security, you can get started with the following example which shows an MDS client configuration for RBAC.
  • You can use confluent.metadata.server.listeners (which will enable the Metadata Service) instead of confluent.http.server.listeners to listen for API requests. Use either confluent.metadata.server.listeners or confluent.http.server.listeners, but not both. If a listener uses HTTPS, then appropriate TLS/SSL configuration parameters must also be set. To learn more, see Admin REST APIs Configuration Options for Confluent Server.
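
As a sketch of the listener setting mentioned in the tip above, a broker server.properties file might enable the embedded HTTP server as follows. The bind address and port are assumptions chosen to line up with the http://localhost:8090 Control Center default. An HTTPS listener would additionally need the TLS/SSL parameters described in the referenced configuration page.

```properties
# Broker-side HTTP listener for API requests (illustrative address and port)
# Use this OR confluent.metadata.server.listeners, but not both.
confluent.http.server.listeners=http://0.0.0.0:8090
```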

To run Cluster Linking in a secure setup, you must configure authentication for REST endpoints in each of the Kafka broker server.properties files. If the Kafka broker files are missing these configs, Control Center will not be able to access Cluster Linking in a secure setup.

At a minimum, you will need the following configurations.

# EmbeddedKafkaRest: HTTP Auth Configuration
kafka.rest.kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension
kafka.rest.rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler

Here is an example of an MDS client configuration for Kafka RBAC in a broker server.properties file.

# EmbeddedKafkaRest: Kafka Client Configuration
kafka.rest.bootstrap.servers=<host:port>, <host:port>, <host:port>
kafka.rest.client.security.protocol=SASL_PLAINTEXT

# EmbeddedKafkaRest: HTTP Auth Configuration
kafka.rest.kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension
kafka.rest.rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
kafka.rest.public.key.path=<rbac_enabled_public_pem_path>

# EmbeddedKafkaRest: MDS Client configuration
kafka.rest.confluent.metadata.bootstrap.server.urls=<host:port>, <host:port>, <host:port>
kafka.rest.ssl.truststore.location=<truststore_location>
kafka.rest.ssl.truststore.password=<password>
kafka.rest.confluent.metadata.http.auth.credentials.provider=BASIC
kafka.rest.confluent.metadata.basic.auth.user.info=<user:password>
kafka.rest.confluent.metadata.server.urls.max.age.ms=60000
kafka.rest.client.confluent.metadata.server.urls.max.age.ms=60000

Disabling Cluster Linking

To disable Cluster Linking on a cluster running Confluent Enterprise version 7.0.0 or later, add the following line to the broker configuration on the destination cluster (for example $CONFLUENT_HOME/etc/server.properties).

confluent.cluster.link.enable=false

This will disable creating cluster links with that cluster as the destination, or “source initiated cluster links” with that cluster as the source. Note: this will not disable creating a destination-initiated cluster link with this cluster as its source.

Cluster Linking cannot be enabled or disabled through dynamic configuration. Either enable it before starting the brokers (it is on by default starting with Confluent Platform 7.0.0) or, to enable it on a running cluster where it was previously turned off, set confluent.cluster.link.enable=true on the brokers and restart them as a rolling update.

Understanding Listeners in Cluster Linking

For a forward connection, the target server knows which listener the connection came in on and associates the listener with that connection. When a metadata request arrives on that connection, the server returns metadata corresponding to the listener.

For example, in Confluent Cloud, when a client on the external listener asks for the leader of topicA, it always gets the external endpoint of the leader and never the internal one, because the system knows the listener name from the connection.

For reverse connections, the target server (that is, the source cluster) established the connection. When the connection is reversed, this target server needs to know which listener to associate the reverse connection with; that is, for example, which endpoint it should return to the destination for its leader requests.

By default, the reverse connection is associated with the listener on the source cluster that was used to create the source link. In most cases this is sufficient because typically a single external listener is used. On Confluent Cloud, this default is used and you cannot override it.

On self-managed Confluent Platform, you have the option to override the default listener/connection association. This provides the flexibility to create the source link on an internal listener but associate the external listener with the reverse connection.

The local.listener.name configuration refers to the source cluster listener name. By default, this is the listener that was used to create the source link. If you want to use a different listener, you must explicitly configure it. If Confluent Cloud is the source, the listener is the external listener (the default) and cannot be overridden.

For the destination, the listener is determined by bootstrap.servers and cannot be overridden.
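
The listener override described above might be expressed in the source-side link configuration of a source-initiated link on self-managed Confluent Platform, as in the sketch below. The listener name EXTERNAL is hypothetical and must match a listener actually defined on the source brokers. The link.mode and connection.mode lines reflect the usual source-initiated setup and are included as assumptions, with security settings omitted for brevity.

```properties
# Source-side configuration for a source-initiated link (illustrative)
link.mode=SOURCE
connection.mode=OUTBOUND

# Destination cluster to which the source opens the connection
bootstrap.servers=<destination-host:port>

# Associate reverse connections with this source listener instead of
# the listener on which the source link was created (hypothetical name)
local.listener.name=EXTERNAL
```

With this override, the source link can be created over an internal listener while the destination still receives external endpoints in its metadata responses.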