Troubleshoot Cluster Linking on Confluent Platform
This page describes common errors you may encounter when creating cluster links, and how to address them. Task error codes for cluster link and mirror topic commands are also listed.
Errors when creating a cluster link
Creating a cluster link can fail with one of the following error codes. Each error also returns a customized message with context about the clusters.
Start here if you are having trouble creating a cluster link:
If cluster link creation fails, review the error codes in this section.
If an existing cluster link is unavailable, see Troubleshoot an unavailable cluster link.
If mirror topic transitions fail, see Troubleshoot mirror topics.
UNRESOLVABLE_BOOTSTRAP_ERROR
Meaning
The destination cluster or, for a source-initiated link, the source cluster, could not resolve the provided bootstrap server to an IP address. In other words, there was no DNS entry for this bootstrap server in the cluster’s DNS.
Solutions
If the provided source cluster bootstrap server is a URL with an FQDN that is not Confluent Cloud (for example, kafka.example.com:9092), then the DNS is not properly configured for the destination cluster to resolve that FQDN to an IP address.
If the destination cluster is a Confluent Cloud cluster, this error is likely because your FQDN relies on your private DNS; there is no way to provide Confluent Cloud with access to your private DNS. In this case, you have two options:
The best way to resolve this error is to use an FQDN that is registered in a public DNS. Register the FQDN with a public DNS, pointing to the appropriate IP address of your broker, proxy, or load balancer, so that Confluent Cloud can resolve this FQDN to an IP that it can reach. You can register internal IPs in a public DNS for the purposes of a load balancer.
Alternatively, use an IP address (such as 10.10.4.24:9092) for the bootstrap server, instead of a host name. Your source cluster must have an IP address advertised listener on every broker before you can use Cluster Linking with an IP address.
If the destination cluster is a Confluent Platform cluster, then you can attempt to fix the destination cluster’s DNS. Ensure the route to your source cluster allows traffic over port 9092.
If the provided source cluster bootstrap server is for Confluent Cloud, for example, pkc-lmn0p.us-west-2.aws.confluent.cloud:9092, then contact Confluent Support.
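To check whether a bootstrap server name resolves at all, a small script can help. The following is a minimal sketch, not a Confluent tool. The function names `parse_bootstrap` and `resolve_bootstrap` are hypothetical, and the sketch does not handle bracketed IPv6 literals. It reports resolution only from the machine where it runs, so the result reflects the destination cluster's view only if that machine uses the same DNS as the destination cluster.

```python
import socket


def parse_bootstrap(server: str) -> tuple[str, int]:
    """Split a 'host:port' bootstrap server string into (host, port)."""
    host, _, port = server.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {server!r}")
    return host, int(port)


def resolve_bootstrap(server: str) -> list[str]:
    """Return the sorted IP addresses the host resolves to, or [] if resolution fails."""
    host, port = parse_bootstrap(server)
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # No DNS entry (or no resolver reachable) from this machine.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

For example, `resolve_bootstrap("kafka.example.com:9092")` returns an empty list when the local resolver has no entry for that FQDN, which is the condition this error code describes.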
BOOTSTRAP_TCP_CONNECTION_FAILED_ERROR
Meaning
The destination cluster cannot reach an Apache Kafka® cluster at the provided source bootstrap server.
Solutions
What if both clusters are Confluent Cloud clusters?
First verify that your source cluster is up and running, and in a healthy state. If it isn’t, this causes the error.
If the source cluster has internet networking, contact Confluent Support.
If the source cluster has Transit Gateway networking, follow the steps to troubleshoot Transit Gateway connectivity.
If the source cluster does not have internet networking or Transit Gateway networking, make sure that the two Confluent Cloud clusters exist in the same Confluent Cloud network, region, and cloud provider. To learn more about supported cluster combinations on Confluent Cloud, see Supported cluster types in the overview and Supported cluster combinations for private networking.
What if the source cluster is Confluent Platform or Kafka and the destination cluster is Confluent Cloud?
If your Confluent Cloud cluster cannot reach your source cluster, you must resolve this. Providing network access for Confluent Cloud to reach your source cluster is your responsibility.
If you are intending to access a source cluster over the internet, validate that all your source cluster’s brokers have public internet IP addresses. You can verify this by consuming from your cluster, for example, with kafka-console-consumer, on a machine that does not have VPN or other private access to your cluster. For the consumer configuration, enter the bootstrap servers and security configuration that you provided to the cluster link.
Depending on your network configuration, validate the following:
Supported networking type: The Confluent Cloud cluster must use a supported private networking type to reach your private network. This includes VPC/VNet peering, AWS Transit Gateway or GCP Route Import, or PrivateLink/Private Service Connect (PSC), available for Enterprise and Dedicated clusters.
Connectivity testing: Test that the machines that host your source cluster brokers have connectivity to the Confluent Cloud cluster, as described in Test connectivity to Confluent Cloud. In some cases, you can also test connectivity from a Confluent Cloud VPC or VNet to your source cluster by using the AWS VPC Reachability Analyzer.
PrivateLink/Private Service Connect (PSC): Validate that outbound private endpoints (egress) are properly provisioned and mapped so Confluent Cloud can securely route traffic to your self-managed Confluent Platform or Kafka cluster.
VPC/VNet peering: If the source cluster is in a cloud VPC or VNet and the Confluent Cloud cluster uses VPC peering or VNet peering, the Confluent Cloud VPC must be peered to the VPC that hosts the source cluster.
Transit Gateway: If the source cluster is in a cloud VPC or VNet and the Confluent Cloud cluster uses Transit Gateway networking, configure routing to enable two-way communication between the Confluent Cloud VPC and the VPC hosting the source cluster.
On-premises environments: If the source cluster is not hosted in a public cloud, such as in an on-premises datacenter, use AWS Transit Gateway or GCP Route Import to provide connectivity between your cluster host machines and Confluent Cloud.
Firewall and ports: Verify that your security groups, firewalls, or ACLs allow inbound traffic from Confluent Cloud on the broker port, typically 9092 or 9094.
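A quick way to test raw TCP reachability of a broker port is to attempt a connection from a machine on the relevant network path. The following is a minimal sketch under that assumption. The function name `can_connect` is hypothetical, and a successful connection from your own machine does not prove that Confluent Cloud can reach the same address.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run it against each advertised broker address, not only the bootstrap server, for example `can_connect("10.10.4.24", 9092)`. A `False` result points to routing, peering, or firewall rules rather than to authentication.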
AUTHENTICATION_ERROR
Meaning
The security credentials provided to the cluster link could not authenticate with the source cluster.
Solutions
Confirm the security configuration that you assigned your cluster link.
For a Confluent Cloud source cluster, confirm that your link configuration has these properties:
If using API keys:
- security.protocol=SASL_SSL
- sasl.mechanism=PLAIN
- sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<source cluster API key>' password='<source cluster API secret>';
If using OAuth:
- same parameters as your consumers use to authenticate with OAuth on the source cluster
For a Confluent Platform or Kafka source cluster, verify that the cluster link principal used in your link configuration is using an authentication mechanism that is enabled on the source cluster.
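Before recreating a link to a Confluent Cloud source cluster with API keys, it can help to check the link configuration file for the three properties listed above. The following is a minimal sketch. The helper names `parse_properties` and `check_api_key_config` are hypothetical, and the parser handles only simple `key=value` lines, not the full Java properties syntax. It checks the shape of the configuration, not whether the API key and secret are valid.

```python
REQUIRED_API_KEY_PROPERTIES = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
}
JAAS_PREFIX = "org.apache.kafka.common.security.plain.PlainLoginModule required"


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only; the JAAS value contains further '=' signs.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_api_key_config(props: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key, expected in REQUIRED_API_KEY_PROPERTIES.items():
        actual = props.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected}, found {actual!r}")
    jaas = props.get("sasl.jaas.config", "")
    if not jaas.startswith(JAAS_PREFIX):
        problems.append("sasl.jaas.config should use PlainLoginModule")
    elif "username=" not in jaas or "password=" not in jaas or not jaas.endswith(";"):
        problems.append("sasl.jaas.config needs username, password, and a trailing ';'")
    return problems
```

Read the link configuration file into a string, pass it through `parse_properties`, and print whatever `check_api_key_config` returns.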
INVALID_BOOTSTRAP_INTERNAL_ENDPOINT_ERROR
Meaning
Your source cluster bootstrap server points to a private internal endpoint or port that Cluster Linking cannot access in Confluent Cloud.
Solutions
Verify that your bootstrap server uses port 9092.
TIMEOUT_ERROR
Meaning
The operation to create the cluster link timed out.
Solutions
Contact Confluent Support.
UNKNOWN
Meaning
An unexpected error has occurred.
Solutions
Contact Confluent Support.
Task error codes
The following error codes apply to cluster link and mirror topic commands and signal problems with link tasks or mirror transition tasks.
With the exception of INTERNAL_ERROR, these are all user-created errors. In other words, if the error code is INTERNAL_ERROR, Confluent receives an alert and works to address the issue. Otherwise, the error is user-created and requires customer action to resolve.
| Task Status | Description |
|---|---|
| | Error cause cannot be determined. |
| | System error caused by Confluent software. This type of error automatically alerts Confluent, and resolution is in progress. |
| | Authentication credentials are not properly configured. |
| | Authorization credentials are not properly configured. |
| | Authentication credentials on the broker are not properly configured (Confluent Platform). |
| | Authorization credentials on the broker are not properly configured (Confluent Platform). |
| | A misconfiguration is causing errors. |
| | The remote link was unexpectedly not found. |
| | The cluster link cannot be found. |
| | The consumer group is active on the destination, causing offsets to not be synced. |
| | No authorizer is configured on the source cluster. |
| | A topic exists on the destination unexpectedly. |
| | The topic transition violates a policy. |
| | Cluster link is not enabled. |
| | The ACL limit on the link has been exceeded. |
| | Remote mirror topic is not available. |
| | Either a topic or partition was unexpectedly not found. |
| | An InvalidTopicException was encountered from the destination cluster. This error would occur, for example, if the auto-create mirror task tries to create a topic on the destination cluster and the topic name is invalid. See the error message for more details. |
| | Some errors were suppressed because too many were encountered. |
| | An InvalidRequestException was encountered. See the error message for more details. |
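As stated at the start of this section, INTERNAL_ERROR is the only task error code that Confluent addresses, and every other code requires customer action. The following is a minimal sketch of that triage rule for scripts that collect task error codes. The function names and the input, a plain list of code strings, are assumptions for illustration and do not reflect the shape of any Confluent API response.

```python
def requires_customer_action(task_error_code: str) -> bool:
    """Return True for every task error code except INTERNAL_ERROR."""
    return task_error_code.strip().upper() != "INTERNAL_ERROR"


def group_by_owner(task_error_codes: list[str]) -> dict[str, list[str]]:
    """Split error codes into those Confluent handles and those the customer must resolve."""
    groups = {"confluent": [], "customer": []}
    for code in task_error_codes:
        owner = "customer" if requires_customer_action(code) else "confluent"
        groups[owner].append(code)
    return groups
```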
Troubleshoot mirror topics
On Confluent Cloud and Confluent Platform, running the mirror describe command on a failed mirror topic returns the cause of the failure. To remediate a failed mirror topic, you have the following choices:
Fail over or delete the mirror topic.
Contact Confluent Support to repair the failed mirror topic for a subset of the failures. Failure causes that can be repaired include the following:
UNSUPPORTED_MESSAGE_FORMAT: Source leader epoch went backwards, source topic may have been recreated.
RECORD_TOO_LARGE
TRUNCATION_BELOW_HIGH_WATERMARK: Truncation below high watermark. This can be caused by unclean source leader election or other errors, such as an inability to detect source topic recreation.
Mirror topics can be transitioned in various ways, such as with promote, failover, or reverse. Both Confluent Cloud and Confluent Platform provide metrics and APIs that can help you find solutions when things go wrong. To learn more, see the following topics:
On Confluent Cloud: Mirror topic state transition
On Confluent Platform: View mirror topic state transition errors
Troubleshoot listing cluster links
To list cluster links using the REST API on Confluent Platform, see GET /clusters/{cluster_id}/links in the Confluent Platform API reference.
To list cluster links using the REST API on Confluent Cloud, see GET /kafka/v3/clusters/{cluster_id}/links in the Confluent Cloud API reference.
To list cluster links using the Confluent CLI, see confluent kafka link list.
Note that querying the REST API endpoint on a source cluster for a destination-initiated cluster link returns JSON entries that contain only the link ID, with no cluster link name. This is expected behavior. The source cluster detects the inbound link but doesn’t own it. To get the full link details, including link_id and link_name, list links on the destination cluster, and then correlate them with the link ID on the source.
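The correlation step can be scripted once you have both listings. The following is a minimal sketch that assumes each listing has already been reduced to a list of dictionaries carrying the link_id and link_name fields mentioned above. The function name `name_source_links` is hypothetical, and the surrounding JSON envelope of the REST responses is not modeled here.

```python
def name_source_links(source_links: list[dict], destination_links: list[dict]) -> list[dict]:
    """Attach link names from the destination listing to link IDs seen on the source."""
    # Build a lookup from link ID to link name using the destination's full details.
    names_by_id = {
        link["link_id"]: link["link_name"]
        for link in destination_links
        if link.get("link_id") and link.get("link_name")
    }
    # Source entries may lack a name; fill it in when the destination knows the ID.
    return [
        {
            "link_id": link.get("link_id"),
            "link_name": link.get("link_name") or names_by_id.get(link.get("link_id")),
        }
        for link in source_links
    ]
```

Entries whose link_name remains `None` after correlation belong to links that were not found in the destination listing you supplied, for example links initiated from a different destination cluster.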