Metrics and Monitoring for Cluster Linking on Confluent Cloud
To monitor Cluster Linking on Confluent Cloud, use the Confluent Cloud Metrics API. Cluster Linking exposes metrics through this API that report the number of cluster links on a cluster, the number of mirror topics on a cluster, mirroring throughput, and mirroring lag.
Number of Active Cluster Links on a Cluster
- io.confluent.kafka.server/cluster_active_link_count
- Total number of active cluster links connected to the cluster. You can filter or group by the direction (source or destination) of the cluster link.
Labels
Label | Description |
---|---|
mode | Either source or destination. |
Example
Get the count of active cluster links on a cluster for the past 24 hours, grouped by the direction of the link.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_active_link_count"
    }
  ],
  "filter": {
    "field": "resource.kafka.id",
    "op": "EQ",
    "value": "lkc-XXXXX"
  },
  "granularity": "PT1H",
  "group_by": [
    "metric.mode"
  ],
  "intervals": [
    "now-24h/now"
  ],
  "limit": 25
}
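A query body like the one above is sent to the Metrics API as an HTTP POST. The following is a minimal sketch using only the Python standard library. It assumes the v2 query endpoint URL shown in the code and HTTP Basic authentication with a Cloud API key and secret; neither is stated on this page, so confirm both against the Metrics API reference. The `build_query_request` helper and the `API_KEY` / `API_SECRET` placeholders are hypothetical names.

```python
import base64
import json
import urllib.request

# Assumed endpoint for Metrics API v2 queries; verify against the API reference.
METRICS_QUERY_URL = "https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"


def build_query_request(query: dict, api_key: str, api_secret: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a Metrics API query body."""
    # Assumption: the API accepts HTTP Basic auth with a Cloud API key and secret.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        METRICS_QUERY_URL,
        data=json.dumps(query).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the example above: active links over 24 hours, grouped by direction.
active_link_query = {
    "aggregations": [{"metric": "io.confluent.kafka.server/cluster_active_link_count"}],
    "filter": {"field": "resource.kafka.id", "op": "EQ", "value": "lkc-XXXXX"},
    "granularity": "PT1H",
    "group_by": ["metric.mode"],
    "intervals": ["now-24h/now"],
    "limit": 25,
}

request = build_query_request(active_link_query, "API_KEY", "API_SECRET")
# To actually send it: json.load(urllib.request.urlopen(request))
```

Building the request separately from sending it makes it easy to inspect the body and headers before any network call is made.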
Number of Cluster Links on a Cluster
- io.confluent.kafka.server/cluster_link_count
- Total number of cluster links (in any state) connected to the cluster. You can filter or group by the direction (source or destination), link name, and link state of the cluster link.
Labels
Label | Description |
---|---|
mode | Either source or destination. |
link_state | unavailable, paused, or active |
link_name | Cluster link name |
Tip
For links in the unavailable state, use the CLI or REST API to get more detailed error information.
Example 1
Get the count of all cluster links on a cluster over the past five minutes, regardless of state, grouped by link state, link name, and direction.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_count"
    }
  ],
  "filter": {
    "op": "AND",
    "filters": [
      {
        "field": "resource.kafka.id",
        "op": "EQ",
        "value": "lkc-123"
      }
    ]
  },
  "granularity": "PT1M",
  "group_by": [
    "metric.link_state",
    "metric.link_name",
    "metric.mode"
  ],
  "intervals": [
    "PT5M/now"
  ]
}
Example 2
Get the count of unavailable links on a cluster.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_count"
    }
  ],
  "filter": {
    "op": "AND",
    "filters": [
      {
        "field": "resource.kafka.id",
        "op": "EQ",
        "value": "lkc-123"
      },
      {
        "field": "metric.link_state",
        "op": "EQ",
        "value": "unavailable"
      }
    ]
  },
  "granularity": "PT1M",
  "group_by": [
    "metric.link_state",
    "metric.link_name",
    "metric.mode"
  ],
  "intervals": [
    "PT5M/now"
  ]
}
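Once a query like Example 2 returns, the names of the affected links can be pulled out of the result so they can be looked up with the CLI or REST API. The sketch below assumes the response is a JSON object whose `data` array holds one flat row per timestamp and group, with the `group_by` fields as keys next to `timestamp` and `value`. That shape, the `unavailable_links` helper, and the sample rows are assumptions for illustration, not captured API output.

```python
def unavailable_links(response: dict) -> list:
    """Return the sorted names of links that reported a non-zero unavailable count."""
    names = {
        row["metric.link_name"]
        for row in response.get("data", [])
        if row.get("metric.link_state") == "unavailable" and row.get("value", 0) > 0
    }
    return sorted(names)


# Hypothetical response rows (assumed flat, one row per timestamp and group).
sample_response = {
    "data": [
        {"timestamp": "2021-08-14T07:00:00Z", "value": 1.0,
         "metric.link_state": "unavailable", "metric.link_name": "from-west",
         "metric.mode": "destination"},
        {"timestamp": "2021-08-14T07:01:00Z", "value": 1.0,
         "metric.link_state": "unavailable", "metric.link_name": "from-west",
         "metric.mode": "destination"},
        {"timestamp": "2021-08-14T07:01:00Z", "value": 0.0,
         "metric.link_state": "unavailable", "metric.link_name": "from-east",
         "metric.mode": "destination"},
    ]
}

print(unavailable_links(sample_response))  # ['from-west']
```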
Number of Mirror Topics on a Cluster
- io.confluent.kafka.server/cluster_link_mirror_topic_count
- The count of mirror topics on the cluster. You can filter or group by the name of the cluster link, or by the state of the mirror topic.
Labels
Label | Description |
---|---|
link_name | Name of the cluster link. |
link_mirror_topic_state | The state the mirror topic is in. |
Possible states for a mirror topic are as follows:
Mirror Topic State | Description |
---|---|
Mirror | Actively mirroring data. Corresponds to the ACTIVE state in the REST API. Known issue: also contains topics that are in the SOURCE_UNAVAILABLE state in the REST API. |
PausedMirror | A user has paused this mirror topic, and it is not mirroring data. Corresponds to the PAUSED state in the REST API. |
PendingStoppedMirror | A user has called promote on the mirror topic, and the promotion is in progress. Corresponds to the PENDING_STOPPED state in the REST API. |
StoppedMirror | A promote or failover command has completed, and this topic has changed from a mirror topic to a regular topic. Corresponds to the STOPPED state in the REST API. |
FailedMirror | The mirror topic has permanently failed, and will no longer mirror data. Corresponds to the FAILED state in the REST API. |
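When results from this metric are compared with mirror topic states read from the REST API, a small lookup table keeps the two vocabularies aligned. The sketch below transcribes the table above; the `rest_states_for` helper is a hypothetical name, and it also reflects the known issue that `Mirror` can include topics in the `SOURCE_UNAVAILABLE` state.

```python
# Metric label value -> REST API state, transcribed from the table above.
METRIC_STATE_TO_REST_STATE = {
    "Mirror": "ACTIVE",
    "PausedMirror": "PAUSED",
    "PendingStoppedMirror": "PENDING_STOPPED",
    "StoppedMirror": "STOPPED",
    "FailedMirror": "FAILED",
}


def rest_states_for(metric_state: str) -> list:
    """Return the REST API states that a metric label value may cover."""
    states = [METRIC_STATE_TO_REST_STATE[metric_state]]
    if metric_state == "Mirror":
        # Known issue from the table: Mirror also counts SOURCE_UNAVAILABLE topics.
        states.append("SOURCE_UNAVAILABLE")
    return states


print(rest_states_for("Mirror"))        # ['ACTIVE', 'SOURCE_UNAVAILABLE']
print(rest_states_for("FailedMirror"))  # ['FAILED']
```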
Example
Get the count of active mirror topics over the past hour, grouped by cluster link name.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_mirror_topic_count"
    }
  ],
  "filter": {
    "op": "AND",
    "filters": [
      {
        "field": "resource.kafka.id",
        "op": "EQ",
        "value": "lkc-52p82"
      },
      {
        "field": "metric.link_mirror_topic_state",
        "op": "EQ",
        "value": "Mirror"
      }
    ]
  },
  "granularity": "PT1M",
  "group_by": [
    "metric.link_name"
  ],
  "intervals": [
    "now-1h/now"
  ],
  "limit": 25
}
Mirroring Throughput
Source
- io.confluent.kafka.server/cluster_link_source_response_bytes
- Rate of mirroring throughput, in bytes per second, sent by the source.
Labels
None.
Destination
- io.confluent.kafka.server/cluster_link_destination_response_bytes
- Rate of mirroring throughput, in bytes per second, received by the destination. You can filter or group by cluster link name.
Labels
Label | Description |
---|---|
link_name | Name of the cluster link. |
Example
Get mirroring throughput on a destination cluster for the past hour, grouped by cluster link name.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_destination_response_bytes"
    }
  ],
  "filter": {
    "field": "resource.kafka.id",
    "op": "EQ",
    "value": "lkc-XXXXX"
  },
  "granularity": "PT1M",
  "group_by": [
    "metric.link_name"
  ],
  "intervals": [
    "now-1h/now"
  ],
  "limit": 25
}
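The request bodies on this page share one structure and differ only in the metric, filters, grouping, and time range. A small builder can make that explicit. This is a sketch under the assumption that a single condition is written as a plain field filter and several conditions are wrapped in an `AND` compound filter, as the examples here do; `build_query` and its parameters are hypothetical names, not part of any Confluent library.

```python
def build_query(metric, cluster_id, interval, granularity="PT1M",
                group_by=(), label_filters=None, limit=25):
    """Assemble a Metrics API query body shaped like the examples on this page."""
    cluster_filter = {"field": "resource.kafka.id", "op": "EQ", "value": cluster_id}
    extra = [
        {"field": f"metric.{label}", "op": "EQ", "value": value}
        for label, value in (label_filters or {}).items()
    ]
    # One condition: a plain field filter. Several: an AND compound filter.
    if extra:
        query_filter = {"op": "AND", "filters": [cluster_filter, *extra]}
    else:
        query_filter = cluster_filter
    return {
        "aggregations": [{"metric": metric}],
        "filter": query_filter,
        "granularity": granularity,
        "group_by": [f"metric.{label}" for label in group_by],
        "intervals": [interval],
        "limit": limit,
    }


# Reproduces the destination throughput example above.
throughput_query = build_query(
    "io.confluent.kafka.server/cluster_link_destination_response_bytes",
    "lkc-XXXXX",
    "now-1h/now",
    group_by=["link_name"],
)
```

Passing `label_filters={"link_name": "from-west"}` would narrow the same query to a single cluster link.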
Mirror Topics
- io.confluent.kafka.server/cluster_link_mirror_topic_bytes
- The number of bytes sent over each mirror topic on a destination cluster.
Labels
Label | Description |
---|---|
link_name | Name of the cluster link. |
topic | Name of the mirror topic. |
Example
Get the total number of bytes sent each day over the last week on a cluster link called from-west, grouped by mirror topic name.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_mirror_topic_bytes"
    }
  ],
  "filter": {
    "op": "AND",
    "filters": [
      {
        "field": "resource.kafka.id",
        "op": "EQ",
        "value": "lkc-odq3o"
      },
      {
        "field": "metric.link_name",
        "op": "EQ",
        "value": "from-west"
      }
    ]
  },
  "granularity": "P1D",
  "group_by": [
    "metric.topic"
  ],
  "intervals": [
    "now-7d/now"
  ],
  "limit": 25
}
Mirroring Lag
- io.confluent.kafka.server/cluster_link_mirror_topic_offset_lag
Mirroring lag indicates how far the destination is behind the source in terms of processing events. It is measured as the maximum number of messages lagging on any of the partitions of a mirror topic.
For example, given a mirror topic with three partitions: one partition lags 4 messages behind the source topic, another lags 24 messages behind, and the third lags 92 messages behind, the mirror topic’s lag is reported as 92.
Each mirror topic’s lag is measured once per minute. If your query’s granularity is coarser than one minute (PT1M), the API returns the maximum of the per-minute lag values within each interval.
If your query does not group by topic, it returns the maximum lag over all of the mirror topics that match the filter clause. For example, if your query filters on a specific link_name, it returns the maximum lag among all of that link’s mirror topics.
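The three roll-ups described above are all maximums, taken at different levels. The following sketch works through them with small numbers; the function names and the sample values other than 4, 24, and 92 are made up for illustration.

```python
def topic_lag(partition_lags):
    """Lag of one mirror topic: the largest lag on any of its partitions."""
    return max(partition_lags)


def bucket_lag(per_minute_lags):
    """Value for a bucket coarser than PT1M: the largest per-minute sample in it."""
    return max(per_minute_lags)


def ungrouped_lag(lag_by_topic):
    """Value when the query does not group by topic: the largest lag of any topic."""
    return max(lag_by_topic.values())


# The three-partition example from the text.
print(topic_lag([4, 24, 92]))  # 92

# Hypothetical per-minute samples for one topic inside a five-minute bucket.
print(bucket_lag([92, 40, 130, 75, 60]))  # 130

# Hypothetical per-topic lags for one link, queried without grouping by topic.
print(ungrouped_lag({"orders": 130, "clicks": 12}))  # 130
```

Because every level takes a maximum, a single badly lagging partition is visible even in a coarse, ungrouped query.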
Labels
Label | Description |
---|---|
link_name | Name of the cluster link. |
topic | Name of the mirror topic. |
Example
Get the maximum mirroring lag for each mirror topic on a destination cluster.
{
  "aggregations": [
    {
      "metric": "io.confluent.kafka.server/cluster_link_mirror_topic_offset_lag"
    }
  ],
  "filter": {
    "field": "resource.kafka.id",
    "op": "EQ",
    "value": "lkc-odq3o"
  },
  "granularity": "PT1M",
  "group_by": [
    "metric.topic"
  ],
  "intervals": [
    "2021-08-14T07:00:00Z/2021-08-14T08:00:00Z"
  ],
  "limit": 25
}
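A query at PT1M granularity over an hour yields up to 60 samples per topic, so it is often useful to reduce them to each topic's peak lag and when it occurred. The sketch below assumes flat response rows carrying `timestamp`, `value`, and `metric.topic` keys; that row shape, the `peak_lag_by_topic` helper, and the sample values are assumptions for illustration only.

```python
def peak_lag_by_topic(rows):
    """Map each topic to its largest lag sample and the timestamp of that sample."""
    peaks = {}
    for row in rows:
        topic = row["metric.topic"]
        # Keep the first occurrence of the largest value seen for this topic.
        if topic not in peaks or row["value"] > peaks[topic]["value"]:
            peaks[topic] = {"value": row["value"], "timestamp": row["timestamp"]}
    return peaks


# Hypothetical rows for two mirror topics.
sample_rows = [
    {"timestamp": "2021-08-14T07:00:00Z", "metric.topic": "orders", "value": 12.0},
    {"timestamp": "2021-08-14T07:01:00Z", "metric.topic": "orders", "value": 92.0},
    {"timestamp": "2021-08-14T07:02:00Z", "metric.topic": "orders", "value": 30.0},
    {"timestamp": "2021-08-14T07:00:00Z", "metric.topic": "clicks", "value": 0.0},
]

peaks = peak_lag_by_topic(sample_rows)
print(peaks["orders"])  # {'value': 92.0, 'timestamp': '2021-08-14T07:01:00Z'}
```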