Confluent Cloud Metrics API

Note

This feature is in preview. The API is not yet stable and breaking changes may occur. This feature is being rolled out to Confluent Cloud deployments in phases and may not be available in your cluster yet. The phased rollout is expected to complete in January 2020.

The Confluent Cloud Metrics API provides actionable operational metrics about your Confluent Cloud deployment. It is a queryable HTTP API: you POST a query written in JSON and receive back a time series of the metrics specified by the query.

Metrics API Quick Start

Prerequisites

The following examples use HTTPie rather than cURL. HTTPie can be installed with most common package managers; see the HTTPie documentation for installation instructions.
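For example, on systems with pip or Homebrew available, installation is typically a one-liner (the package names shown are the standard ones; adjust for your platform):

    # Install HTTPie with pip (Python)
    pip install httpie

    # or with Homebrew on macOS
    brew install httpie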

Use your Confluent Cloud username and password to authenticate your requests.
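If you plan to run several of the commands below, it can be convenient to keep your credentials in shell variables rather than typing them each time (the variable names here are arbitrary):

    # Store Confluent Cloud credentials in shell variables (names are arbitrary)
    export CC_USER='<USER>'
    export CC_PASSWORD='<PASSWORD>'

    # Later commands can then pass: --auth "$CC_USER:$CC_PASSWORD"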

List the available metrics

Get a description of the available metrics by sending a GET request to the descriptors endpoint of the API:

http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/descriptors --auth '<USER>:<PASSWORD>'

This returns a JSON blob with details on the available metrics to query.
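If you prefer cURL, an equivalent request against the same endpoint looks like this (the -u flag supplies the same basic-auth credentials):

    # Equivalent cURL request to list the available metric descriptors
    curl -u '<USER>:<PASSWORD>' https://api.telemetry.confluent.cloud/v1/metrics/cloud/descriptors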

List the available topics for a given metric in a specified interval

  1. Create a file named attributes_query.json using the following template. Be sure to change lkc-XXXXX and the timestamp values to match your needs.

    {
        "filter": {
            "field": "metric.label.cluster_id",
            "op": "EQ",
            "value": "lkc-XXXXX"
        },
        "group_by": [
            "metric.label.topic"
        ],
        "intervals": [
            "2020-01-13T10:30:00-05:00/2020-01-13T11:00:00-05:00"
        ],
        "limit": 25,
        "metric": "io.confluent.kafka.server/sent_bytes/delta"
    }
    
  2. Submit the query as a POST using the following command. Be sure to change USER and PASSWORD to match your environment.

    http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/attributes --auth '<USER>:<PASSWORD>' < attributes_query.json
    

    Your output should resemble:

    Note

    Be aware that topics without timeseries values during the specified interval will not be returned.

    {
        "data": [
            {},
            {
                "metric.label.topic": "test-topic"
            }
        ],
        "meta": {
            "pagination": {
                "page_size": 25
            }
        }
    }
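
    To pull just the topic names out of a response like the one above, you can, for example, pipe the result through jq, filtering out entries that have no topic label (this assumes jq is installed):

    # Extract only the topic names from the attributes response,
    # skipping entries without a topic label
    http https://api.telemetry.confluent.cloud/v1/metrics/cloud/attributes \
        --auth '<USER>:<PASSWORD>' < attributes_query.json \
        | jq '[.data[]["metric.label.topic"] | select(. != null)]'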
    

Query for bytes produced per minute grouped by topic

  1. Create a file named sent_bytes_query.json using the following template. Be sure to change lkc-XXXXX and the timestamp values to match your needs.

    {
        "aggregations": [
            {
                "agg": "SUM",
                "metric": "io.confluent.kafka.server/sent_bytes/delta"
            }
        ],
        "filter": {
            "filters": [
                {
                    "field": "metric.label.cluster_id",
                    "op": "EQ",
                    "value": "lkc-XXXXX"
                }
            ],
            "op": "AND"
        },
        "granularity": "PT1M",
        "group_by": [
            "metric.label.topic"
        ],
        "intervals": [
            "2019-12-19T11:00:00-05:00/2019-12-19T11:05:00-05:00"
        ],
        "limit": 25
    }
    
  2. Submit the query as a POST using the following command. Be sure to change USER and PASSWORD to match your environment.

    http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/query --auth '<USER>:<PASSWORD>' < sent_bytes_query.json
    

    Your output should resemble:

    Note

    Be aware that if you have not produced data during the time window, the dataset will be empty for a given topic.

    {
         "data": [
             {
                 "timestamp": "2019-12-19T16:01:00Z",
                 "metric.label.topic": "test-topic",
                 "value": 0.0
             },
             {
                 "timestamp": "2019-12-19T16:02:00Z",
                 "metric.label.topic": "test-topic",
                 "value": 157.0
             },
             {
                 "timestamp": "2019-12-19T16:03:00Z",
                 "metric.label.topic": "test-topic",
                 "value": 371.0
             },
             {
                 "timestamp": "2019-12-19T16:04:00Z",
                 "metric.label.topic": "test-topic",
                 "value": 0.0
             }
         ]
     }
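
    Because each entry in data carries a per-minute value, you can, for example, total the bytes over the queried interval by piping the response through jq (assuming jq is installed):

    # Sum the per-minute values to get total bytes over the interval
    http https://api.telemetry.confluent.cloud/v1/metrics/cloud/query \
        --auth '<USER>:<PASSWORD>' < sent_bytes_query.json \
        | jq '[.data[].value] | add'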
    

Query for bytes consumed per minute grouped by topic

  1. Create a file named received_bytes_query.json using the following template. Be sure to change lkc-XXXXX and the timestamp values to match your needs.

    {
        "aggregations": [
            {
                "agg": "SUM",
                "metric": "io.confluent.kafka.server/received_bytes/delta"
            }
        ],
        "filter": {
            "filters": [
                {
                    "field": "metric.label.cluster_id",
                    "op": "EQ",
                    "value": "lkc-XXXXX"
                }
            ],
            "op": "AND"
        },
        "granularity": "PT1M",
        "group_by": [
            "metric.label.topic"
        ],
        "intervals": [
            "2019-12-19T11:00:00-05:00/2019-12-19T11:05:00-05:00"
        ],
        "limit": 25
    }
    
  2. Submit the query as a POST using the following command. Be sure to change USER and PASSWORD to match your environment.

    http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/query --auth '<USER>:<PASSWORD>' < received_bytes_query.json
    

    Your output should resemble:

    Note

    Be aware that if you have not produced data during the time window, the dataset will be empty for a given topic.

    {
        "data": [
            {
                "timestamp": "2019-12-19T16:00:00Z",
                "metric.label.topic": "test-topic",
                "value": 72.0
            },
            {
                "timestamp": "2019-12-19T16:01:00Z",
                "metric.label.topic": "test-topic",
                "value": 139.0
            },
            {
                "timestamp": "2019-12-19T16:02:00Z",
                "metric.label.topic": "test-topic",
                "value": 232.0
            },
            {
                "timestamp": "2019-12-19T16:03:00Z",
                "metric.label.topic": "test-topic",
                "value": 0.0
            },
            {
                "timestamp": "2019-12-19T16:04:00Z",
                "metric.label.topic": "test-topic",
                "value": 0.0
            }
        ]
    }
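
    To load the result into a spreadsheet or another tool, one option is to flatten the response to CSV with jq (the field names match the output shown above; jq must be installed):

    # Convert the query response into CSV rows: timestamp, topic, value
    http https://api.telemetry.confluent.cloud/v1/metrics/cloud/query \
        --auth '<USER>:<PASSWORD>' < received_bytes_query.json \
        | jq -r '.data[] | [.timestamp, .["metric.label.topic"], .value] | @csv'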
    

Query for max retained bytes per hour over 2 hours for a topic named test-topic

  1. Create a file named retained_bytes_query.json using the following template. Change lkc-XXXXX and the timestamp values to match your needs.

    {
        "aggregations": [
            {
                "agg": "SUM",
                "metric": "io.confluent.kafka.server/retained_bytes"
            }
        ],
        "filter": {
            "filters": [
                {
                     "field": "metric.label.topic",
                     "op": "EQ",
                     "value": "test-topic"
                },
                {
                    "field": "metric.label.cluster_id",
                    "op": "EQ",
                    "value": "lkc-XXXXX"
                }
            ],
            "op": "AND"
        },
        "granularity": "PT1M",
        "group_by": [
            "metric.label.topic"
        ],
        "intervals": [
            "2019-12-19T11:00:00-05:00/P0Y0M0DT2H0M0S"
        ],
        "limit": 25
    }
    
  2. Submit the query as a POST using the following command. Be sure to change USER and PASSWORD to match your environment.

    http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/query --auth '<USER>:<PASSWORD>' < retained_bytes_query.json
    

    Your output should resemble:

    {
        "data": [
            {
                "timestamp": "2019-12-19T16:00:00Z",
                "metric.label.topic": "test-topic",
                "value": 406561.0
            },
            {
                "timestamp": "2019-12-19T17:00:00Z",
                "metric.label.topic": "test-topic",
                "value": 406561.0
            }
        ]
    }
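
    Since retained_bytes reflects the maximum over each time slice (see the FAQ below), you can, for example, pick out the peak value across the returned slices with jq:

    # Report the largest retained_bytes value across the returned time slices
    http https://api.telemetry.confluent.cloud/v1/metrics/cloud/query \
        --auth '<USER>:<PASSWORD>' < retained_bytes_query.json \
        | jq '[.data[].value] | max'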
    

Query for max retained bytes per hour over 2 hours for a cluster lkc-XXXXX

  1. Create a file named cluster_retained_bytes_query.json using the following template. Be sure to change lkc-XXXXX and the timestamp values to match your needs.

    {
        "aggregations": [
            {
                "agg": "SUM",
                "metric": "io.confluent.kafka.server/retained_bytes"
            }
        ],
        "filter": {
            "filters": [
                {
                    "field": "metric.label.cluster_id",
                    "op": "EQ",
                    "value": "lkc-xr36q"
                }
            ],
            "op": "AND"
        },
        "granularity": "PT1H",
        "group_by": [
            "metric.label.cluster_id"
        ],
        "intervals": [
            "2019-12-19T11:00:00-05:00/P0Y0M0DT2H0M0S"
        ],
        "limit": 5
    }
    
  2. Submit the query as a POST using the following command. Be sure to change USER and PASSWORD to match your environment.

    http -v https://api.telemetry.confluent.cloud/v1/metrics/cloud/query --auth '<USER>:<PASSWORD>' < cluster_retained_bytes_query.json
    

    Your output should resemble:

    {
        "data": [
            {
                "timestamp": "2019-12-19T16:00:00Z",
                "value": 507350.0
            },
            {
                "timestamp": "2019-12-19T17:00:00Z",
                "value": 507350.0
            }
        ]
    }
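
    The values are reported in bytes. As a convenience, you can, for example, convert the most recent data point to mebibytes with jq:

    # Take the last data point and convert its value from bytes to MiB
    http https://api.telemetry.confluent.cloud/v1/metrics/cloud/query \
        --auth '<USER>:<PASSWORD>' < cluster_retained_bytes_query.json \
        | jq '(.data | last).value / 1048576'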
    

FAQ

Why am I seeing empty data sets for topics that exist on queries other than for retained_bytes?

If there are only values of 0.0 in the queried time range, then the API returns an empty set. When there is non-zero data within the time range, time slices with values of 0.0 are returned.

Why didn’t retained_bytes decrease after I changed the retention policy for my topic?

The value of retained_bytes is the maximum over the time range returned. If data has been deleted during the current timeslice, you will not see the effect until the next time range window begins.

What are the supported granularity levels?

The following table shows the granularity levels in the Granularity enumeration.

Reporting interval    Symbol
1 minute              PT1M
5 minutes             PT5M
15 minutes            PT15M
30 minutes            PT30M
1 hour                PT1H
All granularities     ALL
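
For example, to switch the sent_bytes query above from one-minute to five-minute buckets, change only the granularity line in sent_bytes_query.json:

    "granularity": "PT5M",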

Why don’t I see consumer lag in the Metrics API?

In Kafka, consumer lag is not tracked as a metric on the server side. This is because it is a cluster-level construct, while today Kafka's metrics are derived from instrumentation at a lower level of abstraction. Consumer lag may be added to the Metrics API at a later date. In the meantime, there are several other ways to monitor consumer lag, including client metrics, the UI, the CLI, and the Admin API, all of which are available when using Confluent Cloud.
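
As one example of the CLI option, the kafka-consumer-groups tool that ships with Apache Kafka reports lag per partition for a consumer group. Pointing it at a Confluent Cloud cluster requires a client configuration file with the SASL credentials for the cluster; the file name and group ID below are placeholders:

    # Describe a consumer group, including the per-partition LAG column.
    # ccloud-client.properties must contain the SASL/SSL settings and
    # credentials for your Confluent Cloud cluster.
    kafka-consumer-groups --bootstrap-server <BOOTSTRAP_SERVER> \
        --command-config ccloud-client.properties \
        --describe --group <CONSUMER_GROUP_ID>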