Request rate limits
Confluent Cloud has a limit on the maximum number of client requests allowed within a second. Client
requests include but are not limited to requests from a producer to send a batch, requests from a
consumer to commit an offset, or requests from a consumer to fetch messages. If request rate limits
are hit, requests may be refused and clients may be throttled to keep the cluster stable. When a
client is throttled, Confluent Cloud delays the client’s requests for produce-throttle-time-avg
(in ms) for producers or fetch-throttle-time-avg (in ms) for consumers.
Confluent Cloud offers different cluster types, each with its own usage limits. This demo assumes
you are running on a “basic” or “standard” cluster; both have a request limit of 1500 per second.
Open Grafana and log in with the username admin and the password password.
Navigate to the Confluent Cloud dashboard.
Check the Requests (rate) panel. If this panel is yellow, you have used 80% of your allowed requests; if it’s red, you have used 90%.
See the Grafana documentation for more information about configuring thresholds.
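The panel thresholds can be reproduced with a little arithmetic. The following is a minimal sketch, assuming the 1500 requests per second limit stated above; the function name panel_color and the "green" label for the below-threshold state are illustrative assumptions, not part of the dashboard definition.

```python
# Limit stated above for basic and standard clusters.
REQUEST_LIMIT_PER_SEC = 1500


def panel_color(requests_per_sec, limit=REQUEST_LIMIT_PER_SEC):
    """Return the color the Requests (rate) panel would show.

    Hypothetical helper: yellow at 80% of the limit, red at 90%.
    "green" is an assumed label for anything below 80%.
    """
    usage = requests_per_sec / limit
    if usage >= 0.90:
        return "red"
    if usage >= 0.80:
        return "yellow"
    return "green"


if __name__ == "__main__":
    for rate in (1000, 1250, 1400):
        print(rate, panel_color(rate))
```

With the default limit, the panel turns yellow at 1200 requests per second and red at 1350.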
Scroll farther down the dashboard to see a breakdown of where the requests are going in the
Request rate stacked column chart.
Reduce requests by adjusting producer batching configurations and consumer batching
configurations (fetch.max.wait.ms), and by shutting down unnecessary clients.
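As a rough illustration of why batching settings matter, the sketch below lists some candidate overrides and estimates how the fetch request rate of idle consumers changes with fetch.max.wait.ms. The producer settings linger.ms and batch.size are standard Apache Kafka producer properties but are not named in this demo, so treat them as assumptions. All values are illustrative, and the helper idle_fetch_requests_per_sec is hypothetical. The estimate assumes each idle consumer sends about one fetch request per broker every fetch.max.wait.ms; real traffic will differ. Property names follow the Java client; other clients may use different names.

```python
# Illustrative overrides only; the values are assumptions, not recommendations.
producer_overrides = {
    "linger.ms": 100,      # assumed: wait up to 100 ms so batches can fill
    "batch.size": 131072,  # assumed: allow batches of up to 128 KiB
}
consumer_overrides = {
    "fetch.max.wait.ms": 1000,  # let the broker hold an empty fetch for up to 1 s
}


def idle_fetch_requests_per_sec(fetch_max_wait_ms, consumers, brokers_per_consumer=1):
    """Rough estimate of the fetch request rate when no new data is arriving.

    Assumes one fetch request per consumer per broker every fetch_max_wait_ms.
    """
    return consumers * brokers_per_consumer * 1000 / fetch_max_wait_ms


if __name__ == "__main__":
    before = idle_fetch_requests_per_sec(500, consumers=50)
    after = idle_fetch_requests_per_sec(
        consumer_overrides["fetch.max.wait.ms"], consumers=50
    )
    print(f"idle fetch requests/s: {before:.0f} -> {after:.0f}")
```

Under these assumptions, doubling fetch.max.wait.ms halves the idle fetch request rate, at the cost of higher worst-case latency for consumers waiting on new data.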