Request rate limits
Confluent Cloud has a limit on the maximum number of client requests allowed within a second. Client
requests include but are not limited to requests from a producer to send a batch, requests from a
consumer to commit an offset, or requests from a consumer to fetch messages. If request rate limits
are hit, requests may be refused and clients may be throttled to keep the cluster stable. When a
client is throttled, Confluent Cloud delays the client’s requests. The delay is reported by
produce-throttle-time-avg (in ms) for producers and by
fetch-throttle-time-avg (in ms) for consumers.
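As a rough illustration of how these two metrics might be read, the sketch below takes a snapshot of client metrics as a plain dictionary and returns the ones that show throttling. The function name throttled_metrics and the dictionary shape are assumptions made for this example; how the values are collected (JMX, a metrics reporter, and so on) depends on your client.

```python
import math

# Metric names taken from the text above; values are in milliseconds.
THROTTLE_METRICS = ("produce-throttle-time-avg", "fetch-throttle-time-avg")


def throttled_metrics(snapshot):
    """Return {metric: avg_ms} for throttle metrics with a positive value.

    `snapshot` is a hypothetical dict of metric name -> float.
    Missing or NaN values (no samples yet) count as "not throttled".
    """
    result = {}
    for name in THROTTLE_METRICS:
        value = snapshot.get(name)
        if value is None or math.isnan(value):
            continue
        if value > 0:
            result[name] = value
    return result
```

A non-empty result suggests the client was delayed during the metric window and is a candidate for the tuning described later in this section.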
Confluent Cloud offers different cluster types, each with its own usage limits. This demo assumes you are running on a “basic” or “standard” cluster; both have a request limit of 1500 per second.
Open Grafana and use the username
Navigate to the Requests (rate) panel. If this panel is yellow, you have used 80% of your allowed requests; if it’s red, you have used 90%. See Grafana documentation for more information about configuring thresholds.
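The threshold arithmetic behind the panel colors can be sketched as follows, assuming the 1500 requests per second limit and the 80% and 90% thresholds stated in this section. The function name panel_color and the "green" label for usage below the yellow threshold are assumptions made for this example, not taken from the dashboard definition.

```python
REQUEST_LIMIT_PER_SEC = 1500  # basic/standard cluster limit stated in this section


def panel_color(requests_per_sec, limit=REQUEST_LIMIT_PER_SEC):
    """Map a request rate to the panel color described in the text."""
    usage_pct = 100 * requests_per_sec / limit
    if usage_pct >= 90:
        return "red"
    if usage_pct >= 80:
        return "yellow"
    return "green"  # assumed label for usage below the yellow threshold
```

With the default limit, the panel would turn yellow at 1200 requests per second and red at 1350.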
Scroll further down the dashboard to see a breakdown of where the requests are going in the
Request rate stacked column chart.
Reduce requests by adjusting producer batching configurations (linger.ms) and consumer batching configurations (fetch.max.wait.ms), and by shutting down unnecessary clients.
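To get a feel for why larger wait settings reduce request volume, here is a back-of-the-envelope sketch. It assumes a deliberately simplified model in which a lightly loaded client sends roughly one request per wait interval on each broker connection; real clients also send as soon as a batch fills or enough data is available, so actual rates can be higher. The function name approx_requests_per_sec and the sample values are hypothetical.

```python
def approx_requests_per_sec(wait_ms, connections=1):
    """Estimate requests/sec if each connection sends one request per wait_ms.

    Simplified model for lightly loaded clients only; `wait_ms` stands in
    for linger.ms (producer) or fetch.max.wait.ms (consumer).
    """
    if wait_ms <= 0:
        raise ValueError("wait_ms must be positive")
    return connections * 1000.0 / wait_ms


# Illustrative values only, not recommendations.
before = approx_requests_per_sec(10)   # 100.0 requests/sec
after = approx_requests_per_sec(100)   # 10.0 requests/sec
```

Under this model, raising the wait from 10 ms to 100 ms cuts the estimated request rate by a factor of ten, at the cost of added latency.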