Request rate limits
Confluent Cloud has a limit on the maximum number of client requests allowed within a second. Client
requests include but are not limited to requests from a producer to send a batch, requests from a
consumer to commit an offset, or requests from a consumer to fetch messages. If request rate limits
are hit, requests may be refused and clients may be throttled to keep the cluster stable. When a
client is throttled, Confluent Cloud delays the client’s requests by
produce-throttle-time-avg (in ms) for producers or by
fetch-throttle-time-avg (in ms) for consumers.
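As a rough illustration of how these two metrics can be read, the sketch below takes a snapshot of client metrics and reports which side is being delayed. The dictionary shape and the helper name `throttled_roles` are assumptions made for this example; how you collect the metric values depends on your client and monitoring setup.

```python
# Metric names come from the text above; the role labels are illustrative.
THROTTLE_METRICS = {
    "produce-throttle-time-avg": "producer",
    "fetch-throttle-time-avg": "consumer",
}


def throttled_roles(metrics: dict[str, float]) -> dict[str, float]:
    """Return {role: average delay in ms} for each throttle metric above zero.

    `metrics` is a hypothetical snapshot mapping metric name to value.
    Missing or NaN values are treated as "not throttled".
    """
    delayed = {}
    for name, role in THROTTLE_METRICS.items():
        value = metrics.get(name, 0.0)
        if value > 0:  # a NaN comparison is False, so NaN is skipped
            delayed[role] = value
    return delayed


snapshot = {"produce-throttle-time-avg": 12.5, "fetch-throttle-time-avg": 0.0}
report = throttled_roles(snapshot)
```

A non-empty result suggests that the corresponding clients are currently being throttled.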
Confluent Cloud offers different cluster types, each with its own usage limits.
Open Grafana and use the username
Navigate to the Requests (rate) panel. If this panel is yellow, you have used 80% of your allowed requests; if it’s red, you have used 90%. See the Grafana documentation for more information about configuring thresholds.
Scroll lower down on the dashboard to see a breakdown of where the requests are going in the Request rate stacked column chart.
Reduce requests by adjusting producer batching configurations (linger.ms) and consumer batching configurations (fetch.max.wait.ms), and by shutting down unnecessary clients.
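To get a feel for how much these settings can matter, here is a back-of-the-envelope estimate in Python. It assumes a simplified model that is not taken from this page: each lightly loaded client sends about one request per broker every time its wait interval expires (linger.ms for a producer whose batches do not fill first, fetch.max.wait.ms for a consumer with little data to fetch). The function name and the fleet sizes are hypothetical, and real request rates also depend on throughput, batch sizes, and partition layout.

```python
def estimated_requests_per_second(interval_ms: float, clients: int, brokers: int) -> float:
    """Rough upper bound: one request per interval, per client, per broker.

    interval_ms stands in for linger.ms (producers) or fetch.max.wait.ms
    (consumers) under the simplified model described above.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive for this estimate")
    return (1000.0 / interval_ms) * clients * brokers


# Hypothetical fleet: 20 lightly loaded producers, each talking to 6 brokers.
before = estimated_requests_per_second(interval_ms=5, clients=20, brokers=6)
after = estimated_requests_per_second(interval_ms=50, clients=20, brokers=6)

print(f"interval 5 ms:  ~{before:.0f} requests/s")
print(f"interval 50 ms: ~{after:.0f} requests/s")
```

Under this model, raising the interval tenfold cuts the estimated request rate tenfold, at the cost of added latency. Halving the number of clients halves the estimate in the same way, which is why shutting down unnecessary clients also helps.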