Request rate limits
Confluent Cloud has a limit on the maximum number of client requests allowed within a second. Client
requests include but are not limited to requests from a producer to send a batch, requests from a
consumer to commit an offset, or requests from a consumer to fetch messages. If request rate limits
are hit, requests may be refused and clients may be throttled to keep the cluster stable. When a
client is throttled, Confluent Cloud delays the client’s requests. The average delay is reported by
the produce-throttle-time-avg metric (in ms) for producers or the fetch-throttle-time-avg metric
(in ms) for consumers.
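As a rough illustration of how these two metrics can be read, the following sketch checks a snapshot of client metrics for throttling. The metric names are the ones mentioned above; the snapshot dictionary, the helper names, and the choice to treat a missing or NaN value as "not throttled" are assumptions made for this example, not part of any client API.

```python
import math
from typing import Mapping

# Metric names as given in the text; which one applies depends on the client type.
THROTTLE_METRICS = {
    "producer": "produce-throttle-time-avg",
    "consumer": "fetch-throttle-time-avg",
}


def throttle_delay_ms(client_type: str, metrics: Mapping[str, float]) -> float:
    """Return the average throttle delay in ms for a hypothetical metrics snapshot.

    A missing or NaN value is treated as 0.0 (assumption: no data means
    no throttling has been observed).
    """
    name = THROTTLE_METRICS[client_type]
    value = metrics.get(name, 0.0)
    return 0.0 if math.isnan(value) else value


def is_throttled(client_type: str, metrics: Mapping[str, float]) -> bool:
    """True when the snapshot shows a non-zero average throttle delay."""
    return throttle_delay_ms(client_type, metrics) > 0.0
```

For example, `is_throttled("producer", {"produce-throttle-time-avg": 12.5})` evaluates to `True`, while an empty snapshot evaluates to `False`.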
Confluent Cloud offers different cluster types, each with its own usage limits.
1. Open Grafana and use the username admin and password password to log in.
2. Navigate to the Confluent Cloud dashboard.
3. Check the Requests (rate) panel. If this panel is yellow, you have used 80% of your allowed requests; if it’s red, you have used 90%. See Grafana documentation for more information about configuring thresholds.
4. Scroll lower down on the dashboard to see a breakdown of where the requests are going in the Request rate stacked column chart.
5. Reduce requests by adjusting producer batching configurations (linger.ms), adjusting consumer batching configurations (fetch.max.wait.ms), and shutting down unnecessary clients.
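The panel thresholds described in the steps above (yellow at 80% of allowed requests, red at 90%) can be sketched as a small check. This is only an illustration: the function name, the "ok" label for usage below 80%, and the example limit of 1000 requests per second are assumptions; the actual limit depends on your cluster type.

```python
def request_panel_status(requests_per_sec: float, limit_per_sec: float) -> str:
    """Map request-rate usage to the panel states described in the text.

    Returns "red" at 90% or more of the limit, "yellow" at 80% or more,
    and "ok" otherwise ("ok" is a placeholder label, not a Grafana term).
    """
    if limit_per_sec <= 0:
        raise ValueError("limit_per_sec must be positive")
    usage = requests_per_sec / limit_per_sec
    if usage >= 0.9:
        return "red"
    if usage >= 0.8:
        return "yellow"
    return "ok"
```

With a hypothetical limit of 1000 requests per second, `request_panel_status(850, 1000)` returns `"yellow"` and `request_panel_status(950, 1000)` returns `"red"`.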