Confluent Cloud Features and Limits by Cluster Type

The Confluent Cloud cluster type determines the features, usage limits, and price of your cluster. Cluster types include:

  • Basic
  • Standard
  • Dedicated

Features for all clusters

All Confluent Cloud cluster types support the following features:

Feature                          Details
Kafka ACLs                       For details, see Access Control Lists (ACLs) for Confluent Cloud.
Encryption at rest               Confluent Cloud uses encrypted volumes for all data storage at rest.
Encryption in motion             Confluent Cloud requires SSL/TLS 1.2 for all connections (see the connection sketch below).
Exactly-once semantics           For details, see Processing Guarantees.
Fully managed replica placement  Confluent Cloud manages replica placement, ensuring that replicas are spread across different brokers. For multi-zone clusters, Confluent Cloud also ensures that replicas are spread across different Availability Zones.
Key-based compacted storage      For details, see Log Compaction.
UI: Consumer lag                 For details, see Monitor Consumer Lag.
UI: Topic management             For details, see Managing Topics in Confluent Cloud.
Uptime SLA                       For details, see the Confluent Cloud Service Level Agreement.

For specific broker settings, see Confluent Cloud Cluster and Topic Configuration Settings. For resource limits that apply to organizations, environments, clusters, and accounts, see Service Quotas for Confluent Cloud.
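
For illustration, the following minimal sketch shows a client connecting with these requirements in place: TLS-encrypted transport and API-key authentication. It assumes the Python confluent-kafka client; the bootstrap server, topic name, and credentials are placeholders.

  # Minimal connection sketch (Python, confluent-kafka). The bootstrap
  # server, topic name, and API key/secret are placeholders.
  from confluent_kafka import Producer

  producer = Producer({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",
      "security.protocol": "SASL_SSL",  # TLS-encrypted connection
      "sasl.mechanisms": "PLAIN",       # API key/secret authentication
      "sasl.username": "<API_KEY>",
      "sasl.password": "<API_SECRET>",
  })

  producer.produce("<TOPIC>", key="example-key", value="example-value")
  producer.flush()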

Basic clusters

Basic clusters are designed for development use cases. Basic clusters support the following:

  • 99.5% uptime SLA
  • Up to 100 MBps of throughput and 5 TB of storage
  • No base cluster price; you pay only for ingress, egress, storage, and partitions
  • Upgrade to a Standard cluster at any time

Basic cluster limits per cluster

Dimension                          Capability
Ingress                            Max 100 MBps
Egress                             Max 100 MBps
Storage (pre-replication)          Max 5 TB
Partitions (pre-replication)       Max 2,048 [1]
Total client connections           Max 1,000
Connection attempts                Max 80 per second [2]
Requests                           Max ~15,000 per second
Message size                       Max 8 MB (to configure, see Edit a Topic)
Client version                     Minimum 0.11.0
Request size                       Max 100 MB
Fetch bytes                        Max 55 MB
API keys                           Max 20
Partition creation and deletion    Max 250 per 5-minute period [3]
Connector tasks per Kafka cluster  Max 250 (Note: 1 task per connector)
ACLs                               Max 1,000

Notes about cluster limits

  1. Partition limits - Confluent Cloud clusters limit the number of partitions per cluster. The partition limit is determined by the cluster type and size, and refers to the pre-replication number of partitions. All topics that you create, whether through the Cloud Console, the Confluent Cloud CLI, the Kafka Admin API, Confluent Control Center, Kafka Streams applications, or ksqlDB, count toward the cluster partition limit, as do internal topics automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center. Automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Cloud Console and do not count against partition limits or toward partition billing.

  2. Connection attempts - Connection attempts are the number of successful authentications plus unsuccessful authentication attempts. You can use the Metrics API to query for successful connection attempts, as shown in the sketch after these notes. For more information, see Confluent Cloud Metrics API.

  3. Partition creation and deletion - The following occurs when the partition creation and deletion rate limit is reached (a client-side sketch follows these notes):

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.
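
The following sketch illustrates the Metrics API query mentioned in note 2. It assumes the v2 cloud query endpoint and the successful_authentication_count metric from the Metrics API documentation; the cluster ID, time interval, and Cloud API credentials are placeholders, and you should verify the payload shape against the current API reference.

  # Hedged sketch: query the Metrics API for successful authentications.
  import requests

  query = {
      "aggregations": [
          {"metric": "io.confluent.kafka.server/successful_authentication_count"}
      ],
      "filter": {"field": "resource.kafka.id", "op": "EQ", "value": "lkc-xxxxx"},
      "granularity": "PT1M",
      "intervals": ["2023-01-01T00:00:00Z/PT1H"],
  }

  resp = requests.post(
      "https://api.telemetry.confluent.cloud/v2/metrics/cloud/query",
      json=query,
      auth=("<CLOUD_API_KEY>", "<CLOUD_API_SECRET>"),  # Cloud API key, not a cluster key
  )
  resp.raise_for_status()
  print(resp.json())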
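
To see how the rate limit in note 3 surfaces to a client, consider this sketch using the Python confluent-kafka AdminClient; the connection settings and topic name are placeholders, and the automatic retry described above happens inside the client library.

  # Sketch: create a topic and surface a THROTTLING_QUOTA_EXCEEDED failure.
  from confluent_kafka.admin import AdminClient, NewTopic

  admin = AdminClient({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",
      "security.protocol": "SASL_SSL",
      "sasl.mechanisms": "PLAIN",
      "sasl.username": "<API_KEY>",
      "sasl.password": "<API_SECRET>",
  })

  futures = admin.create_topics([NewTopic("orders", num_partitions=6)])
  for topic, future in futures.items():
      try:
          future.result()  # returns None on success, raises on failure
          print(f"created {topic}")
      except Exception as err:
          # Against brokers at Kafka 2.7 or later, exceeding the creation
          # quota is reported here after the client exhausts its automatic
          # retries (or immediately, if retries are disabled).
          print(f"failed to create {topic}: {err}")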

Basic cluster limits per partition

Dimension              Capability
Ingress per Partition  ~5 MBps
Egress per Partition   ~15 MBps
Storage per Partition  Max 250 GB

Standard clusters

Standard clusters provide production-ready features and functionality. Standard clusters support the following:

  • 99.95% uptime SLA
  • Up to 100 MBps of throughput; infinite storage on AWS and GCP, and up to 5 TB on Azure
  • Multi-zone high availability (optional)
  • An hourly base price in addition to charges for ingress, egress, storage, and partitions

Standard cluster limits per cluster

Dimension                          Capability
Ingress                            Max 100 MBps
Egress                             Max 100 MBps
Storage (pre-replication)          AWS and GCP: Infinite [1]; Azure: Max 5 TB
Partitions (pre-replication)       Max 2,048 [2]
Total client connections           Max 1,000
Connection attempts                Max 80 per second [3]
Requests                           Max ~15,000 per second
Message size                       Max 8 MB (to configure, see Edit a Topic)
Client version                     Minimum 0.11.0
Request size                       Max 100 MB
Fetch bytes                        Max 55 MB
API keys                           Max 20
Partition creation and deletion    Max 500 per 5-minute period [4]
Connector tasks per Kafka cluster  Max 250
ACLs                               Max 1,000

Notes about cluster limits

  1. Storage (pre-replication) - Some Confluent Cloud cluster types and cloud providers support infinite storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster.

    There is no price difference for clusters with infinite storage.

    The storage policy settings retention.bytes and retention.ms are still configurable at the topic level, so you can control exactly how much data to retain, and for how long, in a way that makes sense for your applications and keeps your costs under control.

  2. Partition limits - Confluent Cloud clusters limit the number of partitions per cluster. The partition limit is determined by the cluster type and size, and refers to the pre-replication number of partitions. All topics that you create, whether through the Cloud Console, the Confluent Cloud CLI, the Kafka Admin API, Confluent Control Center, Kafka Streams applications, or ksqlDB, count toward the cluster partition limit, as do internal topics automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center. Automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Cloud Console and do not count against partition limits or toward partition billing.

  3. Connection attempts - Connection attempts are equal to the number of successful authentications plus unsuccessful authentication attempts. You can use the Metrics API to query for successful connection attempts. For more information, see Confluent Cloud Metrics API.

  4. Partition creation and deletion - The following occurs when the partition creation and deletion rate limit is reached:

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.

Standard cluster limits per partition

Dimension              Capability
Ingress per Partition  ~5 MBps
Egress per Partition   ~15 MBps
Storage per Partition  Max 250 GB

Dedicated clusters

Dedicated clusters are designed for critical production workloads with high traffic or private networking requirements. Dedicated clusters support the following:

  • Single-tenant deployments with a 99.95% uptime SLA
  • Private networking options, including VPC peering, AWS Transit Gateway, AWS PrivateLink, and Azure PrivateLink
  • Multi-zone high availability (optional)
  • Scaling to gigabytes per second of ingress
  • Simple scaling in terms of CKUs
  • Cluster expansion

Dedicated clusters are provisioned and billed in terms of Confluent Units for Kafka (CKUs). CKUs are a unit of horizontal scalability in Confluent Cloud that provide a pre-allocated amount of resources. How much you can ingest and stream per CKU depends on a variety of factors, including client application design and partitioning strategy.

Dedicated cluster sizing

Dedicated clusters can be purchased in any whole number of CKUs up to a limit. For organizations with credit card billing, the upper limit is 4 CKUs per dedicated cluster. For organizations with integrated cloud provider billing or invoice payment, the upper limit is 24 CKUs per dedicated cluster. To provision a cluster with more CKUs than the upper limit, contact Confluent Support.

Single-zone clusters can have 1 or more CKUs, whereas multi-zone clusters require a minimum of 2 CKUs and provide high availability. Zone availability cannot be changed after the cluster is created.

Dedicated cluster CKUs and limits

CKUs determine the capacity of your cluster. For a Confluent Cloud cluster, the expected performance for any given workload is dependent on a variety of factors, such as message size, number of partitions, number of clients, ratio of ingress to egress, and the number of CKUs provisioned for the cluster.

Use the following per-CKU limits to determine the minimum number of CKUs for a given workload. Values for storage and partition limits are pre-replication.

Important

The performance of your clients depends on the behavior of your workload. In general, you cannot simultaneously max out every dimension of your workload and achieve the maximum CKU throughput. For example, if you reach the partition limit, you are unlikely to reach the maximum CKU throughput.

See Benchmark Your Dedicated Apache Kafka Cluster on Confluent Cloud for detailed benchmarking instructions and results for a 2 CKU cluster.

Ingress: ~50 megabytes per second (MBps) per CKU

Number of bytes that can be produced to the cluster in one second.

Use this value as a guideline for capacity planning. The ability to fully utilize this dimension depends on the workload and utilization of other dimensions. Additional details, including the maximum bandwidth you can achieve on each cloud provider (AWS, GCP, and Azure), are available in Benchmark Your Dedicated Apache Kafka Cluster on Confluent Cloud.

Available in the Confluent Cloud Metrics API as received_bytes (convert from bytes to MB).

If you exceed this value, the producers may be throttled to ensure the cluster remains available, which will register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the producer outgoing-byte-rate metrics and broker kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec metrics to understand your throughput.

To reduce usage on this dimension, you can compress your messages. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.
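
As a sketch of the compression recommendation, a producer can enable lz4 in its configuration. This assumes the Python confluent-kafka client; connection settings are placeholders.

  # Sketch: enable lz4 compression on the producer to reduce ingress usage.
  from confluent_kafka import Producer

  producer = Producer({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",  # plus the usual auth settings
      "compression.type": "lz4",  # recommended; avoid gzip (high cluster overhead)
  })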

Egress: ~150 megabytes per second (MBps) per CKU

Number of bytes that can be consumed from the cluster in one second.

Use this value as a guideline for capacity planning. The ability to fully utilize this dimension depends on the workload and utilization of other dimensions. Additional details, including the maximum bandwidth you can achieve on each cloud provider (AWS, GCP, and Azure), are available in Benchmark Your Dedicated Apache Kafka Cluster on Confluent Cloud.

Available in the Confluent Cloud Metrics API as sent_bytes (convert from bytes to MB).

If you exceed this value, the consumers may be throttled to ensure the cluster remains available, which will register as non-zero values for the consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the consumer incoming-byte-rate metrics and broker kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec to understand your throughput.

To reduce usage on this dimension, you can compress your messages and ensure each consumer is only consuming from the topics it requires. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.
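
As a sketch of scoping consumption, subscribe each consumer only to the topics it needs rather than a broad pattern. This assumes the Python confluent-kafka client; connection settings, group ID, and topic names are placeholders.

  # Sketch: subscribe only to the topics this service actually needs.
  from confluent_kafka import Consumer

  consumer = Consumer({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",  # plus the usual auth settings
      "group.id": "orders-service",
  })

  consumer.subscribe(["orders"])  # rather than a catch-all pattern like "^.*"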

Storage (pre-replication): AWS and GCP: Infinite; Azure: 10 terabytes (TB) per CKU

Maximum bytes that can be stored on the cluster at one time, measured before replication.

Dedicated clusters on AWS and GCP have infinite storage [1].

Available in the Confluent Cloud Metrics API as retained_bytes (convert from bytes to TB). The API value is post-replication, so divide by the replication factor of three to get pre-replication storage usage.

If you exceed the maximum, the producers will be throttled to prevent additional writes, which will register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at how much disk space your cluster is using to understand your storage needs.

To reduce usage on this dimension, you can compress your messages and reduce your retention settings. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.

Partitions (pre-replication): 4,500 partitions per CKU

Maximum number of partitions that can exist on the cluster at one time, before replication.

Available in the Confluent Cloud Metrics API as partition_count.

Attempts to create additional partitions beyond this limit will fail with an error message.

If you are self-managing Kafka, you can look at the broker kafka.controller:type=KafkaController,name=GlobalPartitionCount metric to understand your partition usage.

To reduce usage on this dimension, you can delete unused topics and create new topics with fewer partitions. You can use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low.

Partition limits - Confluent Cloud clusters limit the number of partitions per cluster. The partition limit is determined by the cluster type and size, and refers to the pre-replication number of partitions. All topics that you create, whether through the Cloud Console, the Confluent Cloud CLI, the Kafka Admin API, Confluent Control Center, Kafka Streams applications, or ksqlDB, count toward the cluster partition limit, as do internal topics automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center. Automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Cloud Console and do not count against partition limits or toward partition billing.
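
A sketch of increasing an existing topic's partition count through the Admin interface, as mentioned above. It assumes the Python confluent-kafka AdminClient; the topic name and target count are placeholders.

  # Sketch: grow an existing topic to 12 partitions in total.
  # Partition counts can only be increased, never decreased.
  from confluent_kafka.admin import AdminClient, NewPartitions

  admin = AdminClient({"bootstrap.servers": "<BOOTSTRAP_SERVER>"})  # plus auth settings

  futures = admin.create_partitions([NewPartitions("orders", 12)])
  futures["orders"].result()  # returns None on success, raises on failure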

Total client connections: ~3,000 connections per CKU

Number of TCP connections to the cluster that can be open at one time.

Available in the Confluent Cloud Metrics API as active_connection_count.

Use this value as a guideline for capacity planning. The ability to fully utilize this dimension depends on the workload and utilization of other dimensions. If you exceed this value, new client connections may be refused to ensure the cluster remains available. Producer and consumer clients may also be throttled to keep the cluster stable. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics to understand how many connections you are using.

This value can vary widely based on several factors, including number of producer clients, number of consumer clients, number of CKUs provisioned for the cluster, partition keying strategy, produce patterns per client, and consume patterns per client.

To reduce usage on this dimension, you can reduce the total number of clients connecting to the cluster.

Connection attempts: 250 connection attempts per second per CKU

Maximum number of new TCP connections to the cluster that can be created in one second. This includes both successful and unsuccessful authentication attempts.

If you exceed the maximum, connection attempts may be refused. Producer and consumer clients may also be throttled to ensure the cluster remains available. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the rate of change for broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics and client connection-creation-rate metrics to understand how many new connections you are creating.

To reduce usage on this dimension, you can use longer-lived connections to the cluster.
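
A sketch of the longer-lived connection pattern: construct one client and reuse it, instead of creating a client (and therefore new connections and authentication attempts) per message. The Python confluent-kafka client and placeholder settings are assumed.

  # Pattern: one long-lived producer reused for every send.
  # Anti-pattern: constructing a new Producer per message, which opens new
  # TCP connections and counts against the connection-attempt limit.
  from confluent_kafka import Producer

  producer = Producer({"bootstrap.servers": "<BOOTSTRAP_SERVER>"})  # plus auth settings

  def send(topic: str, value: str) -> None:
      producer.produce(topic, value=value)  # reuses existing connections
      producer.poll(0)  # serve delivery callbacks without blocking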

Requests: ~15,000 requests per second per CKU

Number of client requests to the cluster in one second.

Available in the Confluent Cloud Metrics API as request_count.

Use this value as a guideline for capacity planning. The ability to fully utilize this dimension depends on the workload and utilization of other dimensions. If you exceed this value, requests may be refused. Producer and consumer clients may also be throttled to ensure the cluster remains available. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the broker kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower} metrics and client request-rate metrics to understand your request volume.

To reduce usage on this dimension, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients.
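
A sketch of the batching levers mentioned above: producer-side linger and batch size, and consumer-side fetch accumulation. It assumes the Python confluent-kafka client; all values are illustrative, not recommendations.

  # Sketch: batch more aggressively to reduce the request rate.
  from confluent_kafka import Consumer, Producer

  # Producer: wait up to 50 ms to build larger batches before sending.
  producer = Producer({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",  # plus auth settings
      "linger.ms": 50,
      "batch.size": 131072,  # bytes per batch
  })

  # Consumer: let the broker accumulate at least 64 KB (or wait up to
  # 500 ms) per fetch so each request carries more data.
  consumer = Consumer({
      "bootstrap.servers": "<BOOTSTRAP_SERVER>",  # plus auth settings
      "group.id": "batching-example",
      "fetch.min.bytes": 65536,
      "fetch.wait.max.ms": 500,
  })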

Notes about cluster limits

  1. Storage (pre-replication) - Some Confluent Cloud cluster types and cloud providers support infinite storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster.

    There is no price difference for clusters with infinite storage.

    The storage policy settings retention.bytes and retention.ms are still configurable at the topic level, so you can control exactly how much data to retain, and for how long, in a way that makes sense for your applications and keeps your costs under control. A sketch of setting these follows this note.
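
A sketch of setting topic-level retention, assuming the Python confluent-kafka AdminClient; the topic name and retention values are placeholders.

  # Sketch: cap a topic at roughly 7 days / 1 TB of retained data.
  from confluent_kafka.admin import AdminClient, ConfigResource

  admin = AdminClient({"bootstrap.servers": "<BOOTSTRAP_SERVER>"})  # plus auth settings

  resource = ConfigResource(
      ConfigResource.Type.TOPIC,
      "orders",
      set_config={
          "retention.ms": "604800000",         # 7 days
          "retention.bytes": "1000000000000",  # ~1 TB, illustrative
      },
  )

  futures = admin.alter_configs([resource])
  futures[resource].result()  # returns None on success, raises on failure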

Dedicated cluster limits per cluster

You can add CKUs to a dedicated cluster to meet the capacity needs of your high-traffic workloads. However, the limits shown in this table do not change as you increase the number of CKUs. For more information, see Service Quotas for Confluent Cloud and Custom topic and cluster settings for Dedicated clusters.

Dimension                                               Capability
Message size                                            Max 20 MB
Client version                                          Minimum 0.11.0
Request size                                            Max 100 MB
Fetch bytes                                             Max 55 MB
Partitions                                              Depends on the number of CKUs; absolute max 100,000
API keys                                                Max 1,000
Partition creation and deletion                         Max 5,000 per 5-minute period [1]
Connector tasks per Kafka cluster                       Max 250
ACLs                                                    Max 10,000
VPC peering connections                                 Max 25
AWS Accounts requesting PrivateLink connections         Max 10 accounts (no limit on the number of connections)
Azure Subscriptions requesting PrivateLink connections  Max 10 subscriptions (no limit on the number of connections)

Notes about cluster limits

  1. Partition creation and deletion - The following occurs when the partition creation and deletion rate limit is reached:

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.

Dedicated cluster limits per partition

You can add CKUs to a dedicated cluster to meet the capacity needs of your high-traffic workloads. However, the limits shown in this table do not change as you increase the number of CKUs.

Dimension              Capability
Ingress per Partition  Max 10 MBps (aggregate producer throughput)
Egress per Partition   Max 30 MBps (aggregate consumer throughput)
Storage per Partition  Max 250 GB

Dedicated cluster provisioning and resizing time

Provisioning time

The estimated time to initially provision a dedicated cluster depends on the cluster’s size and your choice of cloud provider. Note that provisioning time is excluded from the Confluent SLA.

  • AWS: the expected provisioning time is one hour per CKU. A dedicated 2 CKU cluster on AWS is estimated to complete provisioning in about two hours. Sometimes, however, provisioning may take 24 hours or more. Contact Confluent Support if provisioning takes longer than 24 hours.
  • GCP: the expected provisioning time is one hour per CKU. A dedicated 2 CKU cluster on GCP is estimated to complete provisioning in about two hours.
  • Azure: the expected provisioning time is two and a half hours per CKU. A dedicated 2 CKU cluster on Azure is estimated to complete provisioning in about five hours.

You will receive an email when provisioning is complete.

Resizing time

The time required to resize a dedicated cluster depends on the cluster’s size, the underlying cloud provider, and the current workload on the cluster.

When a cluster is not heavily loaded, expansion times will generally be similar to expected provisioning times (single-digit hours). When a cluster is under heavy load, expansion can take longer, up to a few days to complete resizing and data rebalancing.

When shrinking a cluster by 1 CKU, you can expect a shrink time of 1-2 hours.

Regardless of the load, when a cluster has infinite storage enabled, the resize should generally complete more quickly.

During a resize operation, your applications may see leader elections, but otherwise performance will not suffer. Supported Kafka clients will gracefully handle these changes. You will receive an email when the resize operation is complete.