Confluent Cloud Cluster Types

The Confluent Cloud cluster type determines the features, usage limits, and price of your cluster.

Features for all clusters

All Confluent Cloud cluster types support the following features:

Feature                       Details
Kafka ACLs                    For details, see Access Control Lists (ACLs) for Confluent Cloud.
Consumer lag UI               For details, see Monitor Consumer Lag.
Encryption at rest            Confluent Cloud uses encrypted volumes for all data storage at rest.
Encryption in motion          Confluent Cloud requires SSL/TLS 1.2 for connections.
Exactly Once Semantics        For details, see Processing Guarantees.
Key based compacted storage   For details, see Log Compaction.
Service accounts              For details, see Service Accounts for Confluent Cloud.
Support                       For details, see Compliance for Confluent Cloud.
Topic management UI           For details, see Managing Topics in Confluent Cloud.
Schema Registry               For details, see Supported Features for Confluent Cloud Schema Registry.

For specific broker settings, see Confluent Cloud Broker and Topic Configuration Settings.

Basic clusters

Basic clusters are designed for development use cases. Basic clusters support the following:

  • 99.5% uptime SLA.
  • Up to 100 MBps of throughput and 5 TB of storage.
  • You pay only for ingress, egress, and storage. There is no base cluster price.

Basic cluster limits per cluster

Dimension                                         Capability
Ingress                                           Max 100 MBps
Egress                                            Max 100 MBps
Storage (pre-replication)                         Max 5 TB
Partitions (pre-replication)                      Max 2048 [1]
Total client connections                          Max 1000
Connection attempts                               Max 10 per second
Requests                                          Max 1500 per second
Message size                                      Max 2 MB
Client version                                    Minimum 0.11.0
Request size                                      Max 100 MB
Fetch bytes                                       Max 55 MB
API keys                                          Max 20
Partition creation and deletion                   Max 250 per 5 minute period [2]
Networking options                                Internet endpoints
Multi-zone high availability                      Not available
Confluent Cloud Schema Registry schema versions   Max 1000
Tasks per Connector                               Max 1
ACLs                                              Max 1000

Notes:

  1. All topics, partitions, and replicas shown in the Confluent Cloud UI count toward the cluster partition limit. This includes topics that you create using the Confluent Cloud UI, Confluent Cloud CLI, Kafka AdminAPI, Confluent Control Center, Kafka Streams applications, and ksqlDB, as well as topics that are automatically created by Confluent Platform components, such as ksqlDB, Kafka Streams, Connect, and Control Center. The automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Confluent Cloud UI and do not count against partition limits or toward partition billing.

  2. What happens when the partition creation and deletion rate limit is reached:

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.
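One way to avoid hitting the partition creation and deletion quota is to spread a large change over several five-minute periods. The following is a minimal sketch of that idea, not a description of how the cluster measures the rate: the function name `plan_partition_batches` and the fixed-window model are assumptions made for illustration, and the default quota of 250 is the Basic value from the table above.

```python
BASIC_QUOTA = 250        # partition creates/deletes per period (Basic table above)
WINDOW_SECONDS = 5 * 60  # one five-minute period


def plan_partition_batches(total_changes, quota=BASIC_QUOTA, window=WINDOW_SECONDS):
    """Split partition changes into (start_offset_seconds, count) batches.

    Assumes a simple fixed-window model in which at most `quota` changes
    are submitted per window. The cluster's real accounting may differ.
    """
    if total_changes < 0:
        raise ValueError("total_changes must be non-negative")
    batches = []
    remaining = total_changes
    start = 0
    while remaining > 0:
        count = min(quota, remaining)
        batches.append((start, count))
        remaining -= count
        start += window
    return batches


# 600 partition creates on a Basic cluster need three periods.
print(plan_partition_batches(600))  # [(0, 250), (300, 250), (600, 100)]
```

Pass a different `quota` value to apply the same planning to another cluster type.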

Basic cluster limits per partition

Dimension               Capability
Ingress per Partition   Max 5 MBps
Egress per Partition    Max 15 MBps
Storage per Partition   Max 250 GB

Standard clusters

Standard clusters are designed for production-ready features and functionality. Standard clusters support the following:

  • 99.95% uptime SLA.
  • Up to 100 MBps of throughput and 5 TB of storage.
  • Multi-zone availability.
  • Unlimited Connect tasks per connector.
  • An hourly base price is charged in addition to ingress, egress, and storage.

Standard cluster limits per cluster

Dimension                                         Capability
Ingress                                           Max 100 MBps
Egress                                            Max 100 MBps
Storage (pre-replication)                         Max 5 TB
Partitions (pre-replication)                      Max 2048 [1]
Total client connections                          Max 1000
Connection attempts                               Max 10 per second
Requests                                          Max 1500 per second
Message size                                      Max 2 MB
Client version                                    Minimum 0.11.0
Request size                                      Max 100 MB
Fetch bytes                                       Max 55 MB
API keys                                          Max 20
Partition creation and deletion                   Max 500 per 5 minute period [2]
Networking options                                Internet endpoints
Multi-zone high availability                      Available
Confluent Cloud Schema Registry schema versions   Max 1000
Tasks per Connector                               Unlimited
ACLs                                              Max 1000

Notes:

  1. All topics, partitions, and replicas shown in the Confluent Cloud UI count toward the cluster partition limit. This includes topics that you create using the Confluent Cloud UI, Confluent Cloud CLI, Kafka AdminAPI, Confluent Control Center, Kafka Streams applications, and ksqlDB, as well as topics that are automatically created by Confluent Platform components, such as ksqlDB, Kafka Streams, Connect, and Control Center. The automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Confluent Cloud UI and do not count against partition limits or toward partition billing.

  2. What happens when the partition creation and deletion rate limit is reached:

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.

Standard cluster limits per partition

Dimension               Capability
Ingress per Partition   Max 5 MBps
Egress per Partition    Max 15 MBps
Storage per Partition   Max 250 GB
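Basic and Standard clusters share most of their per-cluster limits. They differ in the partition change rate, multi-zone availability, and tasks per connector. As a rough sketch, a planned workload can be checked against the shared limits before choosing a cluster type. The workload keys used here (`ingress_mbps`, `tasks_per_connector`, and so on) are hypothetical names chosen for this example, and only a subset of the table rows is modeled.

```python
# Per-cluster limits shared by the Basic and Standard tables above.
SHARED_LIMITS = {
    "ingress_mbps": 100,
    "egress_mbps": 100,
    "storage_tb": 5,
    "partitions": 2048,
    "client_connections": 1000,
    "requests_per_sec": 1500,
    "message_size_mb": 2,
}


def exceeded_limits(workload, limits=SHARED_LIMITS):
    """Return the sorted names of dimensions where the workload exceeds a limit."""
    return sorted(
        name for name, limit in limits.items()
        if workload.get(name, 0) > limit
    )


def fits_basic(workload):
    # Basic also allows only one task per connector and has no multi-zone option.
    return (
        not exceeded_limits(workload)
        and workload.get("tasks_per_connector", 1) <= 1
        and not workload.get("multi_zone", False)
    )


def fits_standard(workload):
    return not exceeded_limits(workload)


workload = {"ingress_mbps": 40, "egress_mbps": 80, "tasks_per_connector": 4}
print(fits_basic(workload), fits_standard(workload))  # False True
```

A workload that fails both checks is a candidate for a Dedicated cluster.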

Dedicated clusters

Dedicated clusters are designed for critical production workloads with high traffic or private networking requirements. Dedicated clusters support the following:

  • Single-tenant deployments with a 99.95% uptime SLA.
  • Private networking options including VPC peering, AWS Transit Gateway, and AWS PrivateLink.
  • Multi-zone availability.
  • Unlimited Connect tasks per connector without any capacity limits.
  • Can be scaled to achieve gigabytes per second of ingress.
  • Simple scaling in terms of CKUs.
  • Cluster expansion.

Dedicated clusters are provisioned and billed in terms of Confluent Units for Kafka (CKUs). A CKU is a unit of horizontal scalability in Confluent Cloud that provides a pre-allocated amount of resources. How much you can ingest and stream per CKU depends on a variety of factors, including client application design and partitioning strategy.

Dedicated cluster sizing

Dedicated clusters can be purchased in any increment of CKUs up to 24. To provision a cluster with more than 24 CKUs, contact your Confluent sales representative.

Multi-zone clusters require a minimum of 2 CKUs.

Dedicated cluster CKUs and limits

CKUs determine the capacity of your cluster. For a Confluent Cloud cluster, the expected performance for any given workload is dependent on a variety of factors, such as message size, number of partitions, number of clients, ratio of ingress to egress, and the number of CKUs provisioned for the cluster.

Use this table to determine the minimum number of CKUs to use for a given workload. Values for storage and partition limits are pre-replication.

Important

The performance of your clients depends on the behavior of your workload. In general, you cannot simultaneously max out all the dimensions of your workload behavior and achieve the maximum CKU throughput bandwidth. For example, if you reach the partition limit, you are unlikely to reach the maximum CKU throughput bandwidth.

See Benchmark Your Dedicated Apache Kafka Cluster on Confluent Cloud for detailed benchmarking instructions and results for a 2 CKU cluster.

Dimension Maximum per CKU Details
Ingress 50 megabytes per second (MBps)

Maximum number of bytes that can be produced to the cluster in one second.

Available in the Confluent Cloud Metrics API as received_bytes (convert from bytes to MB).

If you exceed the maximum, the producers will be throttled to maintain the peak ingress level, which will register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the producer outgoing-byte-rate metrics and broker kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec metrics to understand your throughput.

To reduce usage on this dimension, you can compress your messages. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.

Egress 150 megabytes per second (MBps)

Maximum number of bytes that can be consumed from the cluster in one second.

Available in the Confluent Cloud Metrics API as sent_bytes (convert from bytes to MB).

If you exceed the maximum, the consumers will be throttled to maintain the peak egress level, which will register as non-zero values for the consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the consumer incoming-byte-rate metrics and broker kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec to understand your throughput.

To reduce usage on this dimension, you can compress your messages and ensure each consumer is only consuming from the topics it requires. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.

Storage (pre-replication) 10 terabytes (TB)

Maximum bytes that can be stored on the cluster at one time, measured before replication.

Available in the Confluent Cloud Metrics API as retained_bytes (convert from bytes to TB). The API value is post-replication, so divide by the replication factor of three to get pre-replication storage usage.

If you exceed the maximum, the producers will be throttled to prevent additional writes, which will register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at how much disk space your cluster is using to understand your storage needs.

To reduce usage on this dimension, you can compress your messages and reduce your retention settings. lz4 is recommended for compression. gzip is not recommended because it incurs high overhead on the cluster.

Partitions (pre-replication) 4,500 partitions

Maximum number of partitions that can exist on the cluster at one time, before replication.

Available in the Confluent Cloud Metrics API as partition_count.

Attempts to create additional partitions beyond this limit will fail with an error message.

If you are self-managing Kafka, you can look at the broker kafka.controller:type=KafkaController,name=GlobalPartitionCount metric to understand your partition usage.

To reduce usage on this dimension, you can delete unused topics and create new topics with fewer partitions. You can use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low.

All topics, partitions, and replicas shown in the Confluent Cloud UI count toward the cluster partition limit. This includes topics that you create using the Confluent Cloud UI, Confluent Cloud CLI, Kafka AdminAPI, Confluent Control Center, Kafka Streams applications, and ksqlDB, as well as topics that are automatically created by Confluent Platform components, such as ksqlDB, Kafka Streams, Connect, and Control Center. The automatically created topics are prefixed with an underscore (_). However, topics that are internal to Kafka itself (for example, consumer offsets) are not visible in the Confluent Cloud UI and do not count against partition limits or toward partition billing.

Total client connections 3,000 connections

Maximum number of TCP connections to the cluster that can be open at one time.

Available in the Confluent Cloud Metrics API as active_connection_count.

If you exceed the maximum, new client connections may be refused. Producer and consumer clients may also be throttled to keep the cluster stable. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics to understand how many connections you are using.

This value can vary widely based on several factors, including number of producer clients, number of consumer clients, number of brokers in the cluster, partition keying strategy, produce patterns per client, and consume patterns per client.

To reduce usage on this dimension, you can reduce the total number of clients connecting to the cluster.

Connection attempts 80 connection attempts per second

Maximum number of new TCP connections to the cluster that can be created in one second.

If you exceed the maximum, connection attempts may be refused. Producer and consumer clients may also be throttled to keep the cluster stable. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the rate of change for broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics and client connection-creation-rate metrics to understand how many new connections you are creating.

To reduce usage on this dimension, you can use longer-lived connections to the cluster.

Requests 15,000 requests per second

Maximum number of client requests to the cluster in one second.

Available in the Confluent Cloud Metrics API as request_count.

If you exceed the maximum, requests may be refused. Producer and consumer clients may also be throttled to keep the cluster stable. This throttling would register as non-zero values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.

If you are self-managing Kafka, you can look at the broker kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower} metrics and client request-rate metrics to understand your request volume.

To reduce usage on this dimension, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients.
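The per-CKU maximums in the table above can be turned into a rough lower bound on cluster size: divide each workload dimension by its per-CKU maximum, round up, and take the largest result. The sketch below does this under a few assumptions. The workload keys are hypothetical names, one TB is taken as 10^12 bytes, and the result is only a starting point, because a real workload rarely reaches the maximum on every dimension at once.

```python
import math

# Maximum per CKU, copied from the table above.
PER_CKU = {
    "ingress_mbps": 50,
    "egress_mbps": 150,
    "storage_tb": 10,
    "partitions": 4500,
    "client_connections": 3000,
    "connection_attempts_per_sec": 80,
    "requests_per_sec": 15000,
}

REPLICATION_FACTOR = 3  # retained_bytes is reported post-replication


def pre_replication_tb(retained_bytes):
    """Convert a Metrics API retained_bytes value to pre-replication TB.

    Assumes decimal terabytes (1 TB = 1e12 bytes).
    """
    return retained_bytes / REPLICATION_FACTOR / 1e12


def min_ckus(workload, multi_zone=False):
    """Lower bound on CKUs: the most demanding dimension decides."""
    needed = max(
        math.ceil(workload.get(name, 0) / cap) for name, cap in PER_CKU.items()
    )
    floor = 2 if multi_zone else 1  # multi-zone clusters require at least 2 CKUs
    return max(needed, floor)


workload = {
    "ingress_mbps": 120,                        # needs 3 CKUs
    "egress_mbps": 300,                         # needs 2 CKUs
    "storage_tb": pre_replication_tb(45e12),    # 15 TB, needs 2 CKUs
    "partitions": 6000,                         # needs 2 CKUs
}
print(min_ckus(workload))  # 3
```

In this example ingress is the limiting dimension, so compressing messages would be the first thing to try before adding CKUs.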

Dedicated cluster limits per cluster

You can add CKUs to a dedicated cluster to meet the capacity for your high traffic workloads. However, the limits shown in this table will not change as you increase the number of CKUs.

Dimension                                         Capability
Message size                                      Max 8 MB
Client version                                    Minimum 0.11.0
Request size                                      Max 100 MB
Fetch bytes                                       Max 55 MB
Partitions                                        Depends on number of CKUs, absolute max 100,000
API keys                                          Max 500
Partition creation and deletion                   Max 5000 per 5 minute period [1]
Networking options                                Internet endpoints, VPC peering, AWS Transit Gateway, AWS PrivateLink
Multi-zone high availability                      Available
Confluent Cloud Schema Registry schema versions   Max 1000
Connector tasks per Kafka cluster                 Max 250
ACLs                                              Max 10000

Notes:

  1. What happens when the partition creation and deletion rate limit is reached:

    • For clients < Kafka 2.7:

      The cluster always accepts and processes all the partition creates and deletes within a request, and then throttles the connection of the client until the rate of changes is below the quota.

    • For clients >= Kafka 2.7:

      The cluster only accepts and processes partition creates and deletes up to the quota. All other partition creates and deletes in the request are rejected with a THROTTLING_QUOTA_EXCEEDED error. By default, the admin client will automatically retry on that error until default.api.timeout.ms is reached. When the automatic retry is disabled by the client, the THROTTLING_QUOTA_EXCEEDED error is immediately returned to the client.

Dedicated cluster limits per partition

You can add CKUs to a dedicated cluster to meet the capacity for your high traffic workloads. However, the limits shown in this table will not change as you increase the number of CKUs.

Dimension               Capability
Ingress per Partition   Max 5 MBps
Egress per Partition    Max 15 MBps
Storage per Partition   Max 250 GB
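Because these per-partition limits do not grow with the number of CKUs, a topic that needs more throughput or storage than one partition allows has to be spread over more partitions. As an illustrative sketch (the function name `min_partitions` is an assumption, and real throughput also depends on how evenly keys are distributed), the smallest partition count for a topic can be estimated as follows:

```python
import math

# Per-partition limits from the table above.
INGRESS_PER_PARTITION_MBPS = 5
EGRESS_PER_PARTITION_MBPS = 15
STORAGE_PER_PARTITION_GB = 250


def min_partitions(ingress_mbps, egress_mbps, storage_gb):
    """Smallest partition count that keeps every partition within its limits.

    Assumes traffic and data are spread evenly across partitions.
    """
    return max(
        1,
        math.ceil(ingress_mbps / INGRESS_PER_PARTITION_MBPS),
        math.ceil(egress_mbps / EGRESS_PER_PARTITION_MBPS),
        math.ceil(storage_gb / STORAGE_PER_PARTITION_GB),
    )


# 40 MBps in, 90 MBps out, 1000 GB retained: ingress needs 8 partitions.
print(min_partitions(40, 90, 1000))  # 8
```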

Dedicated cluster provisioning time

The estimated time to provision a dedicated cluster depends on the cloud provider and the size of the cluster.

  • AWS: the expected provisioning time is one hour per CKU. A dedicated 2 CKU cluster on AWS is estimated to complete provisioning in two hours.
  • GCP: the expected provisioning time is one hour per CKU. A dedicated 2 CKU cluster on GCP is estimated to complete provisioning in two hours.
  • Azure: the expected provisioning time is two and a half hours per CKU. A dedicated 2 CKU cluster on Azure is estimated to complete provisioning in five hours.
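The per-provider rates above reduce to a simple multiplication. The following sketch shows the arithmetic. The function name and the lowercase provider keys are assumptions for illustration, and the result is an estimate rather than a guarantee.

```python
# Expected provisioning hours per CKU, from the list above.
HOURS_PER_CKU = {"aws": 1.0, "gcp": 1.0, "azure": 2.5}


def estimated_provisioning_hours(provider, ckus):
    """Estimate provisioning time in hours for a dedicated cluster."""
    if ckus < 1:
        raise ValueError("a dedicated cluster needs at least 1 CKU")
    return HOURS_PER_CKU[provider.lower()] * ckus


print(estimated_provisioning_hours("AWS", 2))    # 2.0
print(estimated_provisioning_hours("Azure", 2))  # 5.0
```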