Confluent Cloud

Confluent Cloud ksqlDB

ksqlDB is a database purpose-built to help developers create stream processing applications on top of Apache Kafka®. Confluent Cloud provides a fully managed solution for creating and managing ksqlDB clusters.

You can provision ksqlDB clusters by using the Confluent Cloud Console or the Confluent CLI.

For more information, see Section 2: Add ksqlDB to the cluster.

Supported Features for ksqlDB in Confluent Cloud

  • Web interface for managing your ksqlDB cloud environment directly from your browser that exposes all critical ksqlDB information.
  • SQL editor to write, develop, and execute SQL queries with auto completion directly from the Web interface.
  • Integration with Confluent Cloud Schema Registry to leverage your existing schemas to use within your SQL queries.
  • SQL-based Connect integration.
  • Available in AWS, GCP, and Azure in all regions.
  • Private networking with Private Link.

New Features for ksqlDB in Confluent Cloud

For the latest features in ksqlDB, see the 0.27.2 changelog.

Limitations for ksqlDB in Confluent Cloud

Pricing for ksqlDB in Confluent Cloud

The unit of pricing in Confluent Cloud ksqlDB is the Confluent Streaming Unit. A Confluent Streaming Unit is an abstract unit that represents the linearity of performance. For example, if a workload gets a certain level of throughput with 4 CSUs, you can expect about three times the throughput with 12 CSUs.

Confluent charges you in CSUs per hour.

You select the number of CSUs for your cluster at provisioning time. You can configure CSUs as follows:

  • 1 CSU is the minimum.
  • 12 CSUs is the maximum.
  • Your cluster can have 1, 2, 4, 8, or 12 CSUs.
  • Clusters with 8 or 12 CSUs are automatically configured for high availability.
  • High availability cannot be enabled for clusters with less than 8 CSUs.

Note

Confluent Streaming Unit pricing varies slightly depending on cloud provider and region. Pricing is shown in the web interface as part of ksqlDB provisioning so that you can see exact pricing for your cloud and region of choice.

Scaling CSUs for ksqlDB in Confluent Cloud

Scaling cluster CSUs after initial provisioning is not currently supported as a self-service option. For clusters having four or more CSUs, contact your Confluent team to scale the cluster (up or down).

Alternatively, if your cluster requires more CSUs, you can provision a new cluster with the desired number of CSUs, and migrate to your new one.

Sizing guidelines for ksqlDB in Confluent Cloud

The number of CSUs needed for your cluster depends on the workload, including the number of queries, query complexity, and throughput. The amount of resources allocated to a cluster is proportional to the number of CSUs defined for the cluster.

Four CSUs are sufficient for many workloads. In general, start with four CSUs and scale out if more capacity is needed, or scale down if less capacity is needed. To identify when more CSUs are needed, check the ksqlDB consumer lag or the CSU Saturation metric. For more information, see Step 10: Monitor persistent queries.

After your ksqlDB cluster is provisioned, you can only change the Confluent Streaming Unit cluster by migrating to a new one one with the desired number of CSUs.

CSUs and storage

You get 125 GB of storage space with each CSU. For 8 and 12 CSUs, clusters are configured with high availability automatically, and ksqlDB maintains replicas of your data that use half of the available storage. For 4 CSUs and 8 CSUs, you get 500 GB of user-available storage, so to expand storage past 500 GB, you must expand to 12 CSUs.

The following table shows how storage is provisioned for ksqlDB clusters.

Number of CSUs Storage User-available storage
1 125 GB 125 GB
2 250 GB 250 GB
4 500 GB 500 GB
8 1000 GB 500 GB*
12 1500 GB 750 GB*

Note

* High-availability clusters with half of storage used for data replicas.

High availability

When your cluster has 8 CSUs or more, ksqlDB is configured for high availability automatically, for both processing and storage. This is done in a multi-zones way if your Kafka cluster is configured for multi-zones.

For 8 or more CSUs:

  • Storage is redundant (2x).
  • ksqlDB has multi-node deployment, with 4 CSUs per node.
  • When the Kafka cluster is multi-zones, ksqlDB nodes are placed in multiple zones.

For fault tolerance and faster failure recovery, standby replication is enabled for clusters with 8 CSUs or more. With standby replication, a ksqlDB server, beside being the active for some partitions, also acts as the standby for other partitions. In this mode, a ksqlDB server subscribes to the changelog topic partition of the active and replicates updates continuously to its own copy of the table partition. Pull queries that would otherwise fail are routed to other active or standby servers that host the same partition.

The unavailability window during failures is dominated by the table restoration time and the failure detection time. Standby replication significantly reduces the table restoration time. The failure detection time is directly related to the Kafka Streams consumer configuration, and with default settings used in managed ksqlDB, failure detection can take as long as 10–15 seconds.

For more information, see Highly Available, Fault-Tolerant Pull Queries in ksqlDB.

Exclude row data in the ksqlDB processing log

With ksqlDB, you can exclude row data in the processing log when queries error out by setting the ksql.logging.processing.rows.include config to false.

With ksqlDB in Confluent Cloud, this config is set to true by default, and row data is written to the log. You can exclude row data by using Confluent Cloud Console or by using the Confluent CLI.

Configure with Confluent Cloud Console

When you create a new ksqlDB cluster, toggle “Hide row data in processing log” by selecting Advanced under Configuration options.

Screenshot of Confluent Cloud showing the ksqlDB Add Cluster wizard

Configure with Confluent Cloud CLI

confluent ksql cluster create <cluster-name> --api-key <api-key> --api-secret <secret> --log-exclude-rows