Multi-tenancy and Client Quotas on Confluent Cloud¶
This topic describes multi-tenancy, principals, and Client Quotas as a tool to support multi-tenant workloads on Dedicated Clusters. For information about service quotas, which define limits for Confluent Cloud resources, see Service Quotas for Confluent Cloud.
Why multi-tenancy?¶
As organizations continue to integrate Apache Kafka® into their applications, Confluent Cloud becomes a central nervous system for businesses, meaning that each of the organization’s clusters could contain large amounts of data that unique business units and teams consume, enrich, and generate business value from.
Running multiple distinct applications on a single cluster is known as multi-tenancy. Each tenant on the cluster consumes some portion of the cluster’s resources.
Your business may choose to support multi-tenancy on a single Dedicated Cluster for the following reasons:
- Lower cost: Most use cases don’t need the full capacity of a Kafka cluster. You can minimize fixed costs by sharing a single cluster across workloads, and spread the costs across teams, even when those workloads are kept separate.
- Simpler operations: Using one cluster instead of several means a narrower scope of access controls and credentials to manage. Separate clusters may have different networking configurations, API keys, schema registries, roles, and ACLs.
- Greater reuse: Teams can most easily reuse existing data when they’re already using the same cluster; granting them access is a simple access control change. Reusing topics and events created by other teams lets teams deliver value more quickly.
Supporting a multi-tenant workload imposes additional requirements. For example, you might need granular insights into the behaviors of each tenant, and details about the cluster as a whole. Running a multi-tenant cluster raises the following questions:
- What resources are each tenant consuming?
- What is the level of performance each tenant is obtaining?
- Are some tenants consuming too many resources, while others are not getting enough?
Application identity and principals¶
To manage many tenants on a single cluster, and answer the questions posed previously, each tenant must have a unique identity. This identity is called a principal.
Assigning each tenant a unique principal provides the foundation for granular monitoring and management capabilities in Confluent Cloud. Although there isn’t a blanket recommendation for how to assign principals to applications, you can create principals that enable you to map unique identities to tenants with service accounts or identity pools.
Service accounts¶
Each service account represents an application principal programmatically accessing Confluent Cloud. Confluent recommends using one service account per producer application (or consumer group) for maximal granularity and control.
Each service account can have one or more API keys. The service account corresponds to the long-lived principal identity, while the API keys are credentials that you can and should rotate periodically. Client Quotas cannot be applied to a specific API key, and specific API keys are not labeled in the Metrics API.
Identity pools¶
Each identity pool represents either a single application or a group of applications. In general, each identity pool is seen as a unique principal inside of Confluent Cloud.
The Metrics API supports metric visibility at the identity pool level, and Client Quotas are applied at the identity pool level as well. If multiple applications are using the same identity pool, they will share a quota and their application metrics will be aggregated.
Monitoring metrics by principal¶
The Confluent Cloud Metrics API provides actionable operational metrics about your Confluent Cloud deployment.
The Metrics API supports labels, which can be used in queries to filter or group results.
For multi-tenancy, the principal_id label enables metric filtering by a specific application. By monitoring the performance characteristics of specific applications, you can derive granular insights about cluster utilization. For example, you can learn which applications are responsible for driving high levels of throughput consumption or requests. You can also correlate changes in application metrics with fluctuations in overall cluster utilization as measured by the cluster load metric.
The following metrics are labeled with principal_id:
- io.confluent.kafka.server/request_bytes
- io.confluent.kafka.server/response_bytes
- io.confluent.kafka.server/active_connection_count
- io.confluent.kafka.server/request_count
- io.confluent.kafka.server/successful_authentication_count
For an example of querying the Metrics API by principal_id, see Query Metrics by Specific Principal in the Metrics API doc.
For a full list of metrics and supported labels, see the Metrics API.
See Monitor Confluent Cloud Clients for more information on monitoring clients.
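As a sketch of what a principal-scoped query looks like, the following Python snippet builds a query payload for the Metrics API `query` endpoint that groups request bytes by principal. The endpoint URL and exact field names follow the general shape of the Metrics API but should be treated as assumptions; verify them against the Metrics API reference before use.

```python
import json

# Assumed Metrics API query endpoint; confirm against the Metrics API doc.
METRICS_ENDPOINT = "https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"

def build_principal_query(cluster_id: str, interval: str) -> dict:
    """Build a query payload that sums request_bytes per principal_id
    for one cluster over the given ISO-8601 interval."""
    return {
        "aggregations": [
            {"metric": "io.confluent.kafka.server/request_bytes"}
        ],
        # Restrict results to a single cluster.
        "filter": {
            "field": "resource.kafka.id",
            "op": "EQ",
            "value": cluster_id,
        },
        "granularity": "PT1H",
        # Group results so each principal's usage is reported separately.
        "group_by": ["metric.principal_id"],
        "intervals": [interval],
    }

payload = build_principal_query("lkc-12345", "2023-01-01T00:00:00Z/PT24H")
print(json.dumps(payload, indent=2))
```

You would POST this payload to the query endpoint, authenticated with a Cloud API key and secret, and receive one time series per principal.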
Control application usage with Client Quotas¶
Confluent Cloud Client Quotas are a cloud-native implementation of Kafka Client Quotas. Confluent Cloud Client Quotas enable you to apply throughput limits to specific principals. Client Quotas in Confluent Cloud differ slightly from Quotas in Apache Kafka:
| Quota parameter | Confluent Cloud Client Quotas | Apache Kafka Quotas |
| --- | --- | --- |
| Apply to | Service accounts or identity pools | User or client ID |
| Managed by | Calling the Confluent Cloud API | APIs that interact with Kafka directly |
| Enforced at | Cluster level | Broker level |
Confluent Cloud Client Quotas:
- Are defined at the cluster level. Confluent Cloud automatically distributes slices of the quota to the correct brokers for the target principal.
- Restrict ingress and egress throughput.
- Can apply to one or more principals.
- Give each assigned principal the full amount of the quota; the quota is not shared among the principals it is assigned to. For example, if a 10 MBps ingress quota is applied to principals 1 and 2, principal 1 can produce at most 10 MBps, regardless of what principal 2 does.
- Define a throughput maximum, but do not guarantee a throughput floor. Applications are rate-limited through Kafka’s throttling mechanism: Kafka asks the client to wait before sending more data and mutes the channel, which appears as latency to the client application.
Each cluster can be assigned one default quota, which applies to all principals connecting to the cluster unless a more specific quota is applied. Each principal can be referenced by at most one quota per cluster.
- Example 1: cluster default quota is 10 MBps. A specific 5 MBps quota for Principal 1 is applied. Principal 1 will be able to produce at most 5 MBps.
- Example 2: cluster default quota is 10 MBps. A specific 50 MBps quota for Principal 1 is applied. Principal 1 will be able to produce at most 50 MBps.
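The resolution rule illustrated by the two examples above can be sketched as a small function: a principal-specific quota always overrides the cluster default, even when it is larger, and quotas are never shared between principals. This is an illustrative model of the behavior described here, not Confluent code.

```python
from typing import Dict, Optional

def effective_quota(principal: str,
                    default_quota: Optional[float],
                    specific_quotas: Dict[str, float]) -> Optional[float]:
    """Return the throughput cap (in MBps) that applies to one principal.

    A principal-specific quota overrides the cluster default in both
    directions; principals without a specific quota get the default.
    """
    if principal in specific_quotas:
        return specific_quotas[principal]
    return default_quota

# Example 1: a specific 5 MBps quota overrides the 10 MBps default.
assert effective_quota("sa-1", 10.0, {"sa-1": 5.0}) == 5.0
# Example 2: a larger specific quota (50 MBps) also overrides the default.
assert effective_quota("sa-1", 10.0, {"sa-1": 50.0}) == 50.0
# Principals without a specific quota fall back to the cluster default.
assert effective_quota("sa-2", 10.0, {"sa-1": 50.0}) == 10.0
```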
Client Quotas can be used to support multi-tenancy by rate limiting distinct applications as necessary. Common approaches for managing cluster utilization include:
- Creating a default quota that ensures all applications are held to a specified throughput value, unless the application is of high-priority and requires more throughput
- Creating a quota for nightly ETL jobs that consume too much throughput and negatively impact the performance of other, more latency-sensitive applications
- Creating quotas that represent different tiers of service, with certain applications assigned higher throughput levels than others.
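The tiers-of-service approach above can be sketched by mapping tier names to quota settings and generating the corresponding CLI command. The tier names and throughput values below are illustrative assumptions, not Confluent-defined tiers.

```python
# Hypothetical service tiers expressed as Client Quota settings (bytes/sec).
TIERS = {
    "bronze": {"ingress": 5_000_000, "egress": 5_000_000},
    "silver": {"ingress": 20_000_000, "egress": 20_000_000},
    "gold": {"ingress": 50_000_000, "egress": 50_000_000},
}

def quota_create_command(name: str, cluster: str, tier: str,
                         principals: list) -> str:
    """Render a `confluent kafka quota create` command for a tier."""
    t = TIERS[tier]
    return (
        f"confluent kafka quota create --name {name} --cluster {cluster} "
        f"--ingress {t['ingress']} --egress {t['egress']} "
        f"--principals {','.join(principals)}"
    )

print(quota_create_command("etl-tier", "lkc-12345", "bronze", ["sa-12345"]))
```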
Get started with Client Quotas¶
This section shows how to create, list, and delete Confluent Cloud Client Quotas using the Confluent CLI and the Confluent Cloud Console. For the full command reference, see confluent kafka quota. To make similar calls with a REST API, see the Client Quotas Reference.
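For the REST path, the snippet below builds a request body for creating a quota. The endpoint path and field names are modeled on the Client Quotas Reference but are assumptions here; verify them against the current API documentation before use.

```python
# Assumed Client Quotas REST endpoint; confirm against the Client Quotas Reference.
QUOTAS_URL = "https://api.confluent.cloud/kafka-quotas/v1/client-quotas"

def build_quota_request(name, description, ingress, egress,
                        cluster_id, environment_id, principal_ids):
    """Build the JSON body for creating a Client Quota via the REST API."""
    return {
        "display_name": name,
        "description": description,
        # Byte rates are sent as strings in this sketch.
        "throughput": {
            "ingress_byte_rate": str(ingress),
            "egress_byte_rate": str(egress),
        },
        "cluster": {"id": cluster_id},
        "environment": {"id": environment_id},
        "principals": [{"id": p} for p in principal_ids],
    }

body = build_quota_request("test-quota", "Test Quota", 1_000_000, 1_000_000,
                           "lkc-12345", "env-12345", ["sa-12345"])
# POST `body` to QUOTAS_URL with your Cloud API key and secret as basic auth,
# for example: requests.post(QUOTAS_URL, json=body, auth=(key, secret))
print(body["display_name"])
```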
Prerequisites:
- Confluent CLI installed, and access to a Confluent Cloud administrator account.
Sign in to your Confluent Cloud account, and create a Service Account by running the following command:
confluent iam service-account create client-quota-SA --description "Client Quotas demo SA"
Create a quota, specifying a quota name, the cluster where the quota is applied, values for ingress and egress, and the principal ID. Ingress and egress values are specified in bytes per second. Use values of at least 1 MB (1,000,000 bytes). Consider the following examples.
For example, creating a quota with a service account principal:
confluent kafka quota create --name test-quota --cluster lkc-12345 \
  --ingress 1000000 --egress 1000000 --description "Test Quota" \
  --principals sa-12345
For example, creating a quota with an identity pool principal:
confluent kafka quota create --name test-quota --cluster lkc-12345 \
  --ingress 1000000 --egress 1000000 --description "Test Quota" \
  --principals pool-abcde
For example, creating a quota with identity pool and service account principals:
confluent kafka quota create --name test-quota --cluster lkc-12345 \
  --ingress 1000000 --egress 1000000 --description "Test Quota" \
  --principals pool-abcde,sa-12345
The result should look similar to the following:
+--------------+-------------+
| ID           | cq-abcde    |
| Display Name | test-quota  |
| Description  | Test Quota  |
| Ingress      | 1000000 B/s |
| Egress       | 1000000 B/s |
| Cluster      | lkc-12345   |
| Principals   | sa-12345    |
+--------------+-------------+
You can modify your quota later. For example, to add or remove principals, use the update command:
confluent kafka quota update cq-abcde --add-principals sa-12345
confluent kafka quota update cq-abcde --remove-principals sa-12345
Finally, you can retrieve a list of quota IDs with the list command, and use delete to delete the sample quota you created:
confluent kafka quota list --cluster lkc-12345
confluent kafka quota delete cq-abcde --cluster lkc-12345
Prerequisites:
- Access to a Confluent Cloud administrator account.
- A Service Account or identity pool with permissions to the cluster.
Tip
To create a service account, use the instructions found in Create a service account using the Confluent Cloud Console.
To create a quota with the console:
Sign in to your Confluent Cloud account, and select the environment and cluster you want to set a client quota for.
From the navigation menu, choose Cluster settings. If Client Quotas have been enabled for your account, you will see a Client Quotas tab.
Select Client Quotas. To create a new quota, click Create your first quota. Specify the Quota name, Ingress throughput, and Egress throughput. For ingress and egress, you also choose the units: MB/s or KB/s. For Principals, choose the service account or identity pool that you created as a prerequisite.
Click Submit to create the quota.
Once you have created a quota, you can navigate back to this page to see a list of quotas for the cluster, or modify or delete an existing quota.
- To edit a quota, click its name in the list of quotas, and then click Edit quota. When edits are complete, click Confirm or Cancel.
- To delete a quota, click Delete quota, enter the quota name, and click Confirm.
Prerequisites:
- Terraform Provider for Confluent installed, and access to a Confluent Cloud administrator account.
To create a quota with the Terraform Provider for Confluent, use the following snippet of Terraform configuration:
# Configure the Confluent Provider
terraform {
required_providers {
confluent = {
source = "confluentinc/confluent"
}
}
}
provider "confluent" {
cloud_api_key = var.confluent_cloud_api_key # optionally use CONFLUENT_CLOUD_API_KEY env var
cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var
}
...
resource "confluent_kafka_client_quota" "example" {
display_name = "test-quota"
description = "Test Quota"
throughput {
ingress_byte_rate = "1000000"
egress_byte_rate = "1000000"
}
principals = ["sa-12345", "sa-abc456", "pool-xyz111"]
kafka_cluster {
id = "lkc-12345"
}
environment {
id = "env-12345"
}
lifecycle {
prevent_destroy = true
}
}
# Create more resources ...
You must provide appropriate Confluent Cloud credentials to use the provider.
For more information about the resources that you can interact with, see the Confluent Terraform Provider documentation in the Terraform registry.
For the full confluent_kafka_client_quota resource reference, see the Confluent Terraform Provider documentation in the Terraform registry.