Configure Storage for Confluent Platform using Confluent for Kubernetes¶
Overview¶
Confluent for Kubernetes (CFK) requires the use of Kubernetes Storage Classes to provision persistent storage volumes for most of the Confluent Platform components.
Kafka and ZooKeeper require block storage.
Apart from the above requirements of persistent volume Storage Class and block storage, CFK does not have specific requirements or recommendations for storage solutions or storage configuration details. You should test and tune the storage solutions you select before configuring them for CFK.
Connect and Schema Registry do not use persistent storage volumes, and thus do not need a Storage Class to be specified.
Kafka¶
For Kafka persistent volumes, CFK supports block storage. Example block storage solutions you can use are:
AWS EBS
On Kubernetes 1.23 or higher, the EBS CSI driver is required for CFK and Kafka to work. Enable and set up the EBS CSI add-on as described in Managing the Amazon EBS CSI driver as an Amazon EKS add-on.
Azure Disk
GCE Disk
Ceph RBD
Portworx
You need 1 persistent volume on each Kafka host.
Object storage or network file storage solutions are not supported for Kafka.
In addition to persistent storage, you can configure Kafka with Tiered Storage. For details, see Configure Kafka Tiered Storage.
ZooKeeper¶
ZooKeeper uses the same persistent storage volume solution that Kafka uses.
You need 2 persistent volumes on each ZooKeeper host.
Other Confluent components¶
Each of the other Confluent components that uses persistent storage, such as ksqlDB and Confluent Control Center, requires 1 persistent storage volume. The component uses this volume to maintain working state during the lifetime of the service.
Additionally, Confluent components utilize Kafka for their durable shared storage needs. For example, Schema Registry stores schemas in Kafka, and Confluent Control Center stores operational state and metrics in Kafka.
Persistent storage volumes¶
Storage class configuration is one of the most critical steps in the Confluent for Kubernetes (CFK) configuration process.
When configuring Confluent components to use persistent storage volumes, the following options are supported:
- Dynamic Provisioning: Use a pre-defined Kubernetes Storage Class
- Dynamic Provisioning: Use the Kubernetes default Storage Class
- Custom Provisioning: Use pre-provisioned persistent storage volumes
By default, CFK manages storage using dynamic storage provisioning that Kubernetes provides.
Consideration¶
CFK does not support an automated change to storage classes on an existing deployment. To make changes to a storage class on an existing deployment, such as listed below, contact Confluent Support:
- Migrating from one storage class to another.
- Changing the storage class, for example, enabling encryption on the persistent volumes.
Use pre-defined Kubernetes StorageClass for dynamic provisioning¶
You can provide a storage class to use for the entire Confluent Platform, or you can specify different storage classes for different components such as ZooKeeper, Kafka, ksqlDB, and Control Center.
To use a pre-defined storage class:
Create or use a pre-defined StorageClass you want to use in your Kubernetes cluster.
The following settings are the best practice recommendations for your storage class:
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
Especially for production deployments, this setting is required.
allowVolumeExpansion: true
You need to have sufficient permissions to create and modify StorageClasses in your Kubernetes cluster if you intend to create a new StorageClass to use rather than using a pre-existing one.
For more information and examples, see Kubernetes Storage Classes.
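As an illustration of the recommended settings, a StorageClass manifest might look like the following minimal sketch. The name my-storage-class matches the example used in this section. The provisioner ebs.csi.aws.com and the gp3 volume type are assumptions for an EKS cluster running the EBS CSI driver, so substitute the provisioner and parameters of your own storage solution.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
# Assumption: EBS CSI driver on EKS. Use the provisioner for your storage solution.
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                               # assumption: EBS volume type
volumeBindingMode: WaitForFirstConsumer   # bind the volume only once a pod is scheduled
reclaimPolicy: Retain                     # keep the volume when its claim is deleted
allowVolumeExpansion: true                # allow resizing the volume later
```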
In your Confluent component custom resource (CR), specify the name of the StorageClass to use:
spec:
  storageClass:
    name: my-storage-class
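To show how different components can use different storage classes, the following sketch sets one class on a ZooKeeper CR and another on a Kafka CR. The resource names, the namespace, the capacities, and the class names zookeeper-storage-class and kafka-storage-class are hypothetical, and the fragments omit other settings a complete CR needs, such as the container images.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Zookeeper
metadata:
  name: zookeeper                   # hypothetical name
  namespace: confluent              # hypothetical namespace
spec:
  replicas: 3
  dataVolumeCapacity: 10Gi          # first of the two ZooKeeper volumes
  logVolumeCapacity: 10Gi           # second of the two ZooKeeper volumes
  storageClass:
    name: zookeeper-storage-class   # hypothetical StorageClass name
---
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka                       # hypothetical name
  namespace: confluent
spec:
  replicas: 3
  dataVolumeCapacity: 100Gi         # one volume per Kafka broker
  storageClass:
    name: kafka-storage-class       # hypothetical StorageClass name
```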
For a sample scenario for creating a storage class for a production environment, see Create a Production Storage Class on GKE.
Use default Kubernetes StorageClass for dynamic provisioning¶
Support for default StorageClasses is enabled by default in Kubernetes versions 1.11 and higher. If you do not provide spec.storageClass in the CR, CFK uses the default storage class.
Important
We do not recommend using the default StorageClass in production environments.
Use the following command to get the name of the current default storage class:
kubectl get sc
To use the Kubernetes default storage class, make sure the following properties are set on the default StorageClass:
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
Especially for production deployments, this setting is required.
allowVolumeExpansion: true
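For reference, a default StorageClass that carries these properties could look like the sketch below. Kubernetes marks the default class with the storageclass.kubernetes.io/is-default-class annotation. The class name and the provisioner pd.csi.storage.gke.io are assumptions for a GKE cluster; your cluster's default class will likely differ.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain             # hypothetical name
  annotations:
    # Marks this class as the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
# Assumption: Compute Engine persistent disk CSI driver on GKE
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
```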
Use statically provisioned persistent volumes¶
By default, CFK automates disk management by leveraging Kubernetes dynamic storage provisioning. If your Kubernetes cluster does not support dynamic provisioning, you can follow the instructions in this section to use statically-provisioned disks for your Confluent Platform deployments.
Connect and Schema Registry do not use persistent storage volumes, so you do not need to follow the steps in this section.
To use statically-provisioned persistent volumes for a Confluent Platform component:
Create a StorageClass in Kubernetes for local provisioning. For example:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Create PersistentVolumes with the desired host path and the hostname label for each of the desired worker nodes.
You need the following number of persistent volumes for Confluent Platform components:
- 2 persistent volumes on each ZooKeeper host
- 1 persistent volume on each Kafka, ksqlDB, and Confluent Control Center host
For example:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1                                --- [1]
spec:
  capacity:
    storage: 10Gi                           --- [2]
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain     --- [3]
  storageClassName: my-storage-class        --- [4]
  local:
    path: /mnt/data/broker-1-data           --- [5]
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - gke-myhost-cluster-default-pool-5cc13882-k0gb   --- [6]
[1] Choose a name for the PersistentVolume.
[2] Choose a storage size that is greater than or equal to the storage you’re requesting for each Kafka broker instance. This corresponds to the spec.dataVolumeCapacity property of the component CR.
[3] Choose Retain if you want the data to be retained after you delete the PersistentVolumeClaim that CFK will eventually create and which Kubernetes will eventually bind to this PersistentVolume. Choose Delete if you want this data to be garbage-collected when the PersistentVolumeClaim is deleted.
Warning
With persistentVolumeReclaimPolicy: Delete, your data on the volume will be deleted when you delete the CFK component custom resource (CR), for example, when you delete the Kafka CR with the kubectl delete kafka <kafka_cluster-name> command.
[4] The storageClassName must match the one created in Step 1.
[5] This is the directory path you want to use on the worker node for the broker as its persistent data volume. The path must exist on the worker node.
[6] This is the value of the kubernetes.io/hostname label of the worker node you want to host this broker instance. To find this hostname, run the following command:
kubectl get nodes \
  -o 'custom-columns=NAME:metadata.name,HOSTNAME:metadata.labels.kubernetes\.io/hostname'
NAME                                            HOSTNAME
gke-myhost-cluster-default-pool-5cc13882-k0gb   gke-myhost-cluster-default-pool-5cc13882-k0gb
gke-myhost-cluster-default-pool-5cc13882-n8vr   gke-myhost-cluster-default-pool-5cc13882-n8vr
gke-myhost-cluster-default-pool-5cc13882-tbbj   gke-myhost-cluster-default-pool-5cc13882-tbbj
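Because each ZooKeeper host needs 2 persistent volumes, a statically provisioned ZooKeeper node needs two PersistentVolume objects pinned to the same worker node. The sketch below reuses the storage class and hostname from the example above. The volume names and the two host paths are hypothetical, and both directories are assumed to already exist on the worker node.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-zk-0-data                  # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-storage-class
  local:
    path: /mnt/data/zookeeper-0-data  # hypothetical path; must exist on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - gke-myhost-cluster-default-pool-5cc13882-k0gb
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-zk-0-txnlog                # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-storage-class
  local:
    path: /mnt/data/zookeeper-0-txnlog  # hypothetical path; must exist on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - gke-myhost-cluster-default-pool-5cc13882-k0gb
```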
Add the storageClass to the component CR, for example:
spec:
  storageClass:
    name: my-storage-class
After deploying the new CR, validate that the PersistentVolumes are bound:
kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS
pv-1   10Gi       RWO            Retain           Bound    operator/data0-kafka-0   my-storage-class
pv-2   10Gi       RWO            Retain           Bound    operator/data0-kafka-2   my-storage-class
pv-3   10Gi       RWO            Retain           Bound    operator/data0-kafka-1   my-storage-class
Validate that the Confluent Platform pods are healthy. For example:
kubectl get pods -l app=kafka
NAME      READY   STATUS    RESTARTS   AGE
kafka-0   1/1     Running   0          40m
kafka-1   1/1     Running   0          40m
kafka-2   1/1     Running   0          40m
Tiered Storage¶
You can use Confluent for Kubernetes (CFK) to enable Tiered Storage for Kafka. Confluent supports various object storage solutions, such as:
- AWS S3
- GCP GCS
- Azure Blob Storage
- Pure Storage FlashBlade
- Nutanix Objects
- NetApp Object Storage
- Dell EMC ECS
- MinIO
- Hitachi Content Platform Object Storage
Enable Tiered Storage¶
When you enable Tiered Storage, you need to configure Kafka with the following:
The type of blob storage to use.
The name of the storage bucket to use.
You must have created this bucket in advance. CFK does not create this bucket on your behalf.
You also need to ensure that the Kafka brokers have appropriate access to the storage bucket. You can use one of the following options:
- Service Account
  This is the recommended option.
- Kubernetes Secret
- Use Service Account to give Kafka brokers access to the storage bucket
Map cloud IAM permissions to the Kubernetes ServiceAccount associated with your Kafka broker pods.
AWS provides the ability to natively associate AWS IAM permissions with ServiceAccounts in EKS.
Similarly, GCP provides the ability to map IAM permissions with ServiceAccounts in GKE.
You can map the appropriate bucket permissions to the default ServiceAccount in the Kubernetes namespace where you plan to deploy Kafka, or you can map them to a separate ServiceAccount and use CFK to ensure the Kafka broker pods are associated with that ServiceAccount. The primary benefit of this approach is that you do not need to actually manage sensitive credentials for bucket access when deploying Confluent Platform via CFK.
For more on associating AWS IAM roles for service accounts on EKS, see IAM roles for service accounts.
For more on associating GCP IAM roles for service accounts on GKE, see Workload Identity.
For more information on configuring which Kubernetes Service Account to associate with Confluent Platform components managed by CFK, see Provide custom service account.
- Use the Kubernetes Secret object to give Kafka brokers access to the storage bucket
Put your AWS or GCP credentials in a Secret object and configure Kafka to use the credentials in that object when deploying Kafka via CFK.
When your storage credentials change, you need to restart the Kafka cluster.
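The Service Account option can be sketched as follows for EKS, assuming IAM roles for service accounts are already set up in the cluster. The ServiceAccount name kafka-tiered-storage, the namespace, and the role ARN placeholders are hypothetical. The spec.podTemplate.serviceAccountName field shown here is our reading of the custom service account setting in CFK, so confirm it against Provide custom service account before relying on it.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kafka-tiered-storage        # hypothetical name
  namespace: confluent              # hypothetical namespace
  annotations:
    # EKS: IAM role that grants access to the Tiered Storage bucket.
    # On GKE with Workload Identity, the analogous annotation is
    # iam.gke.io/gcp-service-account.
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>
---
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka                       # hypothetical name
  namespace: confluent
spec:
  podTemplate:
    # Run the Kafka broker pods under the ServiceAccount above
    serviceAccountName: kafka-tiered-storage
```

With this approach no credential file or mounted Secret is needed in the Kafka CR.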
In addition to the above required settings, you can configure other Tiered Storage settings using configOverrides in the Kafka CR. For the available settings, see Tiered Storage.
When a Kafka cluster is deleted, CFK does not perform a garbage collection of the Tiered Storage bucket contents. You can either wait for the set interval or manually delete the objects in the Tiered Storage bucket. For more information, see Time Interval for Topic Deletes.
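The interval mentioned above is set with the confluent.tier.topic.delete.check.interval.ms property in the Kafka CR. As a sketch, the following override shortens it from the default of 3 hours to 1 hour. The value 3600000 (60 × 60 × 1000 milliseconds) is only an illustrative choice, not a recommendation.

```yaml
spec:
  configOverrides:
    server:
      # 1 hour in milliseconds; the default is 3 hours (10800000 ms)
      - confluent.tier.topic.delete.check.interval.ms=3600000
```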
Configure Tiered Storage¶
You configure Tiered Storage in the Kafka CR in spec.configOverrides.server.
This section presents the steps to enable Tiered Storage for AWS S3 and GCS. For the configuration properties you need to set for other storage providers via spec.configOverrides.server, see Enabling Tiered Storage on a Broker.
Configure Tiered Storage for AWS S3¶
To enable and configure Tiered Storage with AWS S3, set the following config overrides in the Kafka CR:
kind: Kafka
spec:
  configOverrides:
    server:
      - confluent.tier.feature=true                       ----- [1]
      - confluent.tier.enable=true                        ----- [2]
      - confluent.tier.backend=S3                         ----- [3]
      - confluent.tier.s3.bucket=                         ----- [4]
      - confluent.tier.s3.region=                         ----- [5]
      - confluent.tier.s3.cred.file.path=                 ----- [6]
      - confluent.tier.topic.delete.check.interval.ms=    ----- [7]
  mountedSecrets:                                         ----- [8]
  - secretRef:
[1] Set confluent.tier.feature=true to enable Tiered Storage.
[2] Set confluent.tier.enable to the default value for created topics. Setting this to true causes all non-compacted topics to be tiered.
[3] Set confluent.tier.backend to S3.
[4] Set confluent.tier.s3.bucket to the S3 bucket you want to use.
[5] Set confluent.tier.s3.region to the AWS region.
[6] Optional. Specify confluent.tier.s3.cred.file.path if using Secrets to provide credentials for Tiered Storage. If using Service Accounts, this property is not necessary. To see what to add in the Secrets file, refer to Tiered Storage.
[7] Optional. Set confluent.tier.topic.delete.check.interval.ms to the interval, in milliseconds, at which log segment files are deleted after a topic or a cluster is deleted. The default value for this interval is 3 hours.
[8] Optional. Only required if using Secrets to provide credentials for Tiered Storage.
For example:
spec:
  configOverrides:
    server:
      - confluent.tier.feature=true
      - confluent.tier.enable=true
      - confluent.tier.backend=S3
      - confluent.tier.s3.bucket=my-bucket
      - confluent.tier.s3.region=us-west-2
      - confluent.tier.s3.cred.file.path=/mnt/secrets/my-secret-aws/aws/creds
  mountedSecrets:
  - secretRef: my-secret-aws
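For comparison, when bucket access is granted through a Service Account rather than a Secret, the optional credential settings can be left out. The following sketch assumes the required IAM permissions are already mapped to the ServiceAccount that the Kafka broker pods run as, and reuses the bucket and region values from the example above.

```yaml
spec:
  configOverrides:
    server:
      - confluent.tier.feature=true
      - confluent.tier.enable=true
      - confluent.tier.backend=S3
      - confluent.tier.s3.bucket=my-bucket
      - confluent.tier.s3.region=us-west-2
  # No confluent.tier.s3.cred.file.path and no mountedSecrets:
  # access comes from the ServiceAccount's IAM permissions.
```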
Configure Tiered Storage for GCS¶
To enable and configure Tiered Storage with GCS, set the following config overrides in the Kafka CR:
spec:
  configOverrides:
    server:
      - confluent.tier.feature=true                       ----- [1]
      - confluent.tier.enable=true                        ----- [2]
      - confluent.tier.backend=GCS                        ----- [3]
      - confluent.tier.gcs.bucket=                        ----- [4]
      - confluent.tier.gcs.region=                        ----- [5]
      - confluent.tier.gcs.cred.file.path=                ----- [6]
      - confluent.tier.topic.delete.check.interval.ms=    ----- [7]
  mountedSecrets:                                         ----- [8]
  - secretRef:
[1] Set confluent.tier.feature=true to enable Tiered Storage.
[2] Set confluent.tier.enable to the default value for created topics. Setting this to true causes all non-compacted topics to be tiered.
[3] Set confluent.tier.backend to GCS.
[4] Set confluent.tier.gcs.bucket to the GCS bucket you want to use.
[5] Set confluent.tier.gcs.region to the GCS region.
[6] Optional. Specify confluent.tier.gcs.cred.file.path if using Secrets to provide credentials for Tiered Storage. If using Service Accounts, this property is not necessary. To see what to add in the Secrets file, refer to Tiered Storage.
[7] Optional. Set confluent.tier.topic.delete.check.interval.ms to the interval, in milliseconds, at which log segment files are deleted after a topic or a cluster is deleted. The default value for this interval is 3 hours.
[8] Optional. Only required if using Secrets to provide credentials for Tiered Storage.
For example:
spec:
  configOverrides:
    server:
      - confluent.tier.feature=true
      - confluent.tier.enable=true
      - confluent.tier.backend=GCS
      - confluent.tier.gcs.bucket=my-bucket
      - confluent.tier.gcs.region=us-central1
      - confluent.tier.gcs.cred.file.path=/mnt/secrets/my-secret-gcs/credentials
  mountedSecrets:
  - secretRef: my-secret-gcs
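The Secret referenced in this example could be created from a manifest like the sketch below. It rests on two assumptions: that the Secret key name becomes the file name under /mnt/secrets/my-secret-gcs/, as the path in the example suggests, and that the file content is a GCP service account key in JSON format. Check Tiered Storage for the exact file content expected. The namespace is hypothetical and the key content is deliberately left as a placeholder.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret-gcs
  namespace: confluent              # hypothetical namespace
type: Opaque
stringData:
  # The key "credentials" is assumed to appear as
  # /mnt/secrets/my-secret-gcs/credentials in the broker pods.
  credentials: |
    <contents of your GCP credentials file>
```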