Configure Storage for Confluent Platform Using Confluent for Kubernetes Blueprints

This topic presents storage configurations in Confluent for Kubernetes (CFK) Blueprints.

Tiered Storage

You can use Confluent for Kubernetes (CFK) Blueprints to enable Tiered Storage. Confluent supports various object storage solutions, including:

  • AWS S3
  • GCP GCS
  • Azure Blob Storage
  • Pure Storage FlashBlade
  • Nutanix Objects
  • NetApp Object Storage
  • Dell EMC ECS
  • MinIO
  • Hitachi Content Platform Object Storage

The high-level workflow to enable and configure Tiered Storage is:

  1. Create the storage bucket for Tiered Storage before configuring CFK Blueprints.

    CFK Blueprints does not create this bucket on your behalf.

  2. Set the Kafka brokers to have appropriate access to the storage bucket.

  3. Provide the configurations for the selected storage bucket.
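
    For example, on AWS you could create the bucket from step 1 with the AWS CLI before deploying anything; the bucket name and region below are placeholder values:

    ```shell
    # Create the Tiered Storage bucket up front; CFK Blueprints does not
    # create it for you. Bucket name and region are placeholders.
    aws s3api create-bucket \
      --bucket my-tiered-storage-bucket \
      --region us-west-2 \
      --create-bucket-configuration LocationConstraint=us-west-2
    ```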

Give Kafka access to the Tiered Storage bucket

You can use one of the following options to give Kafka brokers access to the storage bucket you want to use for Tiered Storage.

Use Service Account to give Kafka brokers access to the storage bucket

Map cloud IAM permissions to the Kubernetes ServiceAccount associated with your Kafka pods.

AWS provides the ability to natively associate AWS IAM permissions with ServiceAccounts in EKS.

Similarly, GCP provides the ability to map IAM permissions with ServiceAccounts in GKE.

You can map the appropriate bucket permissions to the default ServiceAccount in the Kubernetes namespace where you plan to deploy Kafka, or you can map them to a separate ServiceAccount and use CFK Blueprints to ensure the Kafka broker pods are associated with that ServiceAccount. The primary benefit of this approach is that you do not need to actually manage sensitive credentials for bucket access when deploying Confluent Platform via CFK Blueprints.
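
As a sketch, a dedicated ServiceAccount on EKS mapped to an IAM role might look like the following; the role ARN, ServiceAccount name, and namespace are placeholders, and the eks.amazonaws.com/role-arn annotation is the standard IAM-roles-for-service-accounts mechanism:

```yaml
# Hypothetical ServiceAccount for Kafka broker pods on EKS.
# The IAM role referenced here must grant read/write access to the
# Tiered Storage bucket; the ARN below is a placeholder.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kafka-tiered-storage
  namespace: confluent
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/kafka-tiered-storage-role
```

You would then configure CFK Blueprints to run the Kafka broker pods under this ServiceAccount, as described in Provide custom service account.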

For more on associating AWS IAM roles for service accounts on EKS, see IAM roles for service accounts.

For more on associating GCP IAM roles for service accounts on GKE, see Workload Identity.

For more information on configuring which Kubernetes Service Account to associate with Confluent Platform components managed by CFK Blueprints, see Provide custom service account.

Use the Kubernetes Secret object to give Kafka brokers access to the storage bucket

Put your AWS or GCP credentials in a secret object, and add the secret to a credential store.

To see what to add to the secret, refer to Tiered Storage.

When your storage credentials change, you need to restart the Kafka cluster.
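
For illustration, AWS credentials could be packaged into a Secret with kubectl; the Secret name, key, and file below are placeholders, and the exact keys CFK Blueprints expects are listed in Tiered Storage:

```shell
# Hypothetical example: package an AWS credentials file into a
# Kubernetes Secret. The key name "credentials" is illustrative;
# use the key names documented in Tiered Storage.
kubectl create secret generic tiered-storage-credentials \
  --from-file=credentials=./aws-credentials.txt \
  --namespace confluent
```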

Configure Tiered Storage

Tiered Storage configurations can be provided at the Blueprint level (in the ConfluentPlatformBlueprint CR) or at the cluster deployment level (in the KafkaCluster CR).

To enable and configure Tiered Storage:

  1. Specify where the Tiered Storage configuration information is set. The valid values are blueprint and deployment:

    kind: ConfluentPlatformBlueprint
    spec:
      kafkaTieredStorage:
        providerType:
    

    If providerType is set to blueprint, every Kafka cluster created from this Blueprint inherits this configuration.

    If providerType is set to deployment, the configuration applies only to the specific Kafka cluster it is set in.

  2. Provide the Tiered Storage configuration information:

    • In the ConfluentPlatformBlueprint CR if the providerType is set to blueprint in the previous step:

      kind: ConfluentPlatformBlueprint
      spec:
        kafkaTieredStorage:
          blueprint:
      
    • In the KafkaCluster CR if the providerType is set to deployment in the previous step:

      kind: KafkaCluster
      spec:
        kafkaTieredStorage:
      

    Provide the following configurations:

    backendType:            --- [1]
    aws | gcp | other:      --- [2]
      bucketName:           --- [3]
      bucketRegion:         --- [4]
      forcePathStyle:       --- [5]
      credentialStoreRef:   --- [6]
        name:               --- [7]
        key:                --- [8]
    other:
      endpoint:             --- [9]
    
    • [1] Required. The cloud service that Kafka connects to. The supported values are aws, gcp, and other. The other type can accommodate any cloud provider that supports the Amazon S3 API.

    • [2] Required. Provide the object that matches the backendType ([1]) you set.

    • [3] Required. The name of the bucket. Kafka brokers interact with this bucket for writing and reading tiered data.

    • [4] Required. The region of the bucket. The region cannot have an empty value.

    • [5] Set this to true to enable path-style access for all requests.

    • [6] The credential used to connect to the backend API. If this is not configured, the serviceAccount mode is used for accessing the bucket. To use serviceAccount mode, make sure the ServiceAccount has the proper IAM roles.

    • [7] If providerType is blueprint, set this to the name of the CredentialStoreConfigRefs.name in the ConfluentPlatformBlueprint CR.

      This setting is not available when the providerType is deployment.

    • [8] The secret key used in the CredentialStoreConfig CR referenced in [7].

    • [9] The S3 object endpoint’s fully qualified domain name. Use this for any object storage that works with the AWS S3 API.
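
    Putting these settings together, a deployment-level configuration for an AWS S3 bucket might look like the following sketch; the bucket, region, and credential names are placeholders:

    ```yaml
    # Hypothetical KafkaCluster CR fragment for Tiered Storage on AWS S3.
    kind: KafkaCluster
    spec:
      kafkaTieredStorage:
        backendType: aws
        aws:
          bucketName: my-tiered-storage-bucket   # placeholder
          bucketRegion: us-west-2                # placeholder
          credentialStoreRef:                    # omit to use ServiceAccount access
            name: tiered-storage-credentials
            key: credentials
    ```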

  3. In addition to the settings above, you can configure other Tiered Storage settings using kafkaclusterclass.spec.provisioner.cfk.configOverrides in the KafkaClusterClass CR. For the available settings, see Tiered Storage.
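
    For instance, local hotset retention could be tuned through configOverrides as in the following sketch; the override shown, confluent.tier.local.hotset.ms, is a Tiered Storage broker setting, and the structure of the override list is an assumption based on the CFK configOverrides convention:

    ```yaml
    # Hypothetical KafkaClusterClass fragment tuning a Tiered Storage
    # broker setting through configOverrides.
    kind: KafkaClusterClass
    spec:
      provisioner:
        cfk:
          configOverrides:
            server:
              - confluent.tier.local.hotset.ms=3600000  # keep 1 hour of data on local disk
    ```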