Configure Confluent Operator and Confluent Platform

The following is the process of configuring Confluent Operator and Confluent Platform:

  1. Create the global configuration file.
  2. Configure the global infrastructure settings.
  3. Optionally configure the Confluent Platform component level settings.

The examples in this guide use the following assumptions:

  • $VALUES_FILE refers to the configuration file you set up in Create the global configuration file.

  • To present simple and clear examples in the Operator documentation, all the configuration parameters are specified in the config file ($VALUES_FILE). However, in your production deployments, use the --set or --set-file option when applying sensitive data with Helm. For example:

    helm upgrade --install kafka \
     --values $VALUES_FILE \
     --set kafka.services.mds.ldap.authentication.simple.principal='cn=mds\,dc=test\,dc=com' \
     --set kafka.services.mds.ldap.authentication.simple.credentials='Developer!' \
     --set kafka.enabled=true \
     ./confluent-operator
    
  • operator is the namespace that Confluent Platform is deployed in.

  • All commands are executed in the helm directory under the directory Confluent Operator was downloaded to.

Configuration overview of Confluent Operator and Confluent Platform

During installation, Confluent Operator and Confluent Platform components are created based on parameters stored in multiple Helm Chart values.yaml files (one for Operator and one for each Confluent Platform component) and the global configuration file.

Do not modify parameters in the individual component values.yaml files. If you need to adjust capacity, add a parameter, or change a parameter for a component, you modify the component section in the global configuration file. You can also adjust configuration parameters after installation using the helm upgrade command.

The global configuration file is layered over the values.yaml files at installation and contains values that are specific to environments and that can be modified by you prior to installation.

After you download the Helm bundle you’ll see that:

  • The values.yaml file for each Confluent Platform component is stored in helm/confluent-operator/charts/<component>/.
  • The values.yaml file for Confluent Operator is stored in helm/confluent-operator/.
  • The <provider>.yaml file for each provider is stored in helm/providers/.

At installation, Helm reads the values files in the following layered order:

  1. The values.yaml for the Confluent Platform component is read.
  2. The values.yaml for Operator is read.
  3. The global configuration file is read.
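
As a rough illustration of this layering, the fragment below sketches one entry in the global configuration file. It assumes, purely for illustration, that the Kafka chart's values.yaml sets replicas to 3; the value 5 is hypothetical. Because the global configuration file is read last, its value is the one that would take effect.

```yaml
# Fragment of the global configuration file ($VALUES_FILE).
# Assumption for illustration only: the Kafka chart's values.yaml ships
# with "replicas: 3".
kafka:
  replicas: 5   # hypothetical value; read last, so it overrides the chart default
```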

Create the global configuration file

To customize the default configuration file:

  1. Go to the helm/providers directory under the directory where you downloaded the Confluent Operator bundle.

  2. Make a copy of the provider file corresponding to your provider environment. For example, copy gcp.yaml to my-values.yaml if your provider is Google Cloud.

  3. Set an environment variable pointing to your copy of the configuration file. For example:

    export VALUES_FILE="/path/to/my-values.yaml"
    

    The remainder of this topic uses $VALUES_FILE to refer to the global configuration file.

Configure the global settings

The following are the configuration changes necessary for the initial deployment of Confluent Operator and Confluent Platform. You specify the settings in the configuration file ($VALUES_FILE) you created above.

You can manually update the configuration file after deployment if necessary. See Updating configuration for details.

  1. Validate or change your region, and zone or zones (if your Kubernetes cluster spans multiple availability zones).
  2. Specify the storage class.
  3. Validate or change the Docker image registry endpoint.
  4. Validate or change the Docker image tags.
  5. Configure DNS and load balancers for access to Confluent Platform from outside Kubernetes.
  6. Validate or configure network encryption, authentication, and authorization.
  7. Validate or configure your license key.

Configure storage

Operator manages storage by default using dynamic storage provisioning that Kubernetes provides.

If you must rely on statically provisioned storage volumes, you can manually provision and attach storage to your Kubernetes worker nodes, expose those to the platform as PersistentVolumes, and then use Confluent Operator to deploy Confluent Platform clusters so that the broker instances mount those PersistentVolumes.

Depending on how you are specifying storage requirements to Confluent Operator, you have the following options:

  • Creating a StorageClass and specifying the class name for Confluent Operator to use
  • Using the default Kubernetes StorageClass
  • Specifying storage provisioner and other details for Confluent Operator Helm charts to create a StorageClass

Migrating from one storage class to another is not supported in Confluent Operator.

Use pre-defined StorageClass

Starting in Confluent Operator 5.5, you can instruct Operator to use a specific StorageClass for all PersistentVolumes it creates.

You can provide a storage class for your entire Confluent Platform cluster or for specific components: ZooKeeper, Kafka, ksqlDB, and Control Center.

  1. Create a StorageClass and PersistentVolume for the Confluent Platform cluster. This task requires the Kubernetes admin privilege.
  2. In the configuration file ($VALUES_FILE), specify the name of the StorageClass to use for deploying Confluent Platform.
    1. Use global.storageClassName to specify a StorageClass to be used for all component deployments.
    2. Use <component>.storageClassName to specify a StorageClass for a particular component.
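
For step 1, the manifests below are a minimal sketch of one way to provision storage statically, using a local volume. Every name, size, path, and node in it (my-storage-class, kafka-pv-0, 10Gi, /mnt/disks/kafka-0, worker-node-1) is a hypothetical placeholder, and your environment may call for a different volume type.

```yaml
# StorageClass with no dynamic provisioner: volumes are created by an admin.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class              # hypothetical name
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
# One statically provisioned volume that belongs to the class above.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-pv-0                    # hypothetical name
spec:
  capacity:
    storage: 10Gi                     # hypothetical size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-storage-class  # must match the StorageClass name
  local:
    path: /mnt/disks/kafka-0          # hypothetical path on the worker node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1       # hypothetical node name
```

You would then set global.storageClassName or <component>.storageClassName to the same class name in $VALUES_FILE.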

Use the Kubernetes default StorageClass

Starting in Confluent Operator 5.5, you can configure Confluent Operator to use the Kubernetes default storage class.

The process for using statically provisioned storage is the same as above. Simply ensure that the storageClassName specified in your PersistentVolume definitions matches the name of your Kubernetes cluster’s default StorageClass.

To use the Kubernetes default storage class:

  • Do not specify the global-level storageClassName value, or set it to an empty string ("").
  • Do not specify the component-level storageClassName value, or set it to an empty string ("").
  • Do not set the global.provider.storage object.

The associated volumes will use the default StorageClass of your Kubernetes cluster. The support for default StorageClasses is enabled by default in versions 1.11 and higher of Kubernetes.
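
For reference, Kubernetes marks a StorageClass as the cluster default with an annotation. The manifest below is a sketch of what such a class might look like; the name standard-ssd is hypothetical, and the provisioner and disk type are borrowed from the GCE example used elsewhere in this topic.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-ssd   # hypothetical name
  annotations:
    # This annotation is what makes the class the cluster default.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
```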

Use the StorageClass created by Confluent Operator Helm charts

To have Confluent Operator Helm charts create a storage class for your Confluent Platform cluster:

  • Specify the storage information in the global.provider.storage object.

    If you are configuring a multi-zone cluster, Confluent Operator creates a storage class per zone (global.provider.kubernetes.deployment.zones).

  • Do not specify the storageClassName at the global or component level.

    See Storage Class Provisioners for configuration examples. The example below uses GCE persistent disk storage (gce-pd) and solid-state drives (pd-ssd).

    global:
      provider:
        name: gcp
        region: us-central1
        kubernetes:
          deployment:
            ## If kubernetes is deployed in multi zone mode then specify availability-zones as appropriate
            ## If kubernetes is deployed in single availability zone then specify appropriate values
            zones:
              - us-central1-a
        storage:
          ## https://kubernetes.io/docs/concepts/storage/storage-classes/#gce
          ##
          provisioner: kubernetes.io/gce-pd
          reclaimPolicy: Delete
          parameters:
            type: pd-ssd
    

When creating Confluent Platform clusters, Confluent Operator creates multiple StorageClasses on the fly using the data under global.provider.storage as the spec for each StorageClass, and they will be named according to the following pattern:

{cp-component-helm-chart-name}-standard-ssd-{zone}
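
As a sketch only, applying this pattern to the GCE example above for the kafka chart and the us-central1-a zone might produce a StorageClass along the following lines. This is an assumption built from the naming pattern and the global.provider.storage values; the object Operator actually creates may carry additional fields.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # {cp-component-helm-chart-name}-standard-ssd-{zone}
  name: kafka-standard-ssd-us-central1-a
# The fields below are copied from global.provider.storage in the example.
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
parameters:
  type: pd-ssd
```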

If you use this option, the process for using statically provisioned storage is the same as described above.

Precedence of the storage options

The precedence rule of the possible storage configuration options is as follows:

  • If storageClassName is specified both at the global level (in the global object) and component levels (in the component objects), the component-level storageClassName is used.
  • If storageClassName is specified at the component level, and the global.provider.storage object is specified, the component-level storageClassName is used.
  • If both the global level storageClassName and global.provider.storage are specified, Operator will return an error.
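
The fragment below sketches the first rule. The class names general-ssd and fast-ssd are hypothetical. With this configuration, Kafka would use fast-ssd and every other component would use general-ssd.

```yaml
# Fragment of the global configuration file ($VALUES_FILE).
global:
  storageClassName: general-ssd   # hypothetical; applies to all components by default
kafka:
  storageClassName: fast-ssd      # hypothetical; component level wins for Kafka
# global.provider.storage is deliberately left unset here, because combining
# it with the global-level storageClassName causes Operator to return an error.
```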

Custom Docker registry

The default Confluent Platform image registry is Docker Hub. If you are using a private image registry, specify the registry endpoint and the container image name in the configuration file.

The following example shows the default public image registry for container images. If you are installing from images downloaded from Docker Hub and then moved to a separate image registry, you must enter your image registry’s FQDN.

If the registry you use requires basic authentication, you need to change the credential parameter to required: true and enter a username and password.

## Docker registry endpoint where Confluent Images are available.
##
registry:
  fqdn: docker.io
  credential:
    required: false
    username:
    password:
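
By contrast, a private registry that requires basic authentication might be configured as sketched below. The FQDN and username are hypothetical placeholders. Keep the registry block at the same location where it already appears in your copy of the provider file.

```yaml
## Docker registry endpoint where Confluent Images are available.
##
registry:
  fqdn: registry.example.com     # hypothetical private registry
  credential:
    required: true
    username: my-registry-user   # hypothetical
    password: ""                 # prefer supplying this with --set at deploy time
```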

Custom Docker images

Confluent Platform currently supports two sets of Docker images: one that uses Debian as the base image, and another that uses the Red Hat Universal Base Image (UBI) as the base image.

By default, Debian is used as the base image for all components.

To use the Red Hat UBI-based images:

  1. In the IMAGES file that comes with the Operator bundle, locate the Confluent Platform components that you want to use the UBI-based images for. Get the image tag values of the components. For example, in the entry below, the tag value for Connect is 5.5.0.0.

    connect: 5.5.0.0
    
  2. Append -ubi8 to the tag values and specify the values for the image tag in the configuration YAML file in the corresponding component sections.

    Using the following example of a config file ($VALUES_FILE), the initContainers of all the components will use the UBI-based images, and the main containers of Operator, Kafka, and Connect will use the UBI-based images.

    global:
     initContainer:
       image:
         tag: 5.5.0.0-ubi8
    
    operator:
      image:
        tag: 0.364.0-ubi8
    
    kafka:
      image:
        tag: 5.5.0.0-ubi8
    
    connect:
      image:
        tag: 5.5.0.0-ubi8
    

Namespaced deployment

By default, Confluent Operator deploys Confluent Platform across all namespaces. If you want Confluent Operator to deploy Confluent Platform to a single namespace and reconcile only the objects in that namespace, enable a namespaced deployment.

With a namespaced deployment, the Operator service can run without requiring access to cluster scoped Kubernetes resources. The Operator service only manages the resources within the namespace it is deployed to.

To enable a namespaced deployment of Confluent Operator, set the following in your configuration file ($VALUES_FILE):

operator:
  namespaced: true

The previous step does not trigger Confluent Operator to automatically install the required cluster-level CustomResourceDefinitions (CRDs). You need to install the CRDs as a separate step. See Install Custom Resource Definitions (CRDs) for instructions.

Cluster permissions for namespaced deployment

To view the required cluster permissions for a namespaced deployment, run the following command.

helm template <release-name> \
  <path-to-chart> \
  --values <path-to-values-file> \
  --namespace <namespace> \
  --set operator.enabled=true \
  --set operator.namespaced=true \
  --show-only charts/operator/templates/clusterrole.yaml

For example:

helm template operator \
  ./confluent-operator/ \
  --values $VALUES_FILE \
  --namespace operator \
  --set operator.enabled=true \
  --set operator.namespaced=true \
  --show-only charts/operator/templates/clusterrole.yaml

Cluster-wide deployment

By default, Confluent Operator deploys Confluent Platform cluster-wide, across all namespaces. If you want Confluent Operator to manage Confluent Platform components across all namespaces, but you don’t want the user who installs Confluent Operator to need permissions to manage cluster-level resources, you can create the ClusterRole and ClusterRoleBindings needed by Confluent Operator.

The following options are available to use ClusterRoleBinding with Confluent Operator:

  • Confluent Operator Helm charts create the required ClusterRoles and ClusterRoleBindings during the Operator install.
  • The Kubernetes admin creates the ClusterRoles and ClusterRoleBindings, and the Confluent Platform admin then uses those when deploying Operator.

ClusterRoleBinding created by Confluent Operator Helm charts

To have Confluent Operator Helm charts create the cluster roles and cluster role bindings, set operator.installClusterResources and operator.namespaced as shown below in your Operator configuration file ($VALUES_FILE):

operator:
  installClusterResources: true
  namespaced: false

ClusterRoleBinding created by Kubernetes admin

To use the ClusterRoleBinding set up by Kubernetes admin:

  1. Your Kubernetes cluster admin sets up ClusterRoleBinding.

    Make sure that the roleRef in ClusterRoleBinding is the name of the ClusterRole.

  2. Set the following in your Operator configuration file ($VALUES_FILE):

    operator:
      installClusterResources: false
      namespaced: false
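
For step 1, a ClusterRoleBinding created by the cluster admin might look like the sketch below. The binding name, ClusterRole name, and service account name are hypothetical placeholders; the namespace follows the operator namespace assumed throughout this guide.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: confluent-operator        # hypothetical binding name
subjects:
  - kind: ServiceAccount
    name: operator-sa             # hypothetical service account used by Operator
    namespace: operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: confluent-operator        # hypothetical; must be the name of the ClusterRole
```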
    

Modify default component settings

The global configuration file ($VALUES_FILE) contains the global configuration parameters. Each component’s values.yaml file contains additional component-specific parameters that you can add to your configuration. The values.yaml files also contain detailed comments that describe each configuration parameter and how to use it. The table below lists each values.yaml file and its location under the Confluent Operator home directory.

Component        Chart name      values.yaml path
Operator         operator        helm/confluent-operator/charts/operator/values.yaml
Manager          manager         helm/confluent-operator/charts/manager/values.yaml
Kafka            kafka           helm/confluent-operator/charts/kafka/values.yaml
ZooKeeper        zookeeper       helm/confluent-operator/charts/zookeeper/values.yaml
Connect          connect         helm/confluent-operator/charts/connect/values.yaml
Schema Registry  schemaregistry  helm/confluent-operator/charts/schemaregistry/values.yaml
Control Center   controlcenter   helm/confluent-operator/charts/controlcenter/values.yaml
Replicator       replicator      helm/confluent-operator/charts/replicator/values.yaml
ksqlDB           ksql            helm/confluent-operator/charts/ksql/values.yaml

Important

Do not modify a component values.yaml file. When you need to use or modify a component configuration parameter, add it to or change it in the global configuration file ($VALUES_FILE). The global configuration file overrides the values.yaml files when you install and when you upgrade a component configuration.

Complete the following steps to make component configuration changes:

  1. Find the configuration parameter block in the values.yaml file that you want to use.

  2. Copy the configuration parameter into the correct location in the global configuration file ($VALUES_FILE) and make the required changes.

  3. Enter the following upgrade command:

    helm upgrade --install \
      --values $VALUES_FILE \
      --set <component>.enabled=true \
      <component> \
      ./confluent-operator
    

    For example, to change a Kafka configuration parameter, you enter the following upgrade command after saving your configuration changes in the $VALUES_FILE file.

    helm upgrade --install \
      --values $VALUES_FILE \
      --set kafka.enabled=true \
      kafka \
      ./confluent-operator
    

Container memory and CPU settings

The default parameters in the global configuration file ($VALUES_FILE) specify the pod resources needed. If you are testing Confluent Operator and Confluent Platform, your resource requirements may be lower than the default values shown. However, ZooKeeper and Kafka must be installed on individual pods on individual nodes.

Important

At least three Kafka brokers are required for a fully functioning Confluent Platform deployment. A one- or two-broker configuration is not supported and should not be used for development, testing, or production.

Confluent Operator can define pod resource requests and limits for all Confluent Platform components it deploys. You define these settings using the requests and limits keys under resources in each component’s section of the global configuration file ($VALUES_FILE).

The following example shows the default pod resource parameters in a global configuration file snippet for Kafka. See Managing Compute Resources for Containers for more details.

## Kafka Cluster
##
kafka:
  name: kafka
  replicas: 3
  resources:
    ## It is recommended to set both resource requests and limits.
    ## If not configured, kubernetes will set cpu/memory defaults.
    ## Reference: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
      cpu: 200m
      memory: 1Gi
    limits: {}
  loadBalancer:
    enabled: false
    domain: ""
  tls:
    enabled: false
    fullchain: |-
    privkey: |-
    cacerts: |-
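
The default above leaves limits empty. As a sketch, setting both requests and limits for Kafka in $VALUES_FILE might look like the following. The limit values are hypothetical and should be sized for your workload.

```yaml
kafka:
  resources:
    requests:
      cpu: 200m
      memory: 1Gi
    limits:
      cpu: "1"      # hypothetical upper bound of one CPU core
      memory: 2Gi   # hypothetical upper bound
```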

Kafka and ZooKeeper configuration overrides

You can override default Kafka and ZooKeeper configuration parameters using the following keys in your configuration file ($VALUES_FILE).

zookeeper:
  configOverrides:
    server: []
    jvm: []
    log4j: []
kafka:
  configOverrides:
    server: []
    jvm: []
    log4j: []

You can only change settings for configuration parameters that are key/value pairs. Configuration parameters with only a key can’t be modified using the override. Refer to the Confluent documentation for the configuration parameters used in Kafka and ZooKeeper.

The following example for Kafka changes:

  • The value for auto.create.topics.enable from false (default value) to true.
  • The log level from INFO to DEBUG.

kafka:
  configOverrides:
    server:
      - auto.create.topics.enable=true
    jvm: []
    log4j:
      - log4j.rootLogger=DEBUG, stdout

Then, run the upgrade command. For example, if you are adding this override for Kafka you enter the following command:

helm upgrade --install \
  --values $VALUES_FILE \
  --set kafka.enabled=true \
  kafka \
  ./confluent-operator

Schema validation

You use a configuration override in Kafka to set up schema validation. The following example shows an HTTP endpoint override for a Schema Registry release named schemaregistry in the operator namespace.

configOverrides:
  server:
    - confluent.schema.registry.url=http://schemaregistry.operator.svc.cluster.local:8081

If Schema Registry is deployed using a secure HTTPS endpoint, use the following configuration override:

configOverrides:
  server:
    - confluent.schema.registry.url=https://<domain>:8081

Or, using the same example names as above:

configOverrides:
  server:
    - confluent.schema.registry.url=https://schemaregistry.operator.svc.cluster.local:9081

Note

You can view the <domain> name by running helm status <release-name>.

See Schema Validation in the Schema Registry documentation for additional details.
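
The override snippets above omit their parent key. As a sketch, and because the override is a Kafka configuration override, the HTTP example would sit in the Kafka section of $VALUES_FILE as follows, using the same example release name and namespace.

```yaml
kafka:
  configOverrides:
    server:
      # Points the brokers at the in-cluster Schema Registry HTTP endpoint.
      - confluent.schema.registry.url=http://schemaregistry.operator.svc.cluster.local:8081
```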

Add a license key

You can use Confluent Operator, Kafka, and Confluent Control Center for a 30-day trial period without a license key. After 30 days, Operator, Kafka, and Control Center require license keys.

Note

You add the license key to your configuration file ($VALUES_FILE), if applicable. The YAML file is read when you run the helm upgrade command to apply the license.

Operator license

The sample YAML file helm/confluent-operator/charts/operator/values.yaml contains the license section below. Add this section to the Operator block in your configuration file ($VALUES_FILE).

## License Key for Operator
##
operator:
  licenseKey: ""

If you are adding your license key to a running deployment, run helm upgrade to activate the license. The example below assumes you are using your configuration file ($VALUES_FILE) to update your configuration.

helm upgrade --install <operator-release-name> \
  --values $VALUES_FILE \
  --set operator.enabled=true \
  ./confluent-operator

Kafka license

The sample YAML file helm/confluent-operator/charts/kafka/values.yaml contains the license section below. Add this section to the Kafka block in your configuration file ($VALUES_FILE).

## Kafka License information
##
kafka:
  license: ""

If you are adding your license key to a running deployment, run helm upgrade to activate the license. The example below assumes you are using your configuration file ($VALUES_FILE) to update your configuration.

helm upgrade --install <kafka-release-name> \
  --values $VALUES_FILE \
  --set kafka.enabled=true \
  ./confluent-operator

Confluent Control Center license

The sample YAML file helm/confluent-operator/charts/controlcenter/values.yaml contains the license section below. Add this section to the Control Center block in your configuration file ($VALUES_FILE).

## C3 License information
##
controlcenter:
  license: ""

If you are adding your license key to a running deployment, run helm upgrade to activate the license. The example below assumes you are using the configuration values file ($VALUES_FILE).

helm upgrade --install <controlcenter-release-name> \
  --values $VALUES_FILE \
  --set controlcenter.enabled=true \
  ./confluent-operator

Pod annotations

You can define custom annotations for Confluent Platform components deployed through Confluent Operator. Those annotations are applied to the Kubernetes pods created by the Operator.

In your configuration file ($VALUES_FILE), under each component, set podAnnotations.

podAnnotations must be a map with string keys and string values. The values cannot be any other types, such as numbers or booleans.

In general, annotation values must pass Kubernetes annotations validation. See the Kubernetes documentation on Annotations for details.

The following are example specifications of podAnnotations for several Confluent Platform components:

zookeeper:
  podAnnotations:
    string: "value"
    number: "1"
    boolean: "true"
    list: "[{\"labels\": {\"key\": \"value\"}},{\"key1\": \"value1\"}]"

kafka:
  podAnnotations:
    key1: "value1"
    key2: "value2"

controlcenter:
  podAnnotations:
    key1: "value3"
    key2: "value4"
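
One common use of pod annotations is to carry scrape hints for a monitoring system. The sketch below assumes a Prometheus setup that honors the conventional prometheus.io annotations; that convention and the port number are assumptions, not something Confluent Operator provides. Note that the boolean and the number are quoted so that they are passed as strings.

```yaml
schemaregistry:
  podAnnotations:
    prometheus.io/scrape: "true"   # quoted: the value must be a string, not a boolean
    prometheus.io/port: "7778"     # hypothetical port; quoted so it is not a number
```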