Deploy Confluent for Kubernetes

The Confluent for Kubernetes (CFK) bundle contains Helm charts, templates, and scripts for deploying Confluent Platform to your Kubernetes cluster.

You can deploy CFK using one of the following methods:

- Deploy CFK directly from Confluent’s Helm repo
- Deploy CFK by downloading the Helm bundle
- Deploy CFK using the OpenShift OperatorHub repo (valid only for OpenShift clusters)

You can install the optional Confluent plugin for interacting with CFK using one of the following methods:

- Install Confluent plugin from the CFK bundle
- Install Confluent plugin using Krew

Deploy CFK from Confluent’s Helm repo

Add a Helm repo:

helm repo add confluentinc https://packages.confluent.io/helm

helm repo update

Install CFK using the default configuration:
```
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --namespace <namespace>
```
Note
If you want to use CFK to deploy a KRaft-based cluster with the data recovery option, you need to specify the additional kRaftEnabled flag in the helm upgrade command. See Deploy CFK with KRaft data recovery option for the steps to deploy CFK with the data recovery feature.

Deploy CFK using the download bundle

Download the CFK bundle using the following command, and unpack the downloaded bundle:

curl -O https://packages.confluent.io/bundle/cfk/confluent-for-kubernetes-3.1.1.tar.gz

From the helm sub-directory of where you downloaded the CFK bundle, install CFK:

helm upgrade --install confluent-operator \
  ./confluent-for-kubernetes \
  --namespace <namespace>
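
The download-and-unpack step above does not show the unpack command. The following is a minimal sketch of that step, assuming the bundle is a gzip-compressed tar archive (as its .tar.gz name suggests). The CFK_HOME directory and the fetch_cfk_bundle function name are hypothetical; only the URL comes from this page.

```shell
# Sketch: download the CFK bundle and unpack it into a working directory.
# CFK_VERSION matches the bundle URL shown above.
# CFK_HOME is a hypothetical location; change it to suit your environment.
CFK_VERSION="3.1.1"
CFK_HOME="${CFK_HOME:-$HOME/cfk-bundle}"
BUNDLE="confluent-for-kubernetes-${CFK_VERSION}.tar.gz"
BUNDLE_URL="https://packages.confluent.io/bundle/cfk/${BUNDLE}"

# fetch_cfk_bundle: download, unpack, then list what was unpacked so that
# you can locate the helm sub-directory mentioned above.
fetch_cfk_bundle() {
  mkdir -p "$CFK_HOME" &&
    curl -fSL -o "$CFK_HOME/$BUNDLE" "$BUNDLE_URL" &&
    tar -xzf "$CFK_HOME/$BUNDLE" -C "$CFK_HOME" &&
    ls "$CFK_HOME"
}

# Nothing is downloaded until you call fetch_cfk_bundle.
echo "Bundle URL: $BUNDLE_URL"
```

Run fetch_cfk_bundle when you are ready to download, then change to the helm sub-directory before running the helm upgrade command.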

Deploy CFK using OpenShift OperatorHub

You can use the OpenShift OperatorHub web console to deploy CFK on OpenShift clusters as described in Adding Operators to a cluster.

The configuration options are limited when you install CFK in OperatorHub where the Subscription object deploys CFK. There are a few deployment settings which can be overridden permanently in the Subscription object, and most other properties of CFK can be changed by editing the ClusterServiceVersion. However, the changes to the ClusterServiceVersion are temporary and will be lost when a new version of CFK is available in Operator Lifecycle Manager (OLM).

Currently, when you install the CFK operator using OperatorHub, you can customize it only for namespaced deployment.

Other custom parameters, such as the license key, are not supported when using OperatorHub.

Refer to the following document for available configuration options in OperatorHub:

Configurations that can be modified by the Subscription object

Refer to the following document for how to configure available options in OperatorHub:

How to change the Operator resource values when under OLM management

Deploy customized CFK

You can customize the configuration of CFK when you install or update it. Using Helm, you can:

- Set configuration values in a YAML file which you pass to helm commands via the --values or -f flag.
- Set configuration values as key=value pairs directly in your helm commands via the --set, --set-string, and --set-file flags.
- Combine the above two approaches, where you may set some configuration in a values file and others on the command line with the --set flags.

Refer to Helm documentation for more details on the Helm flags.

To deploy customized Confluent for Kubernetes using the CFK values.yaml file:

Find the default values.yaml file:
- If you are using the Helm repo to deploy CFK, pull the CFK chart:
```
mkdir -p <CFK-home>

helm pull confluentinc/confluent-for-kubernetes \
  --untar \
  --untardir=<CFK-home> \
  --namespace <namespace>
```
  The values.yaml file is in the <CFK-home>/confluent-for-kubernetes directory.
- If you are using a download bundle to deploy CFK, the values.yaml file is in the helm/confluent-for-kubernetes directory under where you downloaded the bundle.
Create a copy of the values.yaml file to customize CFK configuration. Do not edit the default values.yaml file. Save your copy to any file location; we will refer to this location as <path-to-values-file>.
(Recommended) Preview the rendered Kubernetes manifests before installing CFK.
To debug and validate a deployment that uses a custom values file, run the helm upgrade command with the --dry-run flag. With --dry-run, the command does not apply any changes to the cluster; it only prints the Kubernetes manifests that would be applied.
```
helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --values <path-to-values-file> \
  --namespace <namespace> \
  --dry-run
```

Install CFK using the customized configuration:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --values <path-to-values-file> \
  --namespace <namespace>

To deploy customized Confluent for Kubernetes with the helm upgrade command:

Specify the configuration options using the --set flag:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set <setting1>=<value1> \
  --set <setting2>=<value2> \
  ...
  --namespace <namespace>
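
The two approaches can also be combined. The sketch below writes a small override file and prints (rather than runs) the matching helm command. It relies on standard Helm behavior: values passed with --values are merged over the chart defaults, and --set takes precedence over the values file. The file path and namespace are hypothetical, and the only keys used are ones that appear elsewhere on this page.

```shell
# Sketch: stable settings go in a values file, one-off settings in --set.
VALUES_FILE="${VALUES_FILE:-./cfk-values.yaml}"   # hypothetical path
NAMESPACE="${NAMESPACE:-confluent}"               # hypothetical namespace

# Write an override file containing only the settings that differ
# from the chart defaults.
cat > "$VALUES_FILE" <<'EOF'
namespaced: false
EOF

# Print the command for review; remove the leading "echo" to run it.
echo helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --values "$VALUES_FILE" \
  --set replicas=2 \
  --namespace "$NAMESPACE"
```

Keeping the override file in version control gives you a record of what differs from the defaults, while --set stays convenient for temporary changes.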

Configure CFK to manage Confluent Platform components across all namespaces

By default, CFK runs as a namespaced deployment: it manages Confluent Platform component clusters and resources only in the Kubernetes namespace where CFK itself is deployed.

To enable CFK to manage Confluent Platform resources across all namespaces (cluster-wide mode), set the namespaced configuration property to false in the install command:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set namespaced=false \
  --namespace <namespace>

Alternatively, you can update the values.yaml file as described above, and set the following property:

namespaced: false

Configure CFK to manage Confluent Platform components in different namespaces

In a namespaced deployment (namespaced: true), CFK, by default, only watches the same Kubernetes namespace that it is deployed in.

To enable CFK to manage Confluent Platform resources deployed in different namespaces, specify a list of namespaces for CFK to watch. The list must include the namespace where CFK is deployed as well as the namespaces of the Confluent Platform components.

You can specify the namespaces in the install command. For example:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set namespaceList="{confluent-namespace,cfk-namespace}" \
  --namespace cfk-namespace \
  --set namespaced=true

Alternatively, you can update the values.yaml file as described above, and set the following property:

namespaceList: [confluent-namespace,cfk-namespace]
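
Because the list must always include the namespace of CFK itself, it can help to build the value programmatically. The helper below is an illustrative sketch, not part of CFK: build_namespace_list is a hypothetical function that puts the CFK namespace first, skips duplicates, and prints the brace-delimited list syntax used with --set above.

```shell
# build_namespace_list CFK_NS [COMPONENT_NS...]
# Prints a Helm list literal such as {cfk-namespace,confluent-namespace}.
# The CFK namespace is always included; duplicates are skipped.
build_namespace_list() {
  cfk_ns="$1"
  shift
  list="$cfk_ns"
  for ns in "$@"; do
    case ",$list," in
      *",$ns,"*) ;;              # already in the list
      *) list="$list,$ns" ;;
    esac
  done
  printf '{%s}\n' "$list"
}

build_namespace_list cfk-namespace confluent-namespace cfk-namespace
```

The result can be passed to the install command as --set namespaceList="$(build_namespace_list cfk-namespace confluent-namespace)".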

Deploy CFK after separately installing CRDs

By default, when you deploy CFK via helm, the helm command also installs the Confluent Platform custom resource definitions (CRDs). However, as the user responsible for deploying CFK, you may not have the permission to install those CRDs, so your helm installation would fail.

The responsibility for installing CRDs may belong only to your Kubernetes cluster administrator. In this situation, your Kubernetes cluster admin must have already installed the required Confluent CRDs in advance as described in Strict permissions and restricted namespace access.

To instruct helm to skip trying to install the CRDs, add --skip-crds to the install command:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --skip-crds \
  --namespace <namespace>

Deploy CFK without creating roles and role bindings

By default, when you deploy CFK via helm, the helm command also creates the Kubernetes role and role binding (or cluster role and cluster role binding) that CFK needs to function. However, as the user responsible for deploying CFK, you may not have the ability to manage Kubernetes RBAC permissions, so your helm installation would fail.

The responsibility for managing Kubernetes RBAC permissions may belong only to your Kubernetes cluster administrator. In this situation, your Kubernetes cluster admin must have already created the requisite RBAC resources in advance (see Prepare Kubernetes Cluster for Confluent Platform and Confluent for Kubernetes).

To instruct helm to skip creating the RBAC resources, add --set rbac=false to the install command:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set rbac=false \
  --namespace <namespace>

Alternatively, you can update the values.yaml file as described above, and set the following property:

rbac: false

Deploy CFK with custom service account

To provide a custom service account to manage CFK, add --set serviceAccount.create=false --set serviceAccount.name=<service-account-name> to the install command:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set serviceAccount.create=false \
  --set serviceAccount.name=<service-account-name> \
  --namespace <namespace>

Alternatively, you can update the values.yaml file as described above, and set the following properties:

serviceAccount:
  create: false
  name: <service-account-name>

Note that if you use a custom service account and set rbac=false, meaning that the roles and role bindings were pre-created by your Kubernetes cluster admin, you must ensure that your <service-account-name> matches the subject name in the pre-created role binding.

Automount service account tokens for CFK

CFK requires access to the Kubernetes API to manage Confluent Platform resources. This access is provided through a ServiceAccount token that Kubernetes automatically mounts into pods.

CFK operator
The CFK operator must have automountServiceAccountToken set to true (which is the Kubernetes default). This applies whether you use the default service account or create a custom service account.
Confluent Platform components
Confluent Platform components deployed by CFK do not directly interact with the Kubernetes API server. For enhanced security, you can disable automountServiceAccountToken at the pod level for specific Confluent Platform components. For more details, see Opt out of API credential automounting.

Deploy CFK with cluster object deletion protection

Confluent for Kubernetes (CFK) provides validating admission webhooks for deletion events of the Confluent Platform clusters.

CFK webhooks are disabled by default in this release of CFK.

CFK provides the following webhooks:

Webhook to prevent component deletion when its persistent volume (PV) reclaim policy is set to Delete
This webhook (cfk-resources.webhooks.platform.confluent.io) blocks deletion requests on CRs with PVs in ReclaimPolicy: Delete. Without this prevention, a CR deletion will result in the deletion of those PVs and data loss.
This webhook only applies to the components that have persistent volumes, namely, ZooKeeper (Confluent Platform 7.9 or earlier only), Kafka, ksqlDB, and Control Center (Legacy).
In addition to blocking deletion as described above, this webhook blocks updates to prevent manual modifications during ZooKeeper to KRaft migration when the CR has the platform.confluent.io/kraft-migration-cr-lock: "true" annotation set.
This webhook does not block normal CR updates (outside of KRaft migration).
Webhook to prevent CFK StatefulSet deletion
The proper way to delete Confluent Platform resources is to delete the component custom resource (CR) as CFK watches those deletion events and properly cleans everything up. Deletion of StatefulSets can result in unintended PV deletion and data loss.
This webhook (core-resources.webhooks.platform.confluent.io) blocks delete requests on CFK StatefulSets.
Webhook to prevent unsafe Kafka pod deletion
This webhook (kafka-pods.webhooks.platform.confluent.io) blocks Kafka pod deletion when the removal of a broker will result in fewer in-sync replicas than configured in the min.insync.replicas Kafka property. Dropping below that value can result in data loss. Pod deletion can happen during Kubernetes maintenance without warning, such as during node replacement, and this webhook is an additional safeguard for your Kafka data.
Review the following when using this webhook:
- This webhook is only supported on clusters with fewer than 140,000 partitions.
- This webhook does not take the Kafka setting, minimum in-sync replicas (min.insync.replicas), into consideration.
  The minimum in-sync replicas setting on all topics is assumed to be 2 for Kafka with 3 or more replicas. Do not create topics with minimum in-sync replicas set to 1.
- To avoid having an internal ksqlDB topic with min in-sync replicas set to 1, set the ksqlDB internal topic replicas setting to 3 using configOverrides in the ksqlDB CR:
```
spec:
  configOverrides:
    server:
      - ksql.internal.topic.replicas=3
```
Webhook to prevent unsafe pod eviction
This webhook (evictions.webhooks.platform.confluent.io) follows the same logic as the pod deletion webhook described above. It prevents the creation of pod eviction objects, such as those requested when the node hosting the pod is drained.

Requirements

TLS certificates

Before you deploy CFK with webhooks enabled, you must provide TLS certificates to be used for secure communication between the webhook server and the Kubernetes API server:

Create signed TLS keys and certificates in the format as described in Provide TLS keys and certificates in PEM format or Provide TLS keys and certificates in Java KeyStore format.
The certificate must have a Subject Alternative Name (SAN) of the form confluent-operator.<namespace>.svc, which is the cluster-internal DNS name for the service in the namespace where the CFK pod is deployed.
Provide the above certificates to the CFK pod using one of the following:
- Add the certificates to a Kubernetes secret and put them in the namespace that the CFK pod is getting deployed in.
- Add the certificates in Vault. All certificates must be in the same directory.
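
The SAN requirement above is easy to get wrong, so it can be worth checking a certificate before you hand it to CFK. The sketch below is one way to do that, assuming a PEM-encoded certificate and the openssl CLI version 1.1.1 or later (for the -ext option). Both function names are hypothetical.

```shell
# webhook_san NAMESPACE
# Prints the SAN that the webhook certificate must carry for that namespace.
webhook_san() {
  printf 'confluent-operator.%s.svc\n' "$1"
}

# check_cert_san CERT_FILE NAMESPACE
# Succeeds only if the PEM certificate lists the expected DNS SAN exactly.
# Assumes openssl 1.1.1 or later, which provides the -ext option.
check_cert_san() {
  openssl x509 -in "$1" -noout -ext subjectAltName |
    tr ',' '\n' |
    sed 's/^[[:space:]]*//' |
    grep -qxF "DNS:$(webhook_san "$2")"
}

# Show the SAN expected for a namespace named "operator".
webhook_san operator
```

For example, check_cert_san fullchain.pem operator returns success when the certificate in fullchain.pem (a hypothetical file name) contains the DNS entry confluent-operator.operator.svc.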

Kubernetes metadata label

If you are on a Kubernetes version lower than 1.21, before you deploy CFK with webhooks enabled, add the kubernetes.io/metadata.name label to all namespaces in which the webhooks should validate requests.

This label is automatically set in Kubernetes 1.21 or later. For reference, see Automatic labeling.

Check labels on the namespace Confluent Platform is getting deployed in:

kubectl get namespace <namespace> --show-labels

An example output:

NAME       STATUS   AGE   LABELS
operator   Active   48d   <none>

Set the kubernetes.io/metadata.name label to the name of the namespace itself:

kubectl label namespace <namespace> kubernetes.io/metadata.name=<namespace>

Validate that the label is set:

kubectl get namespace <namespace> --show-labels

An example output:

NAME       STATUS   AGE   LABELS
operator   Active   48d   kubernetes.io/metadata.name=operator
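
When several namespaces need the label, the kubectl label command above can be wrapped in a loop. The following is a sketch only: label_namespaces and the KUBECTL override are hypothetical conveniences, and the --overwrite flag is added so that rerunning the loop does not fail on namespaces that already carry the label.

```shell
# label_namespaces NS...
# Sets kubernetes.io/metadata.name=<namespace> on each namespace given.
# Set KUBECTL=echo to preview the commands without contacting a cluster.
label_namespaces() {
  for ns in "$@"; do
    ${KUBECTL:-kubectl} label namespace "$ns" \
      "kubernetes.io/metadata.name=$ns" --overwrite
  done
}

# Preview only: prints the kubectl arguments instead of running kubectl.
KUBECTL=echo label_namespaces operator confluent
```

Drop the KUBECTL=echo prefix to apply the labels for real, then validate each namespace with kubectl get namespace <namespace> --show-labels.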

Enable webhooks

Use the Helm value to enable or disable the validating webhooks.

Update the values.yaml file as described above, and set the following properties:

webhooks:
  enabled: true                  # [1]
  tls:                           # [2]
    secretRef:
    directoryPathInContainer:

[1] Required to enable CFK webhooks.
[2] Specify either secretRef or directoryPathInContainer, pointing to the secret or directory that holds the certificates you provided in TLS certificates.

When enabling the webhook to prevent unsafe Kafka pod deletion for clusters with 100,000 or more partitions, increase the memory limit in values.yaml:

resources:
  limits:
    memory: 1024Mi
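
This page gives two partition thresholds for the Kafka pod deletion webhook: it is supported only below 140,000 partitions, and the memory limit should be raised at 100,000 or more. The sketch below encodes those two numbers in a small helper. The function name and its output strings are hypothetical, and you need to obtain the partition count of your cluster separately.

```shell
# kafka_pod_webhook_advice PARTITIONS
# Applies the thresholds stated on this page:
#   140,000 or more partitions -> webhook not supported
#   100,000 or more partitions -> supported, raise the CFK memory limit
#   fewer than 100,000         -> supported with default resources
kafka_pod_webhook_advice() {
  partitions="$1"
  if [ "$partitions" -ge 140000 ]; then
    echo "unsupported"
  elif [ "$partitions" -ge 100000 ]; then
    echo "supported: raise memory limit"
  else
    echo "supported"
  fi
}

kafka_pod_webhook_advice 120000
```

A check like this can run in a deployment pipeline before webhooks.enabled is set to true.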

Enable the webhook for RBAC-enabled Kafka

After you deploy CFK with the Kafka deletion webhook enabled, if you are deploying an RBAC-enabled Kafka, you may need to give the RBAC user read access to all Kafka topics if that user is not a super user.

In the Kafka custom resource (CR), if the user configured in spec.dependencies.kafkaRest.authentication.bearer.secretRef is not included in the spec.authorization.superUsers list, create a rolebinding CR for that user as in the following example:

apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
  name: clusterread
  namespace: confluent
spec:
  principal:
    type: user
    name: <principal>
  role: DeveloperRead
  resourcePatterns:
    - resourceType: Topic
      name: '*'
  kafkaRestClassRef:
    name: primary

Disable webhooks

After you deploy CFK with webhooks enabled, you can disable CFK webhooks at the namespace or component level by applying the following labels for a namespace or for a component CR.

Disable all CFK validation webhooks:

confluent-operator.webhooks.platform.confluent.io/disable: "true"

Disable the webhook that validates StatefulSet deletion:
```
confluent-operator.webhooks.platform.confluent.io/allow-statefulset-deletion: "true"
```
This label is only applied at the CR level.

Disable the webhook that validates CR deletion for PV reclaim policy:

confluent-operator.webhooks.platform.confluent.io/allow-pv-deletion: "true"

For example, to allow Kafka CR deletion when PVs are in the Delete mode, apply the following label:

kubectl -n operator label kafka kafka \
  confluent-operator.webhooks.platform.confluent.io/allow-pv-deletion="true"

Disable the webhook that validates Kafka pod deletion:

confluent-operator.webhooks.platform.confluent.io/allow-kafka-pod-deletion: "true"
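
The example above applies a label to a component CR. For the namespace level, a sketch along the following lines may help. It prints the kubectl commands rather than running them. The function names are hypothetical, and it assumes that removing the label (standard kubectl syntax: the key followed by a hyphen) restores webhook validation.

```shell
# Label key copied from the list above; the rest is illustrative.
DISABLE_LABEL="confluent-operator.webhooks.platform.confluent.io/disable"

# disable_webhooks_cmd NAMESPACE
# Prints the command that disables all CFK validation webhooks
# for one namespace.
disable_webhooks_cmd() {
  printf 'kubectl label namespace %s %s=true\n' "$1" "$DISABLE_LABEL"
}

# restore_webhooks_cmd NAMESPACE
# Prints the command that removes the label again. kubectl removes a
# label when its key is followed by "-".
restore_webhooks_cmd() {
  printf 'kubectl label namespace %s %s-\n' "$1" "$DISABLE_LABEL"
}

disable_webhooks_cmd operator
restore_webhooks_cmd operator
```

Printing the commands first makes it easy to review them before running them against a production cluster.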

Deploy CFK with KRaft data recovery option

The KRaft data recovery feature supports cluster recovery in case of accidental deletion of a cluster (using the CR). When the feature is enabled, CFK stores the KRaft cluster ID as an annotation in the persistent volume (corresponding to the persistent volume claim). The persistent volumes for this feature need to use the Retain reclaim policy.

Starting in CFK 2.8.0, this feature is optional and is disabled by default.

To enable the KRaft data recovery feature, deploy CFK with the --set kRaftEnabled=true flag in the helm upgrade command, as shown in the following example:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set kRaftEnabled=true \
  --namespace <namespace>

You cannot enable the data recovery feature in a namespace-scoped deployment because CFK needs to create the cluster-wide permissions on persistent volumes (a ClusterRole and ClusterRoleBinding) that the feature requires.

If the ClusterRole and ClusterRoleBinding are created out-of-band, the ClusterRole must include the following rule, which allows get, list, watch, and update on PersistentVolumes, to deploy KRaft controllers:

kind: ClusterRole
rules:
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  verbs:
  - get
  - list
  - watch
  - update

Deploy multiple replicas of CFK

When you deploy CFK with multiple replicas, one replica is active while the others are passive standbys. CFK logs are only generated by the active replica.

If the active replica fails, one of the standby replicas takes over and becomes active.

To deploy multiple replicas of CFK, in the CFK install command, set the replicas configuration property to the number of replicas you want.

For example, to deploy two CFK replicas:

helm upgrade --install confluent-operator \
  confluentinc/confluent-for-kubernetes \
  --set replicas=2 \
  --namespace <namespace>

Alternatively, you can update the values.yaml file as described in this section to set the following property:

replicas: <number of CFK replicas>

Deploy CFK with custom environment variables

You can add custom environment variables in values.yaml for Helm to apply during CFK installation.

For example, you can set HTTP_PROXY, HTTPS_PROXY, or NO_PROXY, and these environment variables are picked up by http.ProxyFromEnvironment set in the http.Transport.

The following snippet of a values.yaml file specifies the addresses of HTTP and HTTPS proxy servers:

customEnvVars:
  - name: HTTP_PROXY
    value: "http://proxy.example.com:3128"
  - name: HTTPS_PROXY
    value: "http://proxy.example.com:3128"
  - name: NO_PROXY
    value: "localhost,127.0.0.1,.newdomain.com,.svc.cluster.local"

Confluent plugin

The Confluent for Kubernetes (CFK) bundle contains a Confluent plugin for interacting with Confluent for Kubernetes. It is supported for three environments: Linux, Windows, and Darwin. See Confluent plugin for more information about the tool.

Install Confluent plugin

If you deployed CFK using the Helm repo, download and unpack the CFK bundle as described in the first step in Deploy CFK using the download bundle.
If you deployed CFK using the download bundle, skip this step.
If you are upgrading from an older version of the Confluent plugin, delete the old plugin from the directory where you previously installed it.
From the directory where you deployed CFK, unpack the kubectl plugin that matches your client environment (Linux, Windows, or macOS (Darwin)) into your client environment's local executables directory. On macOS and Linux, this is /usr/local/bin/. This allows the standard kubectl CLI to find the plugin.
1. Check the arm64 or amd64 architecture type of the machine by running the following command at the terminal:
```
arch
```
2. Using the architecture you retrieved in the previous step, amd64 or arm64, unpack the plugin based on the OS and its architecture type:
```
tar -xvf kubectl-plugin/kubectl-confluent-<environment>-<architecture>.tar.gz \
   -C <local directory for the plugin>
```
  For example, on macOS with the arm64 architecture:
```
tar -xvf kubectl-plugin/kubectl-confluent-darwin-arm64.tar.gz \
   -C /usr/local/bin/
```
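
The arch command does not always print amd64 or arm64 literally: Linux machines commonly report x86_64 or aarch64 instead. The helper below is a sketch that maps uname-style values to the archive name pattern shown above. The function name is hypothetical, the lowercase linux name is an assumption based on the darwin example, and Windows is not handled.

```shell
# plugin_archive OS MACHINE
# Maps values such as those printed by "uname -s" and "uname -m" to the
# kubectl-confluent-<environment>-<architecture>.tar.gz archive name.
plugin_archive() {
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$os" in
    linux|darwin) ;;
    *) echo "unsupported OS for this sketch: $1" >&2; return 1 ;;
  esac
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    aarch64|arm64) arch=arm64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  printf 'kubectl-confluent-%s-%s.tar.gz\n' "$os" "$arch"
}

plugin_archive Darwin arm64
```

On your own machine, call plugin_archive "$(uname -s)" "$(uname -m)" and pass the result to the tar command shown above.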

Install Confluent plugin using Krew

Krew is the plugin manager for kubectl. Take the following steps if you want to use Krew to install the Confluent plugin.

Install Krew as described in Krew User Guide.
If you deployed CFK using the Helm repo, download and unpack the CFK bundle as described in the first step in Deploy CFK using the download bundle.
If you deployed CFK using the download bundle, skip this step.
Go to the kubectl-plugin sub-directory under the directory where you unpacked the CFK bundle.
If you are upgrading from an older version of the Confluent plugin, delete the old plugin:
```
kubectl krew uninstall confluent
```
Install the Confluent plugin:
1. Check the arm64 or amd64 architecture type of the machine by running the following command at the terminal:
```
arch
```
2. Using the architecture you retrieved in the previous step, amd64 or arm64, install the plugin based on the OS and its architecture type:
```
kubectl krew install \
  --manifest=confluent-platform.yaml \
  --archive=kubectl-confluent-<environment>-<architecture>.tar.gz
```
For example, to install the plugin on macOS with the arm64 architecture:
```
kubectl krew install \
  --manifest=confluent-platform.yaml \
  --archive=kubectl-confluent-darwin-arm64.tar.gz
```