Configure Kafka Rack Awareness using Confluent for Kubernetes

The Rack Awareness feature in Kafka spreads replicas of the same partition across different racks to minimize data loss in the event of a rack failure.

When you enable rack awareness in Confluent for Kubernetes (CFK), Kubernetes Availability Zones (AZs) are treated as racks. CFK maps each Kafka rack to an AZ and places Kafka partition replicas across AZs.

For more information about Kafka rack awareness, see Balancing Replicas Across Racks.

For a configuration example, see Rack Awareness Tutorial.

Enable rack awareness

  1. Check the Kubernetes node labels.

    The Kubernetes cluster must have node labels set for the fault domain, and all nodes in the same zone must carry the same label value.

    Use the following command to check the node labels for topology.kubernetes.io/zone:

    kubectl get node \
      -o=custom-columns=NODE:.metadata.name,ZONE:.metadata.labels."topology\.kubernetes\.io/zone" \
      | sort -k2
    
  2. Configure your Kubernetes cluster with a service account. The service account must be bound to a clusterrole/role that grants get and list access to both the pods and nodes resources.

    An example manifest for a service account with the required cluster role and cluster role binding:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: kafka
      namespace: confluent
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: kafka-role
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes
      - pods
      verbs:
      - get
      - list
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kafka-binding
    subjects:
    - kind: ServiceAccount
      name: kafka
      namespace: confluent
    roleRef:
      kind: ClusterRole
      name: kafka-role
      apiGroup: rbac.authorization.k8s.io
    
  3. Enable rack awareness in the Kafka CR:

    spec:
      rackAssignment:
        nodeLabels:
        - topology.kubernetes.io/zone --- [1]
      podTemplate:
        serviceAccountName:           --- [2]
      oneReplicaPerNode: true         --- [3]
    
    • [1] Required.
    • [2] The name of the service account that you set up in the previous step with the required clusterrole/role.
    • [3] Required.
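
Before applying the Kafka CR, the prerequisites from steps 1 and 2 can be checked with a small script. The following is a minimal sketch, not part of CFK: the function names `unlabeled_nodes` and `sa_has_required_access` are hypothetical, and the service account `kafka` in the `confluent` namespace is taken from the example manifest above. It assumes that `kubectl` prints `<none>` for a missing custom-columns field and that `kubectl auth can-i --quiet` reports the answer through its exit code.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; the names are illustrative only.

# Reads the "NODE ZONE" table from step 1 on stdin and prints the name of
# every node whose zone column is empty or "<none>" (label not set).
# The header row is skipped wherever it ends up after sorting.
unlabeled_nodes() {
  awk 'NF > 0 && !($1 == "NODE" && $2 == "ZONE") &&
       ($2 == "<none>" || $2 == "") { print $1 }'
}

# Succeeds only if the service account may get and list both pods and
# nodes, which is the access that step 2 requires.
# Usage: sa_has_required_access <namespace> <service-account>
sa_has_required_access() {
  local namespace="$1" account="$2" verb resource status=0
  for verb in get list; do
    for resource in pods nodes; do
      if ! kubectl auth can-i --quiet "$verb" "$resource" \
          --as="system:serviceaccount:${namespace}:${account}" 2>/dev/null; then
        echo "missing permission: ${verb} ${resource}" >&2
        status=1
      fi
    done
  done
  return "$status"
}
```

For example, piping the output of the command from step 1 into `unlabeled_nodes` should print nothing when every node is labeled, and `sa_has_required_access confluent kafka` should exit with status 0 once the role binding is in place.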

Verify the rack awareness configuration

After you deploy CFK and Confluent Platform, you can verify that rack awareness is correctly configured.

  1. Check that the Kafka pods come up and are running:

    kubectl get pods -owide
    
  2. Check the node labels for topology.kubernetes.io/zone:

    kubectl get node \
      -o=custom-columns=NODE:.metadata.name,ZONE:.metadata.labels."topology\.kubernetes\.io/zone"
    
  3. Check that the broker.rack property in broker 0 is set to the correct AZ by cross-referencing the output of the previous steps:

    kubectl exec -it kafka-0 -- \
      grep 'broker.rack' /opt/confluentinc/etc/kafka/kafka.properties
    
  4. Verify that there is only one broker per node:

    kubectl get pod \
      -o=custom-columns=NODE:.spec.nodeName,NAME:.metadata.name \
      | grep kafka | sort
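
The cross-referencing in steps 3 and 4 can also be scripted. The following is a minimal sketch under stated assumptions, not an official CFK tool: the function names are hypothetical, broker pods are assumed to be named `kafka-<n>` as in the commands above, and with a single node label `broker.rack` is assumed to equal the node's `topology.kubernetes.io/zone` value. All commands run against the namespace of the current `kubectl` context.

```shell
#!/usr/bin/env bash
# Hypothetical verification helpers; the names are illustrative only.

# Compares broker.rack inside a broker pod with the zone label of the
# node the pod is scheduled on. Succeeds only if both are equal.
# Usage: broker_rack_matches_zone <pod-name>
broker_rack_matches_zone() {
  local pod="$1" node zone rack
  node="$(kubectl get pod "$pod" -o jsonpath='{.spec.nodeName}')" || return 1
  zone="$(kubectl get node "$node" \
    -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}')" || return 1
  # Keep everything after the first "=" as the property value.
  rack="$(kubectl exec "$pod" -- \
    grep '^broker.rack=' /opt/confluentinc/etc/kafka/kafka.properties \
    | cut -d= -f2-)"
  echo "pod=${pod} node=${node} zone=${zone} broker.rack=${rack}"
  [ -n "$zone" ] && [ "$rack" = "$zone" ]
}

# Reads the "NODE NAME" table from step 4 on stdin and prints every node
# that hosts more than one pod named kafka-<n>.
nodes_with_multiple_brokers() {
  awk '$2 ~ /^kafka-[0-9]+$/ { count[$1]++ }
       END { for (n in count) if (count[n] > 1) print n }' | sort
}
```

For example, `broker_rack_matches_zone kafka-0` should exit with status 0 when the rack matches the zone, and piping the output of the command from step 4 into `nodes_with_multiple_brokers` should print nothing when `oneReplicaPerNode` is honored.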