Fleet Management in Confluent for Kubernetes Blueprints

In Confluent for Kubernetes (CFK) Blueprints, fleet management is the centralized management of cluster functions, such as security, configuration, and monitoring, across a set of Confluent Platform clusters. This topic describes how fleet management works in CFK Blueprints.

When your ConfluentPlatformBlueprint or ClusterClass resource changes, CFK Blueprints automatically detects the change. Whether the cluster resources then roll automatically depends on whether the existing clusters are in a running state:

  • If the existing cluster is not in a running state, the changes are applied automatically and no user intervention is required.

  • If the existing cluster is in a running state, you must manually apply the force-roll annotation to the cluster.

    Changes in the Confluent component classes, such as KafkaClusterClass or ConnectClusterClass, do not require user intervention because CFK Blueprints internally signals all the clusters to reconcile. However, the ConfluentPlatformBlueprint CR has other configurations that may require a manual roll when changed.

    Each cluster has a condition that indicates whether the cluster has drifted from the current Blueprint or class. For example:

    lastTransitionTime: "2023-03-16T15:19:04Z"
    lastUpdateTime: "2023-03-16T21:17:26Z"
    message: no blueprint or class generation mismatch
    reason: ReconcileSkipped
    status: "True"
    type: cpc.platform.confluent.io/skip-reconcile
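
As a rough sketch, the message of the condition above could be read with kubectl and a JSONPath filter. The function name drift_message is hypothetical, and the sketch assumes that the condition is stored under .status.conditions of the cluster CR, which this topic does not state explicitly.

```shell
# Hypothetical helper: print the drift-condition message for one cluster CR.
# Assumption: the condition shown above lives under .status.conditions.
# Arguments: <CR type> <CR name> <namespace>
drift_message() {
  kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{.status.conditions[?(@.type=="cpc.platform.confluent.io/skip-reconcile")].message}'
}
```

For example, drift_message kafkacluster <CR name> <namespace> would print the message field if the assumption about the status layout holds.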
    

The following describes how resource changes affect existing clusters managed by CFK Blueprints.

CredentialStoreConfig resource
If the spec.kubernetes.autoRoll field is set to false in the CredentialStoreConfig CR, user intervention is required. You need to apply the force-roll annotation to the cluster deployment CR that uses the CredentialStoreConfig CR.
CertificateStoreConfig resource

Force roll

To force-roll a Confluent Platform cluster, apply the force-roll annotation to the cluster as follows:

kubectl annotate <CR type> <CR name> -n <namespace> \
  cpc.platform.confluent.io/force-roll="true" \
  --overwrite

<CR type> is one of kafkacluster, zookeepercluster, controlcentercluster, ksqldcluster, schemaregistrycluster, connectcluster, or kafkarestproxycluster.
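
The command above can be wrapped in a small shell function that rejects any CR type outside the list before calling kubectl. This is only an illustrative sketch; the function name force_roll is hypothetical and not part of CFK Blueprints.

```shell
# Hypothetical helper: validate the CR type, then apply the force-roll annotation.
# Arguments: <CR type> <CR name> <namespace>
force_roll() {
  case "$1" in
    kafkacluster|zookeepercluster|controlcentercluster|ksqldcluster|schemaregistrycluster|connectcluster|kafkarestproxycluster)
      ;;
    *)
      # Reject anything that is not one of the CR types listed above.
      echo "unsupported CR type: $1" >&2
      return 1
      ;;
  esac
  kubectl annotate "$1" "$2" -n "$3" \
    cpc.platform.confluent.io/force-roll="true" \
    --overwrite
}
```

Calling force_roll kafkacluster <CR name> <namespace> runs the same kubectl annotate command shown above, while a mistyped CR type fails before anything is sent to the cluster.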

Manual reconcile

You can add the force-reconcile annotation to force the control loop to generate the desired state. This feature force-updates current internal resources managed by CFK Blueprints to align with the desired state. This annotation works for all the CFK Blueprints CRs.

kubectl annotate <CFK Blueprint CR type> <CFK Blueprint CR name> \
  cpc.platform.confluent.io/force-reconcile=true --overwrite

Block reconcile

You can add the block-reconcile annotation to block changes on the cluster. This is especially helpful when debugging. Any changes to the cluster resource are ignored, and the proper condition is added to the cluster status.

kubectl annotate <CFK Blueprint CR type> <CFK Blueprint CR name> \
  cpc.platform.confluent.io/block-reconcile=true --overwrite
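
A debugging session typically needs both directions: blocking and later unblocking. The sketch below pairs the command above with kubectl's standard syntax for removing an annotation (the annotation key followed by a trailing dash). The function names are hypothetical, and it is an assumption that removing the annotation is how CFK Blueprints resumes reconciliation; this topic does not describe the unblock step.

```shell
# Hypothetical helpers for a debugging session.
# Arguments for both: <CFK Blueprint CR type> <CFK Blueprint CR name>

# Block reconciliation, using the same annotation as the command above.
block_reconcile() {
  kubectl annotate "$1" "$2" \
    cpc.platform.confluent.io/block-reconcile=true --overwrite
}

# Remove the annotation again; a trailing "-" after the key is kubectl's
# syntax for deleting an annotation.
# Assumption: CFK Blueprints resumes reconciling once the annotation is gone.
unblock_reconcile() {
  kubectl annotate "$1" "$2" \
    cpc.platform.confluent.io/block-reconcile-
}
```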

Known gaps

  • Because there is no dependency management across Confluent Platform components, some clusters may enter a crash loop for a transition period while the Confluent Platform components they depend on are updating.

    If automation doesn’t work for some reason, you can apply the force-roll annotation to trigger an update.

    kubectl annotate <CR type> <CR name> -n <namespace> \
      cpc.platform.confluent.io/force-roll="true" \
      --overwrite
    
  • Certificate renewal for auto-generated CA is not supported.

    Provide your own CA, or let CFK Blueprints manage the renewal duration, which defaults to 5 years.