Migrate to Confluent for Kubernetes

This topic describes how to migrate from Confluent Operator 1.x to Confluent for Kubernetes (CFK) 2.2.4 to manage your Confluent Platform deployment.

Migration is supported from the following software versions:

  • Confluent Operator 1.6.x or 1.7.x
  • Confluent Platform 6.0.x or 6.1.x

None of the migration options presented below makes destructive changes to the Confluent Platform deployment. No Confluent component pods, PersistentVolumeClaims, or services are deleted during the migration process.

The following are the recommended migration options based on your current deployment state, referred to as the Starting Point in the table.

Starting Point                                                          | Recommended Migration Approach
No out-of-band changes [1] were made to Operator 1 Helm Chart templates | Automated in-place migration [2]
Out-of-band changes were made to Operator 1 Helm Chart templates        | Manual in-place migration
The legacy storage class created by Operator 1 Helm Chart is used       | Deploy new cluster and replicate data

  • [1] Out-of-band changes are changes you made to the Operator 1 deployment outside of the values.yaml configuration file. Configuration changes made in the values.yaml file are not out-of-band changes.
  • [2] In the following configuration scenarios, an automated migration is not possible even without out-of-band changes. These scenarios require a manual migration; contact Confluent Support for guidance.
    • RBAC is enabled and internalTLS is set to false
    • RBAC is enabled, internalTLS is set to true, and interbrokerTLS is set to false
    • External access is configured using the Node Port service
    • Simple client authorization is enabled
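
The blocking scenarios in footnote [2] can be read as a small decision rule. The following is a minimal sketch of that rule as a shell function; the function name and its five true/false arguments are illustrative only and do not correspond to any Confluent tool or setting name beyond those mentioned in the footnote.

```shell
# Return 0 when none of the blocking scenarios from footnote [2] applies,
# and 1 otherwise. Each argument is the literal string "true" or "false".
# The function and argument names are hypothetical.
automated_migration_possible() {
  rbac=$1 internal_tls=$2 interbroker_tls=$3 nodeport=$4 simple_authz=$5

  # RBAC is enabled and internalTLS is false
  if [ "$rbac" = true ] && [ "$internal_tls" = false ]; then return 1; fi
  # RBAC is enabled, internalTLS is true, and interbrokerTLS is false
  if [ "$rbac" = true ] && [ "$internal_tls" = true ] \
     && [ "$interbroker_tls" = false ]; then return 1; fi
  # External access uses the Node Port service
  if [ "$nodeport" = true ]; then return 1; fi
  # Simple client authorization is enabled
  if [ "$simple_authz" = true ]; then return 1; fi
  return 0
}
```

A zero exit status only means that none of the listed scenarios applies; the starting-point table still determines whether out-of-band changes or the legacy storage class rule out the automated path.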

Prerequisites

  • Validate that the reclaim policy of all PVs and storage classes is set to Retain.

    All your PVs and storage classes must have their reclaim policy set to Retain before you start the migration process.

    1. Check the reclaim policy of PVs using the following command:

      kubectl get pv
      
    2. For each PV that has the Delete reclaim policy, set the reclaim policy to Retain using the following command:

      kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
      
    3. Check and ensure that the reclaim policy of the storage classes is set to Retain.

  • Ensure the correct hostname in the TLS certificate for internal access.

    If Connect, ksqlDB, or Schema Registry has TLS enabled, ensure that the Subject Alternative Name (SAN) of the TLS certificate contains the correct hostnames for inter-component communication over the internal Kubernetes network. This usually maps to the pod DNS name, for example, *.<component-name>.<namespace>.svc.cluster.local.
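
For the reclaim policy prerequisite, the manual check can be scripted when there are many PVs. The following is a rough sketch, assuming kubectl access to the cluster; the helper function names are made up for this example, while the kubectl flags and the patch body are the same ones used in the steps above.

```shell
# Keep only rows whose second column is not "Retain" and print the name.
# Input: "NAME POLICY" rows on stdin.
not_retained() {
  awk '$2 != "Retain" { print $1 }'
}

# List every PV with its reclaim policy, one "NAME POLICY" row per line.
list_pv_policies() {
  kubectl get pv --no-headers \
    -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy
}

# Same listing for storage classes (reclaimPolicy is a top-level field there).
list_sc_policies() {
  kubectl get storageclass --no-headers \
    -o custom-columns=NAME:.metadata.name,POLICY:.reclaimPolicy
}

# Patch every PV that is not yet set to Retain.
retain_all_pvs() {
  list_pv_policies | not_retained | while read -r pv; do
    kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  done
}
```

Running `list_sc_policies | not_retained` prints the storage classes that still need attention, and `retain_all_pvs` applies the patch from step 2 to each remaining PV.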

Migration job CR

Confluent for Kubernetes provides the MigrationJob custom resource definition (CRD) that allows you to specify your current deployment state declaratively. You can trigger a migration by creating a MigrationJob custom resource (CR).

You can access the current state of the migration through the status sub-resource of the MigrationJob CR.

An example for the Kafka section:

spec:
  initContainerImage: confluentinc/confluent-init-container:2.2.4 --- [1]
  kafka: #component section
    enabled: false                  --- [2]
    name: kafka                     --- [3]
    namespace: confluent            --- [4]
    release: kafka                  --- [5]
  • [1] Set to version confluentinc/confluent-init-container:2.2.4 for all migrations.

  • [2] Required. Set to true to migrate the component with the MigrationJob CR.

  • [3] Required. Set to the name of the Operator 1.x-deployed Helm Chart component. See where to find it in your Operator 1.x Helm values.yaml:

    ## Kafka Cluster
    ##
    kafka:
      name: <this-component-name>
    
  • [4] Required. Namespace to find the current component, and where the migrated resource will be created.

  • [5] Required. Set to the Helm release name specified with the following command when you deployed Operator 1.x:

    helm install <release-name> -f myvalues.yaml ./chart
    

For configuring migration jobs, see example migration job CRs for Confluent components.

Automated in-place migration

The automated in-place migration follows these steps:

  1. Deploy CFK 2.2.4 alongside Confluent Operator 1.x.

  2. Migrate Confluent Platform components one component at a time. See below for the order to migrate.

    Important

    For each component, once the migration job is kicked off, the migration cannot be rolled back.

    1. Execute the migration job for the component.
    2. Monitor the migration job progress.
    3. Validate that the migration job finished.
  3. Once all components are migrated, remove Confluent Operator 1.x.

  4. Check the current version of Confluent Platform for compatibility with CFK 2.2.4 and upgrade Confluent Platform if needed.

Migrate the components in the following order:

  1. Migrate ZooKeeper
  2. Migrate Kafka
  3. Migrate Schema Registry
  4. Migrate ksqlDB
  5. Migrate Connect
  6. Migrate Replicator
  7. Migrate Confluent Control Center

Step 1. Deploy CFK 2.2.4 alongside Confluent Operator 1.x

For the complete deployment steps, refer to Deploy Confluent for Kubernetes.

The following are the simple, standard steps to deploy CFK:

  1. Add the Helm repo:

    helm repo add confluentinc https://packages.confluent.io/helm \
      --namespace <namespace>
    
    helm repo update --namespace <namespace>
    
  2. In the same namespace as you have Confluent Operator 1.x, deploy CFK 2.2.4. Use the --version option to specify the CFK version that you are migrating to. For example, to migrate to CFK 2.2.4:

    helm upgrade --install confluent-for-kubernetes \
      confluentinc/confluent-for-kubernetes \
      --set licenseKey=<CFK license key> \
      --version 0.304.72 \
      --namespace <namespace>
    
  3. Verify that both are deployed and running:

    kubectl get pods
    
    NAME                                     READY   STATUS    RESTARTS   AGE
    cc-operator-7dbc8fd598-bh8r7             1/1     Running   0          2d19h
    confluent-operator-5b99cdd9d9-bzmqr      1/1     Running   0          20h
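
The visual check in step 3 can also be scripted. The sketch below assumes the two pod name prefixes that appear in the sample output above (cc-operator and confluent-operator); the function name is hypothetical, and the prefixes in your deployment may differ.

```shell
# Succeed only if stdin (rows of `kubectl get pods --no-headers`) contains a
# pod in the Running state for each name prefix passed as an argument.
operators_running() {
  pods=$(cat)
  for prefix in "$@"; do
    printf '%s\n' "$pods" | awk -v p="$prefix" \
      'index($1, p) == 1 && $3 == "Running" { found = 1 } END { exit !found }' \
      || return 1
  done
}

# Example call against a live cluster (NAMESPACE is a placeholder variable):
#   kubectl get pods -n "$NAMESPACE" --no-headers \
#     | operators_running cc-operator confluent-operator
```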
    

Step 2. Execute the migration job one component at a time

Repeat the following steps for each component, following the migration order listed earlier in this topic.

  1. Configure the migration CR as described in Migration job CR.

    An example to migrate ZooKeeper:

    spec:
      initContainerImage: confluentinc/confluent-init-container:2.2.4
      zookeeper:
        enabled: true
        name: zookeeper
        namespace: confluent
        release: zookeeper
      kafka:
        enabled: false
    
  2. Apply the migration CR. This will start the migration job for the component.

    After you apply the migration CR, you can no longer edit the migration CR.
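
Because the CR cannot be edited once it is applied, it can help to render it to a file and review it first. The following sketch generates a complete MigrationJob manifest for a single component. The apiVersion, kind, and metadata shown here are assumptions, since this topic only shows the spec section; confirm them against the example migration job CRs before use. The resource name migration is chosen to match the kubectl get migration migration command used for monitoring. The function name and the output file name are hypothetical.

```shell
# Print a MigrationJob CR that enables migration for one component.
#   $1 = component key under spec (this topic shows zookeeper and kafka)
#   $2 = component name from the Operator 1.x values.yaml
#   $3 = namespace
#   $4 = Helm release name of the component
# ASSUMPTION: apiVersion, kind, and metadata are not shown in this topic.
render_migration_cr() {
  cat <<EOF
apiVersion: platform.confluent.io/v1beta1
kind: MigrationJob
metadata:
  name: migration
  namespace: $3
spec:
  initContainerImage: confluentinc/confluent-init-container:2.2.4
  $1:
    enabled: true
    name: $2
    namespace: $3
    release: $4
EOF
}

# Example: write the ZooKeeper CR to a file, review it, then apply it.
#   render_migration_cr zookeeper zookeeper confluent zookeeper \
#     > migration-zookeeper.yaml
#   kubectl apply -f migration-zookeeper.yaml
```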

Step 3. Monitor the migration job execution progress

The migration job for a component executes the following steps:

  1. Create the CFK 2.2.4 CustomResource and corresponding Kubernetes objects.
  2. Roll the component pods to start being managed through the CFK 2.2.4 CR.
  3. Remove the Operator 1.x resources.

During migration, check the progress and status with the following commands. Replace <component> with one of ZooKeeper, Kafka, Schema Registry, Confluent Control Center, ksqlDB, or Connect. Replace <namespace> with the namespace where Operator 1.x is deployed.

# Check the status of current migration job
kubectl get migration migration -oyaml -n <namespace> -w

# Check that the migration job creates the CFK 2.2.4 CustomResource
kubectl get <component> -n <namespace>

# Check that the component pods are rolling
kubectl get pods -n <namespace>
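
While the component pods roll, it can be easier to watch only the pods that are not ready yet. The following is a small sketch of a filter for the default kubectl get pods output; the function name is hypothetical, and pods that finished as Completed would also be listed by this filter.

```shell
# Print the names of pods whose READY column (for example "0/1") shows
# fewer ready containers than expected, or whose STATUS is not Running.
# Input: rows of `kubectl get pods --no-headers` on stdin.
pods_not_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2] || $3 != "Running") print $1 }'
}

# Example call against a live cluster (NAMESPACE is a placeholder variable):
#   kubectl get pods -n "$NAMESPACE" --no-headers | pods_not_ready
```

An empty result suggests that every pod in the namespace is running with all of its containers ready.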

Step 4. Validate that the migration job finished

Verify that the migration job successfully finished:

# Validate that migrated components do not have a release
helm list

# Validate that migrated components do not have a PSC - an Operator 1.x object
kubectl get psc -n <namespace>

# Validate that all secrets have the v2 prefix
kubectl get secrets -n <namespace>
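
The first of these checks can be automated per component. The following is a minimal sketch, assuming helm access to the namespace; both function names are hypothetical. It uses the -q (short) output of helm list, which prints one release name per line.

```shell
# Succeed if the exact name $1 appears as a whole line on stdin.
name_listed() {
  grep -Fxq -- "$1"
}

# Fail with a message if the Helm release of a migrated component still exists.
#   $1 = namespace, $2 = Helm release name of the migrated component
release_gone() {
  if helm list -n "$1" -q | name_listed "$2"; then
    echo "Helm release $2 still exists in namespace $1" >&2
    return 1
  fi
}

# Example call (namespace and release name are placeholders):
#   release_gone confluent zookeeper && echo "zookeeper release removed"
```

The PSC and secret checks are left as manual reviews here, because components that have not been migrated yet still own Operator 1.x objects in the same namespace.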

Step 5. Set the Prometheus JMX exporter rule

In all the Confluent Platform custom resources (CRs), set the Prometheus JMX exporter value factor rule to 1, and apply the changes using the kubectl apply command:

spec:
  metrics:
    prometheus:
      rules:
        - valueFactor: 1

Step 6. Remove Confluent Operator 1.x

When all components are successfully migrated, your Confluent Platform deployment should be in a healthy state.

CFK 2.2.4 uses the same image pull secret that Operator 1.x uses. If you use a custom Docker registry, ensure that you can re-apply this secret:

  1. Copy the Docker registry secret to a file.

    kubectl get secret <your docker registry secret> -n <namespace> -oyaml \
       > confluent-docker-registry-secret.yaml
    
  2. Edit the secret file, confluent-docker-registry-secret.yaml, and remove everything in the metadata section except the name and namespace.

Once you’ve validated that all components have been migrated, remove Operator 1.x and re-apply the secret:

  1. Uninstall Operator 1.x:

    helm uninstall <operator 1.x helm release> -n <namespace>
    
  2. Apply the Docker registry secret:

    kubectl apply -f confluent-docker-registry-secret.yaml -n <namespace>
    

Step 7. Upgrade Confluent Platform to the supported version

The migration process does not upgrade the version of Confluent Platform. Check Supported Versions to see if the version of Confluent Platform is compatible with CFK 2.2.4.

If the Confluent Platform version is not compatible, follow the steps in the Upgrade Guide to upgrade Confluent Platform to a version supported by CFK 2.2.4.

Manual in-place migration

If you have made changes to the Confluent Operator 1.x Helm Chart templates, contact Confluent Support to review your migration approach with the Confluent engineering team.