ZooKeeper to KRaft Migration
This topic describes how to migrate your Confluent Platform deployment from ZooKeeper to KRaft by using Confluent for Kubernetes. The migration supports Confluent Platform version 7.6 and later.
Tip
Review the complete end-to-end examples in the GitHub: CFK Examples for KRaft Migration.
The repository includes examples for non-secured clusters, RBAC-enabled clusters, and multi-region clusters (MRC), with complete YAML files, commands, expected outputs, and troubleshooting guidance.
See also
To deploy a new multi-region KRaft cluster with dynamic quorum, see Deploy a multi-region cluster with dynamic quorum.
Before you begin
Ensure you meet the following requirements to start the migration process:
Confluent Platform 7.6 or later
# Check CP version
kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.status.version}'
Expected: 7.6.0 or later.
CFK 2.8.4 or later, 2.9.2 or later, 2.10.x, 2.11.x, or 3.0.x or later
# Check CFK version
helm list -n confluent
Expected: confluent-operator version 2.8.4 or later, 2.9.2 or later, 2.10.x, 2.11.x, or 3.0.x or later.
For CFK 3.0 or later, add the platform.confluent.io/use-log4j1: "true" annotation to KRaftController during migration. Add this annotation when creating the KRaftController CR in Step 2.2.
Verify Kubernetes webhooks are enabled:
kubectl get validatingwebhookconfigurations | grep confluent
Expected: Webhook configurations like confluent-operator-validating-webhook-configuration.
Note
The Confluent kubectl plugin is available in CFK 3.2.1 and later. This is recommended for guided, interactive migration. The plugin validates migration state before executing each action and provides step-by-step guidance. For installation instructions, see Install the Confluent kubectl plugin.
To verify the plugin is installed:
kubectl confluent cluster kraft-migration --help
The plugin uses the following syntax, and the alias is kmj:
kubectl confluent cluster kraft-migration <subcommand> [flags]
For available subcommands, see KRaft Migration Plugin Commands.
If you are migrating a deployment that does not have webhooks or VAP (ValidatingAdmissionPolicy) enabled, make sure no other actor is updating or deleting ZooKeeper, Kafka, and KRaft resources while migration is in progress. Other actors include continuous integration and continuous delivery (CI/CD) tools such as GitOps or FluxCD.
Enforce CR locks during KRaft migration
During KRaft migration, CFK locks the Kafka, ZooKeeper, and KRaftController CRs by adding the annotation platform.confluent.io/kraft-migration-cr-lock=true. This prevents accidental modifications or deletions that could disrupt the migration process. Only the CFK operator service account can modify locked CRs.
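For reference, a locked CR carries the lock annotation in its metadata. The following fragment is illustrative (names are examples only):

```yaml
# Illustrative fragment: what a Kafka CR's metadata looks like while locked.
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
  annotations:
    platform.confluent.io/kraft-migration-cr-lock: "true"
```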
Note
ValidatingAdmissionPolicy (VAP) support for CR lock enforcement is added in CFK 3.2.1. Earlier versions use webhook-only enforcement.
CFK supports two enforcement mechanisms:
Mechanism | When it applies |
|---|---|
ValidatingAdmissionPolicy (VAP) | Clusters where the ValidatingAdmissionPolicy API is available and vapPolicies.enabled is true (the default) |
Webhook | Clusters where the ValidatingAdmissionPolicy API is not available, or vapPolicies.enabled is false |
CFK auto-detects VAP support at Helm installation or upgrade. If VAP is available and vapPolicies.enabled is true, which is the default setting, VAP handles CR lock enforcement, and the webhook skips CR lock checks to avoid dual enforcement. If the API is not available, the webhook handles CR lock enforcement regardless of the vapPolicies.enabled setting.
When a CR is locked during migration, CFK denies the following operations for all users except the CFK operator service account:
UPDATE on Kafka, KRaftController, and Zookeeper CRs
DELETE on Kafka, KRaftController, and Zookeeper CRs
When you attempt a blocked operation, the system returns an error message.
You can disable VAP by setting the following value during Helm install or upgrade:
vapPolicies:
enabled: false
Bypass CR locks for emergency changes
If you need to make an emergency change to a locked CR during migration, add the bypass annotation to the CR:
Warning
Bypassing the CR lock allows any user to modify the CR. Use this only for emergency scenarios. Incorrect modifications during migration can leave the cluster in an inconsistent state.
kubectl annotate <kind> <name> -n <namespace> \
platform.confluent.io/allow-request-during-kraft-migration=true
For example:
kubectl annotate kafka kafka -n confluent \
platform.confluent.io/allow-request-during-kraft-migration=true
After the emergency change is complete, remove the bypass annotation:
kubectl annotate <kind> <name> -n <namespace> \
platform.confluent.io/allow-request-during-kraft-migration-
Step 1: Derive Kafka IBP version
CFK automatically derives the inter-broker protocol (IBP) version from standard Confluent images. For example, confluentinc/cp-server:7.6.0 uses IBP 3.6.
For custom images, you must manually specify the IBP version with an annotation; an incorrect IBP version causes the migration to fail. Do not set IBP in configOverrides because the migration process manages this setting automatically.
Confluent Platform version | IBP version |
|---|---|
7.9.x | 3.9 |
7.8.x | 3.8 |
7.7.x | 3.7 |
7.6.x | 3.6 |
7.5.x | 3.5 |
7.4.x | 3.4 |
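The table follows a simple pattern for Confluent Platform 7.4 through 7.9: the IBP minor version tracks the CP minor version. The helper below is a sketch of that mapping (the function name is illustrative, not part of any CLI):

```shell
# Illustrative helper: derive the IBP value for a CP 7.4-7.9 version string,
# where the IBP minor version tracks the CP minor version (7.6.x -> 3.6).
cp_to_ibp() {
  minor="${1#7.}"        # strip the "7." major prefix
  minor="${minor%%.*}"   # keep only the minor component
  echo "3.${minor}"
}

cp_to_ibp 7.6.0   # prints 3.6
cp_to_ibp 7.9.1   # prints 3.9
```

Always confirm the result against the table above before annotating the Kafka CR.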
Step 1.1: Check your Kafka image type
kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.spec.image.application}'
For standard Confluent images, skip to Step 2.
For custom images, continue to the next step.
Step 1.2: Apply IBP annotation for custom images
Apply the IBP annotation matching your Confluent Platform version from the table above:
kubectl annotate kafka <kafka-name> \
platform.confluent.io/kraft-migration-ibp-version="<your-ibp-version>" \
-n <namespace>
The annotation is used by the migration job in Step 3.
Step 2: Create KRaftController CR
For MRC deployments, create and apply the KRaftController CR in each region. The migration does not automatically copy configurations from Kafka to KRaftController. You must explicitly configure KRaftController to match your existing Kafka setup.
Step 2.1: Export current Kafka CR for reference
Use your existing Kafka CR as a template. Most configurations are identical, such as TLS, authentication, authorization, RBAC, and custom JVM settings.
kubectl get kafka <kafka-name> -n <namespace> -o yaml > current-kafka-config.yaml
Step 2.2: Create KRaftController CR with required annotations
Create the kraftcontroller.yaml file.
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
name: kraftcontroller
namespace: <namespace>
annotations:
platform.confluent.io/kraft-migration-hold-krc-creation: "true"
platform.confluent.io/use-log4j1: "true" # Required for CFK 3.0 or later
spec:
replicas: 3
image:
application: confluentinc/cp-server:<cp-version> # Match your Kafka version
init: confluentinc/confluent-init-container:<init-container-version>
dataVolumeCapacity: 10Gi
Required annotations
kraft-migration-hold-krc-creation: "true": Delays pod creation until the migration job modifies the CR.
use-log4j1: "true": Required for CFK 3.0 or later because the migration process requires Log4j 1 for compatibility with Confluent Platform 7.9.x. Remove this annotation after the migration completes. For details, see Step 6.3.
Step 2.3: Add security and configuration settings from your Kafka CR
Review your Kafka CR and add the following configurations to your KRaftController CR as needed.
RBAC configuration (if enabled on Kafka):
You can reuse an existing Kafka super user secret for secretRef: kraftcontroller-credential, or create a new secret. Ensure the principal is listed under spec.authorization.superUsers in both Kafka and KRaftController CRs.
spec:
authorization:
type: rbac
superUsers:
- User:kafka
- User:kraftcontroller
dependencies:
mdsKafkaCluster:
authentication:
type: plain
jaasConfig:
secretRef: kraftcontroller-credential # Create new or reuse existing super user credential
bootstrapEndpoint: kafka.confluent.svc.cluster.local:9071
tls:
enabled: true
Password encoder for cluster linking (if enabled on Kafka):
Extract the password encoder values:
kubectl get secret password-encoder-secret -n <namespace> -o yaml
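Note that secret values shown by kubectl get secret -o yaml are base64-encoded; decode them before copying into configOverrides. A minimal example with a sample value (not a real encoder secret):

```shell
# Secret data fields in Kubernetes are base64-encoded; decode before use.
# The value below is a sample for illustration only.
encoded="bXktZW5jb2Rlci1zZWNyZXQ="
echo "${encoded}" | base64 -d   # prints my-encoder-secret
```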
Add to KRaftController:
spec:
configOverrides:
server:
- password.encoder.secret=<value-from-secret>
- password.encoder.old.secret=<old-value-if-present>
TLS and authentication (if configured on Kafka):
spec:
tls:
secretRef: tls-group1 # Same as Kafka
listeners:
controller:
authentication:
type: plain # Match Kafka's authentication type
jaasConfig:
secretRef: credential
tls:
enabled: true
Custom JVM settings (if configured on Kafka):
spec:
configOverrides:
jvm:
- -Xms4g
- -Xmx4g
Step 2.4: Apply KRaftController CR
kubectl apply -f kraftcontroller.yaml
Step 2.5: Verify KRaftController CR and pod state
Verify that the KRaftController status is HOLD:
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace>
Expected: STATUS: HOLD and REPLICAS: 0
Verify no pods are created:
kubectl get pods -n <namespace> -l app=kraftcontroller
Expected: No resources found in <namespace> namespace.
Troubleshoot: If the STATUS is not HOLD or pods are created, verify both the annotations (kraft-migration-hold-krc-creation and use-log4j1) are set to "true" in the CR YAML.
Step 3: Start migration
For MRC deployments, deploy the migration job in each region. The migration job locks ZooKeeper, Kafka, and KRaft CRs to prevent modifications. The lock requires CFK deployments with webhooks enabled.
The KRaftMigrationJob orchestrates the migration process, places locks on resources, manages phased migration, monitors progress, handles errors, and enables safe rollback during the SETUP, MIGRATE, and DUAL-WRITE phases.
Step 3.1: Create the KRaftMigrationJob CR
Create the kraftmigrationjob.yaml file.
apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
name: <migration-job-name>
namespace: <namespace>
spec:
dependencies:
kafka:
name: <kafka-name>
namespace: <namespace>
zookeeper:
name: <zookeeper-name>
namespace: <namespace>
kRaftController:
name: <kraftcontroller-name>
namespace: <namespace>
Override config validation
Starting with CFK 3.2.1, CFK validates configOverrides.server before starting KRaft migration. If CFK detects a blocklisted configuration key (for example, zookeeper.connect) on the KRaftController CR, it blocks migration with an actionable error message.
In MRC deployments, you might need to set zookeeper.connect manually on the KRaftController. To bypass the pre-flight check, add the following annotation to your KRaftMigrationJob CR:
apiVersion: platform.confluent.io/v1beta1
kind: KRaftMigrationJob
metadata:
name: <migration-job-name>
namespace: <namespace>
annotations:
platform.confluent.io/kraft-migration-bypass-prechecks: "true"
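For context, a KRaftController override that would trip the pre-flight check in an MRC deployment might look like the following (the endpoint is illustrative):

```yaml
# Illustrative only: a blocklisted key that triggers the pre-flight check.
spec:
  configOverrides:
    server:
      - zookeeper.connect=zookeeper.central.svc.cluster.local:2181
```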
Warning
Only use the bypass annotation when you have a specific need to override pre-flight validation, such as in MRC deployments. Bypassing pre-flight checks can result in migration failures if conflicting configurations exist.
Step 3.2: Apply KRaftMigrationJob CR
kubectl apply -f kraftmigrationjob.yaml
Step 3.3: Verify that migration has started
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> \
-o jsonpath='Phase: {.status.phase} | SubPhase: {.status.subPhase}{"\n"}'
Expected: Phase: SETUP | SubPhase: SubPhaseSetup...
Using the Confluent kubectl plugin:
List all the migration jobs in the namespace:
kubectl confluent cluster kraft-migration list -n <namespace>
Example output:
NAME KAFKA KRAFTCONTROLLER ZOOKEEPER PHASE SUBPHASE AGE
kraftmigrationjob kafka kraftcontroller zookeeper SETUP SubPhaseSetupPreChecks 2m
Troubleshoot: If the migration job fails to start, verify the CR syntax and ensure all dependencies (Kafka, ZooKeeper, KRaftController) exist with exact name matches. The migration starts with the SETUP phase. In CFK 3.2.1 or later, the first subphase is SubPhaseSetupPreChecks, which validates configOverrides.server for blocklisted configuration keys (for example, zookeeper.connect) that conflict with CFK-managed migration configurations. If CFK finds blocklisted keys, it blocks migration with an actionable error message unless you bypass pre-flight checks using the annotation described in Step 3.1.
Step 4: Monitor migration
The migration progresses through the following phases: SETUP > MIGRATE > DUAL-WRITE > FINALIZE > COMPLETE.
Step 4.1: Monitor migration progress
Watch phase and subphase progression:
Terminal 1: Monitor the migration job status:
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -w
Expected: Real-time progression through phases with STATUS column showing current phase.
Terminal 2: Watch pods:
kubectl get pods -n <namespace> -w
Terminal 3: Stream operator logs:
kubectl logs -f deployment/confluent-operator -n confluent | grep -i migration
Using the Confluent kubectl plugin:
Check the status of a specific migration job:
kubectl confluent cluster kraft-migration status --name <migration-job-name> -n <namespace>
The status command displays:
The current phase and sub-phase.
Time spent in the current state.
Kafka cluster ID and IBP version.
Health status of all dependencies (Kafka, KRaftController, ZooKeeper) with ready replica counts.
Contextual guidance on what to do next based on the current phase.
Step 4.2: Check for errors if migration stalls
If a subphase takes longer than expected, check:
# Check migration job status
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -o yaml | grep -A 20 status
# Check operator for errors
kubectl logs deployment/confluent-operator -n <namespace> --tail=100 | grep -i error
# Check pod status
kubectl get pods -n <namespace>
Expected: Current phase/subphase shown, no errors, all pods Running or in normal rolling restarts.
Step 5: Validate and finalize migration
The migration reaches the DUAL-WRITE phase where the cluster writes metadata to both ZooKeeper and KRaft. Validate the migration before finalizing, or check if rollback is needed.
Step 5.1: Verify DUAL-WRITE mode
Check the migration job status:
kubectl get kraftmigrationjob <migration-job-name> -n <namespace>
Expected: STATUS: DUAL-WRITE
Tip
DUAL-WRITE is a stable waiting state. The migration stays in DUAL-WRITE indefinitely until you manually trigger finalization.
Using the Confluent kubectl plugin:
Confirm the migration has reached the DUAL-WRITE phase:
kubectl confluent cluster kraft-migration status --name <migration-job-name> -n <namespace>
Verify if the output shows Phase: DUAL-WRITE.
Step 5.2: Validate cluster health before finalizing
Run these validation checks to ensure the system is healthy.
Verify all Kafka and KRaftController pods are running:
kubectl get pods -n <namespace> -l app=kafka
kubectl get pods -n <namespace> -l app=kraftcontroller
Expected: All pods in Running state.
Verify Kafka and KRaftController status:
kubectl get kafka <kafka-name> -n <namespace>
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace>
Expected: Both show STATUS: RUNNING.
Check for errors in logs:
kubectl logs deployment/confluent-operator -n <namespace> --since=24h | grep -i error
kubectl logs <kafka-pod-name> -n <namespace> --since=24h | grep -E "ERROR|FATAL"
kubectl logs <kraftcontroller-pod-name> -n <namespace> --since=24h | grep -E "ERROR|FATAL"
Expected: No ERROR or FATAL messages.
RBAC validation (if enabled)
Verify MDS endpoint is responding:
kubectl exec <kafka-pod-name> -n <namespace> -- curl -k -s -o /dev/null \
  -w "%{http_code}" https://<kafka-service>:8090/security/1.0/authenticate
Expected: 400, 401, or 200 (not 503 or timeout).
Verify ACLs are accessible:
kubectl exec <kafka-pod-name> -n <namespace> -- kafka-acls --list \
  --bootstrap-server <kafka-bootstrap-server>:9071 \
  --command-config /path/to/client.properties
Expected: ACL list displayed without errors.
Step 5.3: Roll back or finalize
If validation reveals problems, for example critical functionality is broken, performance is worse than expected, or you discover a KRaft incompatibility, follow the steps in Rollback to ZooKeeper.
Warning
Finalizing the migration removes ZooKeeper dependency from Kafka, removes migration configuration from KRaftController, and transitions the cluster irreversibly to KRaft mode. You cannot roll back after this point.
If you have completed the above steps and if you are ready, apply the finalize annotation:
kubectl annotate kraftmigrationjob <migration-job-name> \
platform.confluent.io/kraft-migration-trigger-finalize-to-kraft=true \
-n <namespace>
Using the Confluent kubectl plugin:
Trigger finalization of the migration:
kubectl confluent cluster kraft-migration finalize --name <migration-job-name> -n <namespace>
As finalization is irreversible, the plugin validates that the migration is in the DUAL-WRITE phase before proceeding and prompts for confirmation.
Step 5.4: Verify migration completion
Monitor the migration job until it reaches the COMPLETE phase:
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -w
Expected: The migration job progresses to COMPLETE phase.
For detailed phase descriptions, see ZooKeeper to KRaft Migration Phases and Sub-phases.
Step 6: Complete post-migration tasks
After the migration completes successfully, perform these tasks in order.
Step 6.1: Release migration locks
Manually release the migration locks applied in Step 3.
Apply the release lock annotation:
kubectl annotate kraftmigrationjob <migration-job-name> \
platform.confluent.io/kraft-migration-release-cr-lock=true \
-n <namespace>
Using the Confluent kubectl plugin:
Release the migration locks on the Kafka, KRaftController, and ZooKeeper CRs:
kubectl confluent cluster kraft-migration release-lock --name <migration-job-name> -n <namespace>
The plugin validates that the migration is in the COMPLETE phase and prompts for confirmation before releasing locks on Kafka, KRaftController, and ZooKeeper CRs.
Verify locks are removed:
kubectl get kafka <kafka-name> -n <namespace> -o yaml | grep kraft-migration-cr-lock
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> -o yaml | grep kraft-migration-cr-lock
Expected: No output, which means the locks were released successfully.
Important
Without releasing locks, you cannot modify Kafka or KRaftController configurations, scale resources, or apply upgrades.
Step 6.2: Download updated CRs (optional)
Download the updated CRs for backup or GitOps repository updates:
# Kafka CR
kubectl get kafka <kafka-name> -n <namespace> -o yaml > kafka-kraft-mode.yaml
# KRaftController CR
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> -o yaml > kraftcontroller.yaml
# Optional: ZooKeeper backup before deletion
kubectl get zookeeper <zookeeper-name> -n <namespace> -o yaml > zookeeper-backup.yaml
Step 6.3: Remove Log4j1 annotation from KRaftController (CFK 3.0 or later)
If using CFK 3.0 or later, remove the platform.confluent.io/use-log4j1 annotation:
kubectl annotate kraftcontroller <kraftcontroller-name> \
platform.confluent.io/use-log4j1- \
-n <namespace>
Verify the annotation is removed:
kubectl get kraftcontroller <kraftcontroller-name> -n <namespace> \
-o jsonpath='{.metadata.annotations.platform\.confluent\.io/use-log4j1}'
Expected: Empty output.
Note
This triggers a KRaftController pod roll to apply Log4j 2 configuration, which is normal and safe after the migration completes. Skip this step if you are using CFK 2.x versions.
Step 6.4: Validate KRaft-only operation
Before deleting ZooKeeper, validate Kafka operates correctly in KRaft-only mode.
Verify Kafka has no ZooKeeper dependency:
kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.spec.dependencies}'
Expected: Output shows only kRaftController dependencies with no zookeeper reference.
Verify Kafka is running without ZooKeeper errors:
kubectl get kafka <kafka-name> -n <namespace>
kubectl logs <kafka-pod-name> -n <namespace> --since=24h | grep -iE "zookeeper.*(error|failed|disconnect|timeout|expired)"
Expected: STATUS shows RUNNING, and no ZooKeeper connection errors in logs.
Step 6.5: Delete the ZooKeeper cluster
Warning
Delete ZooKeeper only after confirming:
Kafka has been stable in KRaft-only mode.
All validation tests pass.
No other Kafka clusters use this ZooKeeper.
You have backups of ZooKeeper data if needed.
Verify that no other Kafka clusters depend on this ZooKeeper:
kubectl get kafka --all-namespaces -o yaml | grep -A 5 "zookeeper:"
Expected: No output.
Delete ZooKeeper cluster:
kubectl delete zookeeper <zookeeper-name> -n <namespace>
Watch ZooKeeper pods terminate:
kubectl get pods -n <namespace> -l app=zookeeper -w
Verify Kafka remains operational:
kubectl get kafka <kafka-name> -n <namespace>
Expected: STATUS shows RUNNING.
Clean up ZooKeeper Persistent Volume Claims:
kubectl get pvc -n <namespace> | grep zookeeper
kubectl delete pvc <pvc-name> -n <namespace>
Warning
This action deletes ZooKeeper data permanently. Only delete PVCs if you no longer need the data.
Step 6.6: Clean up migration resources
This is optional. You can save the final status for records, if needed:
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> -o yaml > kmj-final-status.yaml
Delete the migration job:
kubectl delete kraftmigrationjob <migration-job-name> -n <namespace>
Check for and delete migration-specific ConfigMaps or Secrets if they exist:
kubectl get configmaps -n <namespace> | grep migration
kubectl get secrets -n <namespace> | grep migration
Rollback to ZooKeeper
Rollback is supported during the SETUP, MIGRATE, and DUAL-WRITE phases. After applying the finalize annotation or completing migration, rollback is not possible.
Step 1: Trigger rollback
Apply the rollback annotation:
kubectl annotate kraftmigrationjob <migration-job-name> \
platform.confluent.io/kraft-migration-trigger-rollback-to-zk=true --overwrite \
-n <namespace>
Using the Confluent kubectl plugin:
Trigger a rollback of the migration to ZooKeeper:
kubectl confluent cluster kraft-migration rollback --name <migration-job-name> -n <namespace>
The plugin validates that the migration is in the SETUP, MIGRATE, or DUAL-WRITE phase and prompts for confirmation. After triggering, the plugin displays phase-specific guidance. Monitor progress with the status command and proceed to znode removal when prompted.
Tip
If you triggered rollback before KRaftController was started, znode removal is not needed and rollback completes automatically. You can skip to Step 4.
Step 2: Remove nodes from ZooKeeper
Remove the controller and migration znodes from ZooKeeper. You can use the Confluent kubectl plugin (recommended) or run the commands manually.
First, wait for the migration job to reach the correct phase:
kubectl get kraftmigrationjob <migration-job-name> -n <namespace> \
-o jsonpath='{.status.subPhase}'
Expected: SubPhaseRollbackToZkWaitForManualNodeRemovalFromZk (when rolling back from MIGRATE or DUAL-WRITE phase) or SubPhaseRollbackToZkFromSetupWaitForManualNodeRemovalFromZk (when rolling back from SETUP phase).
Using the Confluent kubectl plugin (recommended):
The zk-node-removal command interactively removes both znodes and applies the continue annotation in a single workflow. After running this command, skip to Step 4.
kubectl confluent cluster kraft-migration zk-node-removal --name <migration-job-name> -n <namespace>
If ZooKeeper has TLS enabled:
kubectl confluent cluster kraft-migration zk-node-removal --name <migration-job-name> -n <namespace> \
--zk-tls-config-file /opt/confluentinc/etc/kafka/zk-tls.properties
The plugin guides you through the znode cleanup process, requiring confirmation before each step:
Deletes the controller znode from ZooKeeper.
Deletes the migration znode from ZooKeeper.
Applies the continue annotation to resume rollback.
The plugin automatically derives the ZooKeeper connection details from the KRaftMigrationJob status.
Manual approach:
If the plugin is not available, run the commands manually, then proceed to Step 3 to apply the continue annotation. In production environments with secured ZooKeeper, run these commands from inside a ZooKeeper pod.
Note
Add -zk-tls-config-file <path-to-zookeeper-client-properties> to the zookeeper-shell command only when TLS is enabled.
Step 2.1: Remove controller node
zookeeper-shell <zkhost:zkport> \
deleteall /<kafka-cr-name>-<kafka-cr-namespace>/controller
Step 2.2: Remove migration node
zookeeper-shell <zkhost:zkport> \
deleteall /<kafka-cr-name>-<kafka-cr-namespace>/migration
Troubleshoot: For NoAuthException or Failed to delete some node(s) errors, see Troubleshoot ZooKeeper to KRaft Migration Issues.
Step 3: Continue rollback process (manual approach only)
If you used the Confluent kubectl plugin in Step 2, skip to Step 4. The plugin already applied the continue annotation.
If you removed the znodes manually, apply the continue annotation to resume rollback:
kubectl annotate kraftmigrationjob <migration-job-name> \
platform.confluent.io/continue-kraft-migration-post-zk-node-removal=true \
--overwrite \
-n <namespace>
Step 4: Verify rollback completion
Step 4.1: Verify migration job status
kubectl get kraftmigrationjob <migration-job-name> -n <namespace>
Expected: STATUS: COMPLETE
Step 4.2: Verify Kafka is using ZooKeeper
kubectl get kafka <kafka-name> -n <namespace> -o jsonpath='{.spec.dependencies}'
Expected: Output shows zookeeper dependency.
Step 4.3: Verify data preservation
# List all topics
kubectl exec <kafka-pod-name> -n <namespace> -- kafka-topics --list \
--bootstrap-server kafka:9071 \
--command-config /path/to/client.properties
# Check consumer groups
kubectl exec <kafka-pod-name> -n <namespace> -- kafka-consumer-groups --list \
--bootstrap-server kafka:9071 \
--command-config /path/to/client.properties
Expected: All topics and consumer groups created during DUAL-WRITE are present.
Tip
If Kafka pods are not ready or a rolling restart appears stuck after rollback, check spec.configOverrides.server on the Kafka CR. Fix any incorrect values and the roll resumes automatically.
Step 4.4: Check for errors
kubectl logs <kafka-pod-name> -n <namespace> --tail=50 | grep -E "ERROR|FATAL"
kubectl logs deployment/confluent-operator -n <namespace> --tail=50 | grep -i error
Expected: No ERROR or FATAL messages.
Step 5: Clean up after rollback
Step 5.1: Release CR lock
Apply the release lock annotation:
kubectl annotate kraftmigrationjob <migration-job-name> \
platform.confluent.io/kraft-migration-release-cr-lock=true \
--overwrite -n <namespace>
Using the Confluent kubectl plugin:
kubectl confluent cluster kraft-migration release-lock --name <migration-job-name> -n <namespace>
Step 5.2: Delete migration job
kubectl delete kraftmigrationjob <migration-job-name> -n <namespace>
Step 5.3: Delete KRaftController
kubectl delete kraftcontroller <kraftcontroller-name> -n <namespace>
Tip
To re-attempt migration after rollback, create a fresh KRaftController with the platform.confluent.io/kraft-migration-hold-krc-creation annotation and a new KRaftMigrationJob.