Upgrade Confluent Platform with Ansible Playbooks

Ansible Playbooks for Confluent Platform (Confluent Ansible) can upgrade the Confluent Platform components. To upgrade your hosts safely, use the rolling deployment strategy, which goes host by host: it shuts down the component, upgrades packages, restarts the service, and validates service health before moving on to the next host.

Confluent Platform does not guarantee that clusters and Confluent Platform components running different major versions will work together. The upgrade is complete only when all the components have been upgraded.

Requirements

The upgrade playbooks have the following requirements:

  • Confluent Platform components must have been originally installed and configured using Ansible Playbooks for Confluent Platform.
  • You must have the same hosts.yml file used during the installation.

Upgrade notes

Before you start the upgrade process, review the following changes and make any necessary updates.

Note that ZooKeeper-related points only apply when upgrading ZooKeeper-based deployments.

  • SASL/SCRAM default version

    The default SASL/SCRAM version was changed from 256 to 512.

    If the SASL/SCRAM version is specified as 256 in your server.properties, you must update your inventory and change sasl_protocol: scram to sasl_protocol: scram256.

  • ZooKeeper TLS

    TLS is now enabled by default for ZooKeeper when ssl_enabled: true is set.

    If your current ZooKeeper deployment does not use TLS, set zookeeper_ssl_enabled: false in your inventory.

  • ZooKeeper server-to-server TLS

    Improved logic and variables were added around ZooKeeper server to server authentication. If you have sslQuorum=true in your zookeeper.properties, set zookeeper_quorum_authentication_type: mtls in your inventory.

  • Enable Admin REST APIs

    When upgrading from 5.5.x to 6.2.x, you must enable Admin REST APIs by setting the following property in your inventory file. If Admin REST APIs is not enabled, component upgrades will fail:

    kafka_broker_rest_proxy_enabled: true
    
  • Disable canonicalization

    If canonicalization has not been enabled during the Confluent Platform cluster creation, explicitly set the following property in the hosts.yml inventory file. An example scenario is when you upgrade a Confluent Platform cluster that uses Kerberos to authenticate Kafka brokers to ZooKeeper.

    kerberos:
      canonicalize: false
    
  • Variable name updates in hosts.yml

    Misspelled variable names were corrected in the 7.2.2 version.

    If upgrading from a version earlier than 7.2.2 to version 7.2.2 or later, make the following updates in your inventory file:

    • From: kakfa_connect_replicator_<property_name>
    • To: kafka_connect_replicator_<property_name>
  • ZooKeeper version dependency

    The ZooKeeper dependency has been upgraded to 3.8.1 due to 3.6 reaching end-of-life. To bring both your Kafka and ZooKeeper clusters to the latest versions:

    (Case A) If upgrading from a Kafka version 2.4 (Confluent Platform version 5.4) or later:

    1. Kafka clusters can be updated directly.
    2. ZooKeeper clusters that are running binaries bundled with Kafka versions 2.4 or later can be updated normally as specified in the following sections in this document.

    (Case B) If upgrading from a Kafka version older than 2.4 (Confluent Platform version older than 5.4):

    1. Kafka clusters first need to be upgraded to a version 2.4 or later, and earlier than 3.6.
    2. ZooKeeper clusters that are running binaries bundled with a Kafka version older than 2.4 need to be updated to the binaries bundled with a Kafka version 2.4 or later, and earlier than 3.6.
    3. You can then continue with the steps in Case A above.
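
For the variable rename noted above (kakfa_connect_replicator_ to kafka_connect_replicator_), the following is a minimal sketch of how the inventory could be rewritten. The function name fix_replicator_vars and the output path are hypothetical, not part of Confluent Ansible. The helper writes the corrected inventory to stdout so that the original file is left untouched for comparison.

```shell
# Hypothetical helper (not part of Confluent Ansible): print a copy of the
# inventory with the misspelled variable prefix corrected.
fix_replicator_vars() {
  sed 's/kakfa_connect_replicator_/kafka_connect_replicator_/g' "$1"
}

# Example usage (paths are placeholders):
#   fix_replicator_vars /path/to/hosts.yml > /path/to/hosts.fixed.yml
#   diff /path/to/hosts.yml /path/to/hosts.fixed.yml
```

Review the diff before replacing the original hosts.yml.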

Upgrade ZooKeeper-based Confluent Platform deployment

Perform the upgrades in the following order:

  1. Upgrade ZooKeeper.
  2. Upgrade Kafka brokers.
  3. Upgrade (in any order):
    • Schema Registry
    • REST Proxy
    • Connect
  4. Upgrade Confluent Control Center.
  5. Upgrade external clients.
  6. Upgrade Kafka log format. This step ensures that the log is formatted properly for the new version of Confluent Platform after all upgrades have been completed.

Step 1. Download Ansible Playbooks for Confluent Platform

Download the Confluent Ansible for the target version of Confluent Platform that you are upgrading to:

ansible-galaxy collection install confluent.platform:<version>

For example, to upgrade to Confluent Platform 7.5.7:

ansible-galaxy collection install confluent.platform:7.5.7

To upgrade to the latest version of Confluent Platform, you can omit <version>:

ansible-galaxy collection install --upgrade confluent.platform

Step 2. Set rolling deployment strategy

In order to avoid component outages, set the Deployment Strategy to rolling as below:

deployment_strategy: rolling

Step 3. Upgrade ZooKeeper

To upgrade ZooKeeper for a ZooKeeper-based deployment, run the provisioning playbook with the zookeeper tag:

ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
  --tags zookeeper

Step 4. Upgrade Kafka

Upgrading Kafka takes a few steps because you need to apply extra care to the inter.broker.protocol.version and log.message.format.version properties.

  1. Enter the following command and note the current version of confluent-kafka or confluent-server installed on your hosts.

    On Red Hat hosts:

    rpm -qa | grep confluent
    

    An example output:

    confluent-server-6.0.1-1.noarch
    

    On Debian hosts:

    apt list --installed confluent-server
    
  2. Set the version properties.

    • If you are upgrading from 6.0.x Confluent Platform package version, set these custom properties in your inventory:

      kafka_broker_custom_properties:
        inter.broker.protocol.version: 2.6
        log.message.format.version: 2.6
      
    • If you are upgrading from 6.1.x Confluent Platform package version, set these custom properties in your inventory:

      kafka_broker_custom_properties:
        inter.broker.protocol.version: 2.7
        log.message.format.version: 2.7
      
    • If you are upgrading from 6.2.x Confluent Platform package version, set these custom properties in your inventory:

      kafka_broker_custom_properties:
        inter.broker.protocol.version: 2.8
        log.message.format.version: 2.8
      
    • If you are upgrading from Confluent Platform version 7.0 or later, review the Kafka upgrade documentation, and set the following custom property in your inventory with the versions listed in the version table.

      In this case, you do not need to set log.message.format.version.

      kafka_broker_custom_properties:
        inter.broker.protocol.version:
      

      For example, for the 3.5.x Kafka package version:

      kafka_broker_custom_properties:
        inter.broker.protocol.version: 3.5
      
  3. To upgrade and set the inter.broker.protocol.version and log.message.format.version (for some versions as specified in the previous step) properties, run the provisioning playbook with the kafka_broker tag:

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_broker
    
  4. At this point the packages or archive have been upgraded, but the two properties are set to the starting version. Update inter.broker.protocol.version as below:

    kafka_broker_custom_properties:
      inter.broker.protocol.version: 3.5
    

    And run the provisioning playbook again:

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_broker \
      --skip-tags package
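
The mapping in step 2 can be sketched as a small lookup. This is an illustration only: the function name starting_ibp is hypothetical, it assumes the RPM-style package string shown in step 1 (for example, confluent-server-6.0.1-1.noarch), and it encodes only the 6.0.x, 6.1.x, and 6.2.x values listed above. For any other version it fails so that you consult the Kafka upgrade documentation instead.

```shell
# Hypothetical helper: map an installed package string, as printed by
# `rpm -qa | grep confluent`, to the starting inter.broker.protocol.version.
starting_ibp() {
  # Extract "major.minor" from e.g. confluent-server-6.0.1-1.noarch
  cp_version=$(printf '%s\n' "$1" |
    sed -n 's/^confluent-[a-z]*-\([0-9]*\.[0-9]*\)\..*/\1/p')
  case "$cp_version" in
    6.0) echo 2.6 ;;
    6.1) echo 2.7 ;;
    6.2) echo 2.8 ;;
    *)   echo "no mapping encoded; see the Kafka upgrade documentation" >&2
         return 1 ;;
  esac
}

starting_ibp confluent-server-6.0.1-1.noarch   # prints 2.6
```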
    

Step 5. Upgrade other components

Note

If you need to upgrade specific hosts instead of all of them, you can limit the upgrade. This can be useful when your components are behind a load balancer. In this case, remove a specific host from the load balancer pool, upgrade it, then add it back. This ensures no traffic is disrupted. Enter the following command to limit the upgrade to one or more specific hosts.

ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
  --tags <component> \
  --limit "<host1>,<host2>"

Use the commands below to update the other components:

  • Schema Registry

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags schema_registry
    
  • Connect

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_connect
    
  • ksqlDB

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags ksql
    
  • REST Proxy

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_rest
    
  • Confluent Control Center

    After upgrading other Confluent Platform components, upgrade Control Center as the last step.

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags control_center
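
As a sketch, the commands above can be generated in a fixed order that keeps Control Center last. The function name component_upgrade_plan is hypothetical. It only prints the commands (a dry run), so you can review them or run them one at a time.

```shell
# Hypothetical helper: print, without running, one upgrade command per
# component tag, with control_center as the final entry.
component_upgrade_plan() {
  inventory=$1
  for tag in schema_registry kafka_connect ksql kafka_rest control_center; do
    echo "ansible-playbook -i $inventory confluent.platform.all --tags $tag"
  done
}

component_upgrade_plan /path/to/hosts.yml
```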
    

Step 6. Upgrade clients

Confluent Platform Ansible playbooks do not currently support upgrading clients. Review the Confluent Platform upgrade preparation for additional information.

Upgrade co-located ZooKeeper-based Confluent Platform deployment

When your Confluent Platform services are co-located, the upgrade process can leave your hosts in an unstable state because Confluent components all share the same packages. For example, if you upgrade ZooKeeper on a host where both ZooKeeper and Kafka are running, the binaries and JARs that Kafka depends on are replaced while the Kafka service is still running.

This section describes how to manually upgrade co-located Kafka, ZooKeeper, and other Confluent Platform components.

Use this upgrade method when the package installation is used (installation_method: package) and when multiple Confluent Platform components are running on the same hosts.

Important

You must upgrade the Kafka controller last, and the ZooKeeper leader second to last.

Step 1. Set broker_id variable on each Kafka host

Confirm each kafka_broker host has the broker_id variable set. For example:

kafka_broker:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
      broker_id: 0
    ip-172-31-34-247.us-east-2.compute.internal:
      broker_id: 1
    ip-172-31-34-248.us-east-2.compute.internal:
      broker_id: 2

If the broker_id variable is not set on each kafka_broker host, use the following command to query the meta.properties file on the Kafka hosts. If you have customized the logs directory, replace /var/lib/kafka/data/ with the log.dirs property value:

ansible -i /path/to/hosts.yml \
  -m shell \
  -a "grep broker.id /var/lib/kafka/data/meta.properties" \
  kafka_broker

In output similar to the following, note the broker.id value for each host.

ip-172-31-34-247.us-east-2.compute.internal | CHANGED | rc=0 >>
broker.id=2
ip-172-31-34-246.us-east-2.compute.internal | CHANGED | rc=0 >>
broker.id=1
ip-172-31-34-248.us-east-2.compute.internal | CHANGED | rc=0 >>
broker.id=3
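
To turn that output into host and broker ID pairs that are easier to copy into the inventory, a filter along the following lines could be used. This is a sketch that assumes the two-line output format shown above. The function name broker_ids is hypothetical.

```shell
# Hypothetical filter: read ad-hoc command output on stdin and print
# "<host> <broker.id>" for each host.
broker_ids() {
  awk '
    / [|] /        { host = $1 }   # header line, e.g. "host | CHANGED | rc=0 >>"
    /^broker\.id=/ { sub(/^broker\.id=/, ""); print host, $0 }
  '
}

# Example usage (same command as above, piped through the filter):
#   ansible -i /path/to/hosts.yml -m shell \
#     -a "grep broker.id /var/lib/kafka/data/meta.properties" \
#     kafka_broker | broker_ids
```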

Step 2. Take note of Kafka controller

In the upgrade process, you must upgrade the Kafka controller and ZooKeeper leader last.

Important

In this section, Kafka controller refers to “the lead broker” that communicates with ZooKeeper. It is not the same as the KRaft controller that is referred to in other parts of the Ansible Playbooks for Confluent Platform documents.

To query the Kafka controller, run the following command:

ansible -i /path/to/hosts.yml kafka_broker \
  -m import_role \
  -a "name=confluent.platform.kafka_broker tasks_from=dynamic_groups.yml"

In the sample output below, the controller runs on host: ip-172-31-34-246.us-east-2.compute.internal whose Broker ID is the same as the Controller ID.

ip-172-31-34-246.us-east-2.compute.internal | SUCCESS => {
    "msg": "Broker ID: 1 and Controller ID: 1"
}
ip-172-31-34-247.us-east-2.compute.internal | SUCCESS => {
    "msg": "Broker ID: 2 and Controller ID: 1"
}
ip-172-31-34-248.us-east-2.compute.internal | SUCCESS => {
    "msg": "Broker ID: 3 and Controller ID: 1"
}
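
The comparison described above (Broker ID equal to Controller ID) can be automated with a small filter. This is a sketch that assumes the message format shown in the sample output. The function name controller_host is hypothetical.

```shell
# Hypothetical filter: read the ad-hoc output on stdin and print the host
# whose Broker ID equals the Controller ID.
controller_host() {
  awk '
    / [|] / { host = $1 }          # header line, e.g. "host | SUCCESS => {"
    /Broker ID: [0-9]+ and Controller ID: [0-9]+/ {
      line = $0
      sub(/.*Broker ID: /, "", line);     broker = line + 0
      sub(/.*Controller ID: /, "", line); controller = line + 0
      if (broker == controller) print host
    }
  '
}

# Example usage:
#   ansible -i /path/to/hosts.yml kafka_broker -m import_role \
#     -a "name=confluent.platform.kafka_broker tasks_from=dynamic_groups.yml" \
#     | controller_host
```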

Step 3. Take note of ZooKeeper leader

To query the ZooKeeper leader, run the following command:

ansible -i /path/to/hosts.yml zookeeper \
  -m import_role \
  -a "name=confluent.platform.zookeeper tasks_from=dynamic_groups.yml"

In the sample output below, the leader runs on host: ip-172-31-34-248.us-east-2.compute.internal.

ip-172-31-34-247.us-east-2.compute.internal | SUCCESS => {
    "msg": "Mode: follower"
}
ip-172-31-34-246.us-east-2.compute.internal | SUCCESS => {
    "msg": "Mode: follower"
}
ip-172-31-34-248.us-east-2.compute.internal | SUCCESS => {
    "msg": "Mode: leader"
}
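
Similarly, the leader can be picked out of that output with a filter. This is a sketch that assumes the "Mode: leader" message format shown above. The function name zookeeper_leader is hypothetical.

```shell
# Hypothetical filter: read the ad-hoc output on stdin and print the host
# that reports "Mode: leader".
zookeeper_leader() {
  awk '
    / [|] /                 { host = $1 }   # header line with the host name
    /"msg": "Mode: leader"/ { print host }
  '
}

# Example usage:
#   ansible -i /path/to/hosts.yml zookeeper -m import_role \
#     -a "name=confluent.platform.zookeeper tasks_from=dynamic_groups.yml" \
#     | zookeeper_leader
```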

Step 4. Upgrade Confluent hosts using Limits

Upgrade the Confluent components one host at a time with Ansible Limits.

Using the hosts discovered in the previous steps, select a host that is neither the Kafka controller nor the ZooKeeper leader, and follow these steps:

  1. Stop all Confluent services on the host. For example:

    ansible -i /path/to/hosts.yml \
      -m shell \
      -a "systemctl stop confluent-*" \
      ip-172-31-34-247.us-east-2.compute.internal
    
  2. Upgrade and start Confluent components on a given host.

    ansible-playbook \
      -i /path/to/hosts.yml confluent.platform.all \
      --limit ip-172-31-34-247.us-east-2.compute.internal
    

Step 5. Upgrade Kafka and ZooKeeper hosts using Limits

Using the hosts discovered in the previous steps, upgrade the ZooKeeper leader, and then the Kafka controller.

  1. Stop all Confluent services on the host. For example:

    ansible -i /path/to/hosts.yml \
      -m shell \
      -a "systemctl stop confluent-*" \
      ip-172-31-34-247.us-east-2.compute.internal
    
  2. For Kafka, set the inter.broker.protocol.version and log.message.format.version properties as described in Step 4. Upgrade Kafka in the Upgrade ZooKeeper-based Confluent Platform deployment section.

  3. Upgrade and start the ZooKeeper leader and the Kafka controller on a given host.

    You must upgrade the Kafka controller last, and the ZooKeeper leader second to last.

    ansible-playbook \
      -i /path/to/hosts.yml confluent.platform.all \
      --limit ip-172-31-34-247.us-east-2.compute.internal
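
The ordering rule for co-located hosts (all other hosts first, the ZooKeeper leader second to last, the Kafka controller last) can be sketched as follows. The function name upgrade_order and its argument order are hypothetical. If the leader and the controller are the same host, that host is listed once, at the end.

```shell
# Hypothetical helper: upgrade_order LEADER CONTROLLER HOST...
# Prints the hosts in the order in which to upgrade them.
upgrade_order() {
  leader=$1; controller=$2; shift 2
  for host in "$@"; do
    # Hosts that are neither the leader nor the controller go first.
    [ "$host" = "$leader" ] || [ "$host" = "$controller" ] || echo "$host"
  done
  [ "$leader" = "$controller" ] || echo "$leader"   # second to last
  echo "$controller"                                # last
}

# Using the sample hosts from the previous steps:
upgrade_order \
  ip-172-31-34-248.us-east-2.compute.internal \
  ip-172-31-34-246.us-east-2.compute.internal \
  ip-172-31-34-246.us-east-2.compute.internal \
  ip-172-31-34-247.us-east-2.compute.internal \
  ip-172-31-34-248.us-east-2.compute.internal
```

With the sample hosts, this lists the host ending in 247 first, then 248 (the ZooKeeper leader), then 246 (the Kafka controller).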
    

Upgrade KRaft-based Confluent Platform deployment

If you are running a Confluent Platform version later than 7.4.0 and your clusters are running in KRaft mode, you can upgrade your KRaft-based Confluent Platform deployment as described in this section.

Perform the upgrade in the following order:

  1. Upgrade KRaft.
  2. Upgrade Kafka brokers.
  3. Upgrade (in any order):
    • Schema Registry
    • REST Proxy
    • Connect
  4. Upgrade Confluent Control Center.
  5. Upgrade external clients.
  6. Upgrade Kafka metadata version.

Step 1. Download Ansible Playbooks for Confluent Platform

Download the Confluent Ansible for the target version of Confluent Platform that you are upgrading to:

ansible-galaxy collection install confluent.platform:<version>

For example, to upgrade to Confluent Platform 7.5.7:

ansible-galaxy collection install confluent.platform:7.5.7

To upgrade to the latest version of Confluent Platform, you can omit <version>:

ansible-galaxy collection install --upgrade confluent.platform

Step 2. Set rolling deployment strategy

In order to avoid component outages, set the Deployment Strategy to rolling as below:

deployment_strategy: rolling

Step 3. Upgrade KRaft controller and Kafka brokers

  1. To upgrade KRaft, run the provisioning playbook with the kafka_controller tag:

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_controller
    

    Important: If upgrading a deployment with RBAC and mTLS authentication configured, use the following command to upgrade KRaft and Kafka in one step, and then skip Step #2 below.

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_controller,kafka_broker
    
  2. To upgrade Kafka, run the provisioning playbook with the kafka_broker tag:

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_broker
    
  3. Once you are satisfied that the broker performance meets your expectations, increment the metadata.version for the broker by running the kafka-features tool with the upgrade argument:

    ./bin/kafka-features --bootstrap-server <server>:<port> \
      upgrade --metadata 3.5
    

    See Configure listeners for where the bootstrap server host and port are configured.

Step 4. Upgrade other components

Note

If you need to upgrade specific hosts instead of all of them, you can limit the upgrade. This is useful when your components are behind a load balancer. In this case, remove a specific host from the load balancer pool, upgrade it, then add it back. This ensures no traffic is disrupted. Enter the following command to limit the upgrade to one or more specific hosts.

ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
  --tags <component> \
  --limit "<host1>,<host2>"

Use the commands below to update the other components:

  • Schema Registry

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags schema_registry
    
  • Connect

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_connect
    
  • ksqlDB

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags ksql
    
  • REST Proxy

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags kafka_rest
    
  • Confluent Control Center

    ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
      --tags control_center
    

Step 5. Upgrade clients

Confluent Platform Ansible playbooks do not currently support upgrading clients. Review the Confluent Platform upgrade preparation for additional information.

Upgrade co-located KRaft-based Confluent Platform deployments

When your Confluent Platform services are colocated, the upgrade process can leave your hosts in an unstable state as Confluent components all share the same packages.

This section describes how to manually upgrade colocated Kafka, KRaft, and other Confluent Platform components using Ansible Limits.

Use this upgrade method when the package installation is used (installation_method: package) and when multiple Confluent Platform components are running on the same hosts.

Step 1. Upgrade KRaft controller

  1. Get a list of KRaft hosts where node_id is set:

    ansible -i /path/to/hosts.yml \
      -m shell -a "grep node.id /var/lib/controller/data/meta.properties" \
      kafka_controller
    
  2. For each KRaft host you retrieved in the previous step, repeat:

    1. Stop the KRaft controller on the host. For example:

      ansible -i /path/to/hosts.yml \
        -m shell \
        -a "systemctl stop confluent-kcontroller" \
        ip-172-31-34-247.us-east-2.compute.internal
      
    2. Upgrade and start the KRaft controller on the host. For example:

      ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
        --limit ip-172-31-34-247.us-east-2.compute.internal \
        --tags kafka_controller
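
The per-host loop in this step can be sketched as a dry run that prints the stop and upgrade commands for each KRaft controller host. The function name kraft_controller_plan is hypothetical. It does not execute anything, so each pair of commands can be reviewed and run one host at a time.

```shell
# Hypothetical helper: kraft_controller_plan INVENTORY HOST...
# Prints, without running, the stop and upgrade commands for each host.
kraft_controller_plan() {
  inventory=$1; shift
  for host in "$@"; do
    echo "ansible -i $inventory -m shell -a \"systemctl stop confluent-kcontroller\" $host"
    echo "ansible-playbook -i $inventory confluent.platform.all --limit $host --tags kafka_controller"
  done
}

kraft_controller_plan /path/to/hosts.yml \
  ip-172-31-34-247.us-east-2.compute.internal
```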
      

Step 2. Upgrade Kafka brokers

  1. Get a list of Kafka broker hosts where broker_id is set:

    ansible -i /path/to/hosts.yml \
      -m shell -a "grep node.id /var/lib/kafka/data/meta.properties" \
      kafka_broker
    
  2. For each Kafka host you retrieved in the previous step, repeat:

    1. Stop the Kafka broker on the host. For example:

      ansible -i /path/to/hosts.yml \
        -m shell \
        -a "systemctl stop confluent-server" \
        ip-172-31-34-247.us-east-2.compute.internal
      
    2. Upgrade and start the Kafka broker on the host. For example:

      ansible-playbook -i /path/to/hosts.yml confluent.platform.all \
        --limit ip-172-31-34-247.us-east-2.compute.internal \
        --tags kafka_broker
      
  3. Once you are satisfied that the broker performance meets your expectations, increment the metadata.version for the broker by running the kafka-features tool with the upgrade argument:

    ./bin/kafka-features --bootstrap-server <server>:<port> \
      upgrade --metadata 3.5
    

    See Configure listeners for where the bootstrap server host and port are configured.

Step 3. Upgrade other Confluent components

Repeat the steps described below to upgrade each Confluent host.

  1. Stop all Confluent services on the host. For example:

    ansible -i /path/to/hosts.yml \
      -m shell \
      -a "systemctl stop confluent-*" \
      ip-172-31-34-247.us-east-2.compute.internal
    
  2. Upgrade and start all Confluent components on the host. For example:

    ansible-playbook \
      -i /path/to/hosts.yml confluent.platform.all \
      --limit ip-172-31-34-247.us-east-2.compute.internal