Reconfigure Confluent Platform after installation¶
You can use Ansible Playbooks for Confluent Platform to update the configuration of Confluent Platform components by rerunning the provisioning playbook with an updated inventory file.
There are two deployment strategies: rolling and parallel. Parallel is the default mode for redeployment on running clusters.
Parallel deployment¶
In a parallel deployment, the deployment steps happen across all nodes in a component at once. This method saves time, but leads to a service-wide simultaneous restart.
Because rolling deployments are less impactful and do not cause a service disruption, they are generally the safer option, but they do not work for every use case. Major authentication and encryption changes do not work in a rolling redeployment because, taking authentication as an example, the first node will be restarted with an updated authentication mechanism that is invalid against the rest of the cluster.
The following reconfiguration use cases are best handled with a parallel redeployment:
- Major authentication changes
- Updating certificates signed by a new CA or intermediate CA
- Enabling RBAC
Rolling deployment¶
In a rolling deployment, one node at a time is reconfigured, redeployed, and health-checked before the playbook moves on to the next. If the deployment fails on a node, the playbook stops, and all remaining nodes stay untouched and keep the old configuration.
The following reconfigurations are best handled with a rolling deployment:
- Simple property updates, such as the Kafka property log.retention.hours
- Java argument updates
- Environment variable updates
- Updating certificates which are signed by the same CA or intermediate CA
To enable rolling deployment mode, set the following variable:
deployment_strategy: rolling
Or, to select specific components to use the rolling deployment mode, set the following variables:
zookeeper_deployment_strategy: rolling
kafka_broker_deployment_strategy: rolling
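Before rerunning the playbook, it can help to confirm which strategy variables the inventory actually sets. The following is a minimal sketch, assuming the inventory is a plain YAML file named hosts.yml (the name used in the commands in this section); the function name show_deployment_strategy is hypothetical and not part of cp-ansible.

```shell
# List every deployment-strategy variable set in an inventory file.
# Assumption: the inventory is plain YAML; the default path hosts.yml
# matches the commands used elsewhere in this section.
show_deployment_strategy() {
  inventory="${1:-hosts.yml}"
  # Matches deployment_strategy and <component>_deployment_strategy,
  # and skips lines that are commented out.
  grep -nE '^[[:space:]]*[a-z_]*deployment_strategy:' "$inventory"
}
```

Calling show_deployment_strategy hosts.yml prints each matching line with its line number. If nothing matches, grep exits with a non-zero status, which means the inventory does not set a strategy and the default applies.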
Additional variables and tags for reconfiguration¶
To have cp-ansible pause after each node passes its health check, set the following variable to true. The Ansible output will stop and wait for user input before proceeding. This can be useful if you want to do additional manual verification on each node.
pause_rolling_deployment: true
To specify a component to pause on, set <component>_pause_rolling_deployment to true. For example:
zookeeper_pause_rolling_deployment: true
kafka_broker_pause_rolling_deployment: true
The Ansible tag package is applied to the package installation tasks. It is highly recommended to skip those tasks during reconfiguration so that no upgrade happens. Skipping them also saves time. Add the following argument to your Ansible command:
--skip-tags package
Update the Confluent Platform configuration¶
Before proceeding with any update, create a backup of your existing inventory file in a version control system, such as Git. This allows you to roll back to a previous configuration in case the reconfiguration fails.
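The backup step could look like the sketch below. It assumes Git is installed, the working directory is already a Git repository, and the inventory is named hosts.yml; the function names snapshot_inventory and restore_inventory are hypothetical.

```shell
# Commit the current inventory so it can be restored later.
# Assumptions: Git is installed, the working directory is already a Git
# repository, and the inventory is named hosts.yml.
# The commit fails (non-zero status) if the inventory has not changed
# since the last commit.
snapshot_inventory() {
  inventory="${1:-hosts.yml}"
  git add -- "$inventory" &&
    git commit -q -m "Inventory before reconfiguration" -- "$inventory"
}

# Restore the inventory as it was in the most recent commit,
# discarding any uncommitted edits to that file.
restore_inventory() {
  inventory="${1:-hosts.yml}"
  git checkout HEAD -- "$inventory"
}
```

Run snapshot_inventory before editing the inventory. If the reconfiguration fails, restore_inventory brings back the committed version, which you can then redeploy.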
Update your inventory file to reflect desired property changes on the cluster.
Run the provisioning playbook.
For a rolling deployment, run:
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package \
  --extra-vars deployment_strategy=rolling
The --extra-vars argument overrides variables in your inventory file.
For a parallel deployment, run:
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package
Failure handling¶
The following options are supported when a configuration update fails.
Note that many Confluent Platform components (especially Kafka) can handle single node outages.
If a deployment fails on a node, roll back that node by reverting your inventory file and redeploying on it:
# Revert your inventory file and run the following command.
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package \
  --limit <broken-node>
Try a new configuration on the broken node. Update your inventory file once again and redeploy on the node:
# Update your inventory file and run the following command.
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package \
  --limit <broken-node>

# Now deploy against all nodes.
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package
Enter the following command if you need a parallel restart for the change to work (for example, when enabling RBAC):
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package \
  -e deployment_strategy=parallel
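The commands in this section differ only in the deployment strategy and the optional --limit target, so they can be generated from one place. The following is a minimal sketch, assuming the inventory file is hosts.yml; the function name reconfigure_cmd is hypothetical, and it only prints the command so that you can review it before running it.

```shell
# Print the ansible-playbook command for a reconfiguration run.
# Usage: reconfigure_cmd <rolling|parallel> [host-or-group-limit]
# Assumption: the inventory file is hosts.yml, as in the commands above.
reconfigure_cmd() {
  strategy="$1"
  limit="${2:-}"
  # Reject anything other than the two documented strategies.
  case "$strategy" in
    rolling|parallel) ;;
    *) echo "strategy must be rolling or parallel" >&2; return 1 ;;
  esac
  cmd="ansible-playbook -i hosts.yml confluent.platform.all --skip-tags package"
  cmd="$cmd --extra-vars deployment_strategy=$strategy"
  # Restrict the run to one host or group when a limit is given.
  if [ -n "$limit" ]; then
    cmd="$cmd --limit $limit"
  fi
  echo "$cmd"
}
```

For example, reconfigure_cmd rolling prints the rolling redeployment command, and passing a host name as the second argument adds a --limit for redeploying on a single failed node.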