Migrate Stand-alone Confluent Platform to Ansible Environment¶
Confluent Ansible Discovery is a tool that helps you to migrate a Confluent Platform cluster that was not originally installed by Ansible Playbooks for Confluent Platform (Confluent Ansible) to the Ansible environment.
The tool logs into each host and discovers which Confluent Platform services are running on the host, builds out properties for each Confluent Platform component, and maps them against the Confluent Ansible configurations. The output of the tool is an Ansible inventory file that represents your current Confluent Platform cluster. You can use that inventory file along with Ansible to manage and upgrade the Confluent Platform cluster.
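For orientation, the generated inventory is a YAML Ansible inventory organized by Confluent Platform host groups. The following is a simplified, hypothetical sketch (the host names and connection settings are placeholders, and a real generated file also carries the discovered component properties):

all:
  vars:
    ansible_connection: ssh
    ansible_user: centos
zookeeper:
  hosts:
    host-1.mydomain.com:
kafka_broker:
  hosts:
    host-3.mydomain.com: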
It is recommended that you run the tool in a non-production environment and manually validate its output.
Requirements¶
The following requirements are specific to Confluent Ansible Discovery, not to the Confluent Ansible managed nodes.
- The host machine (where the Discovery tool is to run) has password-less SSH access to all Confluent Platform hosts. A minimal setup sketch follows this list.
- The username has been provided if it is other than root.
- All Confluent Platform hosts have the same Linux distribution installed.
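The following is a minimal sketch of setting up password-less SSH from the Discovery host; the user name and host name are examples:

# Generate a key pair if you do not already have one.
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa

# Copy the public key to a Confluent Platform host; repeat for each host.
ssh-copy-id -i ~/.ssh/id_rsa.pub centos@host-1.mydomain.com

# Verify that login works without a password prompt.
ssh centos@host-1.mydomain.com 'echo ok'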
The following dependencies for Confluent Ansible Discovery must be installed on the machine from which you plan to run the tool (one possible installation command follows the list):
- Python 3.8+
- Ansible 2.11
- PyYAML 6.0
- Ansible Runner
- jProperties
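The following is one way to install these dependencies with pip; the PyPI package names and version pins are assumptions, so adjust them to match your environment:

python3 -m pip install "ansible-core==2.11.*" "PyYAML==6.0" ansible-runner jproperties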
Run Confluent Ansible Discovery¶
Create your input file in the .yml format, and specify the settings:

all:
  vars:
    ansible_connection: --- [1]
    ansible_become: --- [2]
    ansible_python_interpreter: --- [3]
    ansible_user: --- [4]
    ansible_become_user: --- [5]
    ansible_become_method: --- [6]
    ansible_ssh_extra_args: --- [7]
    ansible_ssh_private_key_file: --- [8]
    from_version: --- [9]
    verbosity: --- [10]
    output_file: --- [11]
    service_override: --- [12]
      zookeeper_service_name:
      kafka_broker_service_name:
      schema_registry_service_name:
      kafka_connect_service_name:
      kafka_rest_service_name:
      ksql_service_name:
      control_center_service_name:
- [1] Required. The connection plugin used for the task on the target host. Specify ssh or docker.
- [2] Boolean that determines whether to use privilege escalation. The default is False.
- [3] Python interpreter path. The default is auto.
- [4] The user to log in to the hosts.
- [5] The user after privilege escalation.
- [6] The method used to become privileged. The default is sudo.
- [7] Extra arguments for SSH.
- [8] The path to the private key for SSH login.
- [9] Confluent Platform version.
- [10] Log level between 0 and 4, 4 being the most verbose. The default value is 3 (INFO).
- [11] Output inventory file path. The default value is $(pwd)/inventory.yml.
- [12] Use this section to override the default service names.
The following is a snippet of a sample input file:

vars:
  ansible_connection: ssh
  ansible_user: centos
  ansible_become: true
  ansible_ssh_extra_args: -o StrictHostKeyChecking=no
  ansible_ssh_private_key_file: ~/Work/keys/muckrake.pem
In the input file, specify a list of the hosts on which the Confluent Platform components are running.
Confluent Ansible Discovery must be able to discover the Confluent Platform services.
All Confluent Platform hosts must have the same Linux distribution installed.
The user running the tool must be able to log into the hosts.
For a cluster with known host group mappings, specify a list of host groups and their hosts. Confluent Ansible Discovery will look for the configured services on the given hosts. Update the service name if using custom service names.
The following is a sample input file for Confluent Ansible Discovery (hosts.yml) with known host group mappings:

all:
  hosts:
    zookeeper:
      -
    kafka_broker:
      -
    schema_registry:
      -
    kafka_connect:
      -
    kafka_rest:
      -
    ksql:
      -
    control_center:
      -
For example:
hosts:
  zookeeper:
    - host-1.mydomain.com
    - host-2.mydomain.com
  kafka_broker:
    - host-3.mydomain.com
    - host-4.mydomain.com
  schema_registry:
    - host-5.mydomain.com
    - host-6.mydomain.com
For a cluster with unknown host group mappings, specify a list of hosts. In this case, Confluent Ansible Discovery will try to map the service to the corresponding hosts.
For example:
all:
  hosts:
    - host-1.mydomain.com
    - host-2.mydomain.com
    - host-3.mydomain.com
Run Confluent Ansible Discovery using the file you created in the previous step (<path-to-the-input-file>):

cd <some_path>/ansible_collections/confluent/platform
PYTHONPATH=. python discovery/main.py --input <path-to-the-input-file> [optional arguments]
The following arguments are supported. If the same setting exists in the input file, the command-line argument overrides what is in the input file:

- --input: Input file path.
- --limit: A list of host names to be discovered. Use the flag to limit the discovery to a subset of hosts specified in the input file.
- --from_version: Confluent Platform version.
- --verbosity: The verbosity level between 0 and 4, where 4 is the most verbose.
- --output_file: The output inventory file name. The default value is ./inventory.yml.
For example:
python discovery/main.py --input discovery/hosts.yml --verbosity 4 --limit host1,host2
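Once the inventory is generated, you can point Confluent Ansible at it; for example (a sketch, assuming the generated file is inventory.yml and the confluent.platform collection's all playbook):

ansible-playbook -i inventory.yml confluent.platform.all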
Troubleshoot¶
- Issue: The tool returns an error that it failed to create the temporary directory.
  The default Ansible temp directory might be locked by another Ansible process.
  Solution: Delete the temp directory in the Ansible home directory. The Ansible home directory is defined in ansible.cfg, and the default is ~/.ansible/tmp.
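For example, assuming the default location:

rm -rf ~/.ansible/tmp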
Known issues and limitations¶
- If passwords are encrypted using secret encryption or any other encryption algorithm, this tool cannot decrypt them. You must explicitly add the passwords in the generated inventory file to continue using Confluent Ansible.
- The input file doesn’t support regular expressions (regex) for host name patterns.
- Confluent Ansible Discovery doesn’t support the Confluent Platform community edition.
- The Confluent Platform services must be up and running when you run the tool; otherwise, the tool ignores those nodes/services.
- If secrets protection is enabled on the cluster, the tool can populate the master key. However, you must fill in all passwords before using the playbooks, as illustrated below.
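For example, after discovery you might supply a plaintext password under the relevant host group in the generated inventory; the variable name here is illustrative only, not a specific Confluent Ansible property:

kafka_broker:
  vars:
    kafka_broker_keystore_storepass: <your-plaintext-password>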