Migrate Stand-alone Confluent Platform to Ansible Environment

Confluent Ansible Discovery is a tool that helps you to migrate a Confluent Platform cluster that was not originally installed by Ansible Playbooks for Confluent Platform (Confluent Ansible) to the Ansible environment.

The tool logs into each host and discovers which Confluent Platform services are running on the host, builds out properties for each Confluent Platform component, and maps them against the Confluent Ansible configurations. The output of the tool is an Ansible inventory file that represents your current Confluent Platform cluster. You can use that inventory file along with Ansible to manage and upgrade the Confluent Platform cluster.

We recommend that you first run the tool in a non-production environment and manually validate the generated inventory file.

Requirements

The following requirements apply to Confluent Ansible Discovery, not to the nodes managed by Confluent Ansible.

  • The machine where the Discovery tool runs has password-less SSH access to all Confluent Platform hosts.
  • If the SSH user is not root, the username is specified (ansible_user in the input file).
  • All Confluent Platform hosts have the same Linux distribution installed.
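The password-less SSH requirement can be checked before you run the tool. The following is a minimal sketch, not part of Confluent Ansible Discovery. The function names are hypothetical, and it assumes the OpenSSH client (ssh) is installed on the machine where the tool runs.

```python
import subprocess


def build_ssh_check_command(host, user=None, key_file=None, timeout=5):
    """Build an ssh command that fails instead of prompting for a password."""
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",              # never prompt; fail if a password is needed
        "-o", f"ConnectTimeout={timeout}",  # give up quickly on unreachable hosts
    ]
    if key_file:
        cmd += ["-i", key_file]
    target = f"{user}@{host}" if user else host
    # Run a trivial remote command so that exit code 0 means the login worked.
    return cmd + [target, "true"]


def has_passwordless_ssh(host, user=None, key_file=None, timeout=5):
    """Return True if the host accepts a non-interactive SSH login."""
    cmd = build_ssh_check_command(host, user, key_file, timeout)
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout + 5)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

For example, has_passwordless_ssh("host-1.mydomain.com", user="centos") returns False if the host asks for a password. Because BatchMode disables all prompts, a host whose key is not yet in known_hosts is also likely to be reported as failing.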

The following dependencies for Confluent Ansible Discovery must be installed on the machine from which you plan to run the tool:

Run Confluent Ansible Discovery

  1. Create your input file in the .yml format, and specify the settings:

    all:
      vars:
        ansible_connection:             --- [1]
        ansible_become:                 --- [2]
        ansible_python_interpreter:     --- [3]
        ansible_user:                   --- [4]
        ansible_become_user:            --- [5]
        ansible_become_method:          --- [6]
        ansible_ssh_extra_args:         --- [7]
        ansible_ssh_private_key_file:   --- [8]
        from_version:                   --- [9]
        verbosity:                      --- [10]
        output_file:                    --- [11]
        service_override:               --- [12]
          zookeeper_service_name:
          kafka_broker_service_name:
          schema_registry_service_name:
          kafka_connect_service_name:
          kafka_rest_service_name:
          ksql_service_name:
          control_center_service_name:
    
    • [1] Required. The connection plugin used for the task on the target host. Specify ssh or docker.
    • [2] Boolean that enables privilege escalation. The default is False.
    • [3] Python interpreter path. The default is auto.
    • [4] The user for logging into the hosts.
    • [5] The user to become after privilege escalation.
    • [6] The privilege escalation method. The default is sudo.
    • [7] Extra arguments for SSH.
    • [8] The path to the private key for SSH login.
    • [9] Confluent Platform version.
    • [10] Log level between 0 and 4, 4 being the most verbose. The default value is 3 (INFO).
    • [11] Output inventory file path. The default value is $(pwd)/inventory.yml.
    • [12] Use this section to override the default service names.

    The following is a snippet of a sample input file:

    vars:
      ansible_connection: ssh
      ansible_user: centos
      ansible_become: true
      ansible_ssh_extra_args: -o StrictHostKeyChecking=no
      ansible_ssh_private_key_file: ~/Work/keys/muckrake.pem
    
  2. In the input file, specify a list of the Confluent Platform hosts on which the Confluent Platform components are running.

    Confluent Ansible Discovery must be able to discover the Confluent Platform services.

    All Confluent Platform hosts must have the same Linux distribution installed.

    The user running the tool must be able to log into the hosts.

    • For a cluster with known host group mappings, specify a list of host groups and their hosts. Confluent Ansible Discovery will look for the configured services on the given hosts. If you use custom service names, set them in the service_override section.

      The following is a sample input file for Confluent Ansible Discovery (hosts.yml) with known host group mappings.

      all:
        hosts:
          zookeeper:
            -
          kafka_broker:
            -
          schema_registry:
            -
          kafka_connect:
            -
          kafka_rest:
            -
          ksql:
            -
          control_center:
            -
      

      For example:

      hosts:
        zookeeper:
          - host-1.mydomain.com
          - host-2.mydomain.com
        kafka_broker:
          - host-3.mydomain.com
          - host-4.mydomain.com
        schema_registry:
          - host-5.mydomain.com
          - host-6.mydomain.com
      
    • For a cluster with unknown host group mappings, specify a list of hosts. In this case, Confluent Ansible Discovery will try to map the services to the corresponding hosts.

      For example:

      all:
        hosts:
          - host-1.mydomain.com
          - host-2.mydomain.com
          - host-3.mydomain.com
      
  3. Run Confluent Ansible Discovery using the file you created in the previous step (<path-to-the-input-file>):

    cd <some_path>/ansible_collections/confluent/platform
    
    PYTHONPATH=. python discovery/main.py --input <path-to-the-input-file> [optional arguments]
    

    If the same setting also exists in the input file, the command-line argument takes precedence. The following arguments are supported:

    • --input: Input file path.
    • --limit: A comma-separated list of host names to discover. Use this flag to limit the discovery to a subset of the hosts specified in the input file.
    • --from_version: Confluent Platform version.
    • --verbosity: The verbosity level from 0 to 4, where 4 is the most verbose.
    • --output_file: The output inventory file name. The default value is ./inventory.yml.

    For example:

    PYTHONPATH=. python discovery/main.py --input discovery/hosts.yml --verbosity 4 --limit host1,host2
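
The way defaults, input-file settings, and command-line arguments combine can be sketched as follows. This is an illustration of the precedence described in this procedure, not the tool's own code. The function names are hypothetical, and the checks cover only the constraints stated in this document (ansible_connection is required and must be ssh or docker; verbosity ranges from 0 to 4).

```python
# Defaults as stated in the settings list of this procedure.
DEFAULTS = {
    "ansible_become": False,
    "ansible_python_interpreter": "auto",
    "ansible_become_method": "sudo",
    "verbosity": 3,
    "output_file": "./inventory.yml",
}

# Settings that the documented command-line flags can override.
CLI_OVERRIDABLE = ("from_version", "verbosity", "output_file")


def resolve_settings(file_vars, cli_args):
    """Merge defaults, input-file vars, and CLI arguments, in that order."""
    settings = dict(DEFAULTS)
    settings.update(file_vars)
    for key in CLI_OVERRIDABLE:
        # A missing or None CLI value means the flag was not passed.
        if cli_args.get(key) is not None:
            settings[key] = cli_args[key]
    return settings


def validate_settings(settings):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if settings.get("ansible_connection") not in ("ssh", "docker"):
        problems.append("ansible_connection is required and must be ssh or docker")
    verbosity = settings.get("verbosity")
    if not isinstance(verbosity, int) or not 0 <= verbosity <= 4:
        problems.append("verbosity must be an integer from 0 to 4")
    return problems
```

With file_vars = {"ansible_connection": "ssh", "verbosity": 2} and cli_args = {"verbosity": 4}, resolve_settings returns a verbosity of 4 and keeps the default output_file, which mirrors running the tool with --verbosity 4 against an input file that sets verbosity to 2.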
    

Troubleshoot

Issue: The tool returns an error that it failed to create the temporary directory

The default Ansible temp directory might be locked by another Ansible process.

Solution: Delete the temp directory in the Ansible home directory. The temp directory location is defined in ansible.cfg, and the default is ~/.ansible/tmp.
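
The cleanup can also be scripted. The following is a minimal sketch that assumes the default location of ~/.ansible/tmp; the function name is hypothetical. If your ansible.cfg sets a different temp location, pass that path instead, and make sure no other Ansible process is running before you delete the directory.

```python
import shutil
from pathlib import Path


def clear_ansible_tmp(tmp_dir="~/.ansible/tmp"):
    """Delete the Ansible temp directory; return True if something was removed."""
    path = Path(tmp_dir).expanduser()
    if not path.is_dir():
        # Nothing to delete, for example after an earlier cleanup.
        return False
    shutil.rmtree(path)
    return True
```

Ansible is expected to recreate the directory the next time it needs it, so the sketch does not recreate it.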

Known issues and limitations

  • If passwords are encrypted using secret encryption or any other encryption algorithm, the tool cannot decrypt them. You must explicitly add the passwords to the generated inventory file to continue using Confluent Ansible.
  • The input file doesn’t support regular expressions (regex) for host name patterns.
  • Confluent Ansible Discovery doesn’t support the Confluent Platform community edition.
  • The Confluent Platform services must be up and running when you run the tool. Otherwise, the tool ignores those nodes and services.
  • If secrets protection is enabled on the cluster, the tool can populate the master key. However, you must fill in all passwords before using the playbooks.
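
Because host name patterns are not supported, it can help to check the host entries for pattern-like characters before running the tool. The following is a minimal sketch, not part of the tool. The function names and the set of characters treated as pattern-like are assumptions made for illustration. The hosts argument is the value of the hosts key after you load the input file, in either of the two supported shapes (host groups or a flat list).

```python
import re

# Characters that suggest a pattern rather than a literal host name.
# This set is an assumption for illustration, not the tool's own rule.
PATTERN_CHARS = re.compile(r"[*?\[\]{}()|^$+\\]")


def flatten_hosts(hosts):
    """Return host names from either input shape: a flat list or group -> list."""
    if isinstance(hosts, dict):
        names = []
        for group_hosts in hosts.values():
            # An empty group may load as None; treat it as having no hosts.
            names.extend(group_hosts or [])
        return names
    return list(hosts or [])


def find_pattern_like_hosts(hosts):
    """Return the entries that look like patterns instead of literal host names."""
    return [name for name in flatten_hosts(hosts) if PATTERN_CHARS.search(name)]
```

Any entry the check returns should be replaced with the explicit host names it was meant to cover.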