Prepare Ansible Inventory File to Install Confluent Platform
Before running the Ansible playbooks, you need to generate an inventory file. The inventory file specifies the hosts on which to provision Confluent Platform components. For more information about the Ansible inventory file, see Ansible Inventory Basics.
Generate an inventory file
To generate an inventory file, gather all of the Fully Qualified Domain Names (FQDNs) of your hosts and create a file called hosts.yml on your Ansible control node, setting each hostname under the desired groups as shown below.
The built-in inventory_hostname variable of each host is set to the hostname of that host. The hostname can be an internal or external address, as long as it is reachable from the control node.
Note
You cannot co-locate Control Center on the same host as Control Center (Legacy).
You cannot co-locate Control Center on the same host as KRaft because of the port collision on 9093.
You cannot install Control Center on a host on an Internet Protocol version 6 (IPv6) network.
An example inventory file snippet when using KRaft on an IPv4 network:
kafka_controller:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
    ip-172-31-37-15.us-east-2.compute.internal:
    ip-172-31-34-231.us-east-2.compute.internal:
kafka_broker:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
    ip-172-31-34-247.us-east-2.compute.internal:
    ip-172-31-34-248.us-east-2.compute.internal:
schema_registry:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
kafka_rest:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
ksql:
  hosts:
    ip-172-31-37-15.us-east-2.compute.internal:
    ip-172-31-37-16.us-east-2.compute.internal:
kafka_connect:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
control_center:
  hosts:
    ip-172-31-37-15.us-east-2.compute.internal:
control_center_next_gen:
  hosts:
    ip-172-31-34-247.us-east-2.compute.internal:
When using ZooKeeper, replace the KRaft controller (kafka_controller) hosts in the above example with ZooKeeper hosts as shown below:
zookeeper:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:
    ip-172-31-37-15.us-east-2.compute.internal:
    ip-172-31-34-231.us-east-2.compute.internal:
An example inventory file snippet when deploying Confluent Platform with KRaft on an IPv6 network:
kafka_controller:
  hosts:
    2600:1f14:1a1d:6904:5b64:5d77:4417:370b:
kafka_broker:
  hosts:
    2600:1f14:1a1d:6904:5b64:5d77:4417:370b:
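Once hosts.yml is in place, you can optionally verify that the groups and hosts are structured as intended. One way to do this, assuming a standard Ansible installation where the ansible-inventory command is available, is to print the inventory as a tree:

ansible-inventory -i /path/to/hosts.yml --graph

Each Confluent Platform group should appear in the output with the hosts you assigned to it.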
Manage connections to Confluent Platform hosts
Confluent Ansible supports two ways to manage connections to Confluent Platform hosts:
SSH
AWS Systems Manager (SSM)
Use SSH to connect to Confluent Platform hosts
After generating an inventory file, set connection variables so that the Ansible control node can connect to each Confluent Platform host. Most commonly, Ansible uses SSH for its connections.
For more information about setting up connection variables, see Connecting to hosts: behavioral inventory parameters.
Add the following section to hosts.yml:
all:
  vars:
    ansible_connection: ssh
    ansible_user: ec2-user
    ansible_become: true
    ansible_ssh_private_key_file: /tmp/certs/ssh_priv.pem
Use the following command to verify that Ansible can connect over SSH:
ansible -i /path/to/hosts.yml all -m ping
The above command validates that a Python interpreter is available on all of the hosts, and it returns pong on success.
If you cannot reach the host even after providing the SSH private key path in the hosts.yml file for Ansible, the SSH private key file might have incorrect permissions.
Use the chmod command to update the permissions of the SSH private key file to read-only for the owner.
chmod 400 <ansible_ssh_private_key_file>
It is recommended that you store your inventory file in its own Git repository, for example one repository containing an inventory file for each of your deployments, as sketched below.
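A purely illustrative repository layout (the directory and file names below are assumptions, not a Confluent Ansible requirement) might look like this:

confluent-inventories/
├── production/
│   └── hosts.yml
├── staging/
│   └── hosts.yml
└── README.md

You then provision a given deployment by pointing the -i option at the corresponding hosts.yml.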
Use ansible_host for SSH connections
When your inventory_hostname is not reachable over SSH, you can specify an alternate hostname for the SSH connection using the ansible_host variable.
In the following example, ip-172-31-40-189.us-west-2.compute.internal is the inventory_hostname, and ec2-34-217-174-252.us-west-2.compute.amazonaws.com is used for SSH.
kafka_broker:
  hosts:
    ip-172-31-40-189.us-west-2.compute.internal:
      ansible_host: ec2-34-217-174-252.us-west-2.compute.amazonaws.com
Use AWS Systems Manager to connect to Confluent Platform hosts
You can use AWS Systems Manager (SSM) to securely manage connections with Confluent Platform hosts on AWS EC2 instances without requiring traditional SSH access.
AWS infrastructure prerequisites
Before running the playbook with SSM configuration, ensure the following AWS infrastructure components are correctly configured to work with AWS Systems Manager.
Required libraries and tools
The Ansible control node requires additional libraries and tools to be installed in the Ansible environment so that it can use the SSM connection plugin, as sketched below.
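As a hedged sketch (confirm the exact requirements for your version of Confluent Ansible), the amazon.aws.aws_ssm connection plugin generally depends on the boto3 and botocore Python libraries, the amazon.aws Ansible collection, and the AWS Session Manager plugin being present on the control node:

# Python SDK libraries used by the amazon.aws collection
pip install boto3 botocore

# Ansible collection that provides the aws_ssm connection plugin
ansible-galaxy collection install amazon.aws

# The AWS Session Manager plugin is installed separately;
# follow the AWS documentation for your operating system.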
Infrastructure setup
| Requirement | Description |
|---|---|
| IAM Roles (mandatory) | EC2 instances must have IAM roles attached for agent communication and S3 access. For more information, see Configure instance permissions for Systems Manager. |
| S3 Bucket (mandatory) | An S3 bucket is required by the Ansible SSM connection plugin for the transfer of files. |
| VPC Endpoints (optional) | Interface VPC Endpoints are required if you want to enable private subnet access. For more information, see Using VPC Endpoints for Systems Manager. |
| Security Groups (optional, required if VPC endpoints are created) | Security Groups must be configured for the EC2 instances and VPC Endpoints to securely route traffic. |
For more information about these requirements, see the AWS Systems Manager documentation.
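As an optional check before running the playbooks (the command below is standard AWS CLI, not part of Confluent Ansible), you can confirm that your EC2 instances are registered with Systems Manager:

aws ssm describe-instance-information --region us-east-2

Each target host should appear in the returned list of managed instances; substitute your own region.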
Inventory file configuration
To enable the SSM connection, define the following variables globally within the all:vars section of your inventory file:
| Variable Name | Value | Description |
|---|---|---|
| ansible_connection | amazon.aws.aws_ssm | Required. Instructs Ansible to use the SSM connection plugin. |
| ansible_aws_ssm_bucket_name | Your S3 bucket name | The S3 staging bucket used for temporary file transfer. |
| ansible_aws_ssm_region | Your AWS region | The AWS region where your EC2 instances and S3 bucket reside (for example, us-east-2). |
| ansible_become | true | Required. Enables privilege escalation (sudo) on the remote host for configuration tasks. |
Host definition changes
The host definition must use the IP address or DNS name of the EC2 instance as the inventory hostname, but the actual connection is directed to the Instance ID using the ansible_host variable.
| Variable Name | Purpose |
|---|---|
| Inventory Hostname | Use the EC2 instance’s network address (for example, ip-172-31-34-246.us-east-2.compute.internal). |
| ansible_host | Required. Set this to the EC2 Instance ID (for example, i-08c4235335ff8b977). |
Example inventory structure:
all:
  vars:
    ansible_connection: amazon.aws.aws_ssm
    ansible_aws_ssm_region: "us-east-2"
    ansible_aws_ssm_bucket_name: "cp-ansible-ssm-staging-bucket"
    ansible_become: true
kafka_controller:
  hosts:
    # This is the EC2 instance's network address, mandatory for health checks
    ip-172-31-34-246.us-east-2.compute.internal:
      # ansible_host MUST be the EC2 Instance ID for SSM connection
      ansible_host: i-08c4235335ff8b977
kafka_broker:
  hosts:
    ip-172-31-34-247.us-east-2.compute.internal:
      ansible_host: i-09e133c94f11d8844
    ip-172-31-34-248.us-east-2.compute.internal:
      ansible_host: i-08d133c94f11d8845
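With the SSM variables and Instance IDs in place, the same connectivity check used for SSH also works over Systems Manager and should return pong for each host:

ansible -i /path/to/hosts.yml all -m ping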
Important
If you have enabled hostname aliasing by setting hostname_aliasing_enabled to true, your inventory requires an additional change.
Previously, you may have used the ansible_host variable to specify the host’s FQDN. Because ansible_host must now strictly contain the EC2 Instance ID (the SSM target), you must move the host’s network address to the dedicated hostname variable to maintain correct aliasing within the platform, as sketched below.
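A minimal sketch of that change, reusing the Instance ID and network address from the example above, with kafka-broker-1 as a placeholder alias:

kafka_broker:
  hosts:
    kafka-broker-1:                                            # inventory alias (placeholder)
      ansible_host: i-09e133c94f11d8844                        # SSM target: EC2 Instance ID
      hostname: ip-172-31-34-247.us-east-2.compute.internal    # network address, moved here from ansible_host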