Advanced Confluent Platform Configurations with Ansible Playbooks

This section provides information about various deployment configurations for Confluent Platform using Ansible.

Configure Tiered Storage with GCS buckets

Use the Custom Properties and Copy Files features of Ansible Playbooks for Confluent Platform to configure Tiered Storage.

Place the GCP credentials JSON file on the Ansible control node.

Set the following variables in the hosts.yml file:

all:
  vars:
    kafka_broker_copy_files:
      - source_path: /tmp/gcloud-a5f9c87c81ae.json
        destination_path: /etc/security/google/creds.json

    kafka_broker_custom_properties:
      confluent.tier.feature: "true"
      confluent.tier.enable: "true"
      confluent.tier.backend: GCS
      confluent.tier.gcs.bucket: bucket-name
      confluent.tier.gcs.region: us-west2
      confluent.tier.gcs.cred.file.path: /etc/security/google/creds.json

The credential file destination path must match the confluent.tier.gcs.cred.file.path custom property.

Configure Tiered Storage with S3 buckets

Use the Custom Properties and Copy Files features of Ansible Playbooks for Confluent Platform to configure Tiered Storage.

Place the AWS credentials file on the Ansible control node.

Set the following variables in the hosts.yml file:

all:
  vars:
    kafka_broker_copy_files:
      - source_path: /tmp/credentials
        destination_path: /etc/security/aws/credentials

    kafka_broker_custom_properties:
      confluent.tier.feature: "true"
      confluent.tier.enable: "true"
      confluent.tier.backend: S3
      confluent.tier.s3.bucket: bucket-name
      confluent.tier.s3.region: us-west-2
      confluent.tier.s3.cred.file.path: /etc/security/aws/credentials

The credential file destination path must match the confluent.tier.s3.cred.file.path custom property.
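
The path-matching rule for both the GCS and S3 backends can be checked before running the playbook. The following is a minimal sketch, not part of Ansible Playbooks for Confluent Platform: the function name cred_path_problems is hypothetical, and it assumes the inventory variables have already been loaded into a Python dict.

```python
def cred_path_problems(inventory_vars, backend):
    """Return a list of problems; an empty list means the paths line up.

    backend is "gcs" or "s3", matching the property names used in this section.
    """
    prop = f"confluent.tier.{backend.lower()}.cred.file.path"
    cred_path = inventory_vars.get("kafka_broker_custom_properties", {}).get(prop)
    destinations = [
        entry.get("destination_path")
        for entry in inventory_vars.get("kafka_broker_copy_files", [])
    ]
    if cred_path is None:
        return [f"{prop} is not set"]
    if cred_path not in destinations:
        return [f"{prop} ({cred_path}) is not a kafka_broker_copy_files destination_path"]
    return []


# The S3 example from this section, written as a Python dict.
s3_vars = {
    "kafka_broker_copy_files": [
        {
            "source_path": "/tmp/credentials",
            "destination_path": "/etc/security/aws/credentials",
        },
    ],
    "kafka_broker_custom_properties": {
        "confluent.tier.backend": "S3",
        "confluent.tier.s3.cred.file.path": "/etc/security/aws/credentials",
    },
}

print(cred_path_problems(s3_vars, "s3"))  # prints []
```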

Deploy Confluent Replicator

Starting in the 6.1.0 release, Ansible Playbooks for Confluent Platform supports deployment of Confluent Replicator.

Using Ansible, you can deploy Replicator with the following security mechanisms:

SASL/PLAIN
SASL/SCRAM
Kerberos
mTLS
Plaintext (no authentication and no encryption)

The general deployment model is to deploy Replicator after both the source and destination clusters have been deployed.

We recommend creating an inventory file specifically for the Replicator deployment, excluding other cluster deployment-related configuration. In this section, an example file, replicator-hosts.yml, is used.

There are two clusters in this example, the source cluster and the destination cluster.

Replicator has four client connections split across the two clusters:

Replicator configuration connection to the cluster which is used for storing configuration information in topics. See Configure Replicator configuration connection.
Replicator monitoring connection which is used to produce metrics to the metrics cluster. This is often the same cluster as the cluster used to store configuration information. See Configure monitoring connection.
Replicator consumer connection which is used to consume data from the source cluster. See Configure consumer connection.
Replicator producer connection which is used to produce data to the destination cluster. See Configure producer connection.
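
Each of the four connections is configured through inventory variables that share a common prefix, as the variable names in the following sections show. The sketch below is a hypothetical helper (connection_for is not part of the playbooks) that maps a variable name to the connection it configures. A name that matches no prefix, such as one with a misspelled prefix, returns None.

```python
# Prefixes taken from the variable names used in this section.
CONNECTION_PREFIXES = {
    "kafka_connect_replicator_consumer_": "consumer",
    "kafka_connect_replicator_producer_": "producer",
    "kafka_connect_replicator_monitoring_interceptor_": "monitoring",
    "kafka_connect_replicator_": "configuration",
}


def connection_for(variable_name):
    """Return the connection a variable configures, or None if no prefix matches."""
    # Try longer prefixes first so the generic prefix does not shadow the others.
    for prefix in sorted(CONNECTION_PREFIXES, key=len, reverse=True):
        if variable_name.startswith(prefix):
            return CONNECTION_PREFIXES[prefix]
    return None


for name in (
    "kafka_connect_replicator_white_list",
    "kafka_connect_replicator_consumer_bootstrap_servers",
    "kafka_connect_replicator_monitoring_interceptor_erp_host",
    "kakfa_connect_replicator_rbac_enabled",  # misspelled on purpose
):
    print(name, "->", connection_for(name))
```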

The following sections list the configuration properties required in the Replicator inventory file. The examples use:

SASL/PLAIN with TLS on the source cluster
Kerberos with TLS on the destination cluster

After configuring Replicator, deploy it with the following command, which uses the example inventory file, replicator-hosts.yml:

ansible-playbook -i replicator-hosts.yml playbooks/all.yml

Configure Replicator configuration connection

Define the kafka_connect_replicator group and hosts to deploy to.

For example:

kafka_connect_replicator:
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:

Define the listener for the Replicator configuration cluster.
The following is an example of a listener with Kerberos authentication and TLS enabled:
```
kafka_connect_replicator_listener:
  ssl_enabled: true
  ssl_mutual_auth_enabled: false
  sasl_protocol: kerberos
```

Define the basic configuration for the Replicator connection:

kafka_connect_replicator_white_list: <a comma-separated list of topics to be replicated>
kafka_connect_replicator_bootstrap_servers: <configuration cluster hostname:port>

Define security configuration for the Replicator connection:

kafka_connect_replicator_kerberos_principal: <Kafka principal primary>
kafka_connect_replicator_kerberos_keytab_path: <path to your keytab>
kafka_connect_replicator_ssl_ca_cert_path: <path to your CA certificate>
kafka_connect_replicator_ssl_cert_path: <path to your signed certificate>
kafka_connect_replicator_ssl_key_path: <path to your SSL key>
kafka_connect_replicator_ssl_key_password: <SSL key password>

For RBAC-enabled deployment, define the additional security configuration.

Specify either the Kafka cluster id (kafka_connect_replicator_kafka_cluster_id) or the cluster name (kafka_connect_replicator_kafka_cluster_name).

kafka_connect_replicator_rbac_enabled: true
kafka_connect_replicator_erp_tls_enabled: <true if Confluent REST API has TLS enabled>
kafka_connect_replicator_erp_host: <Confluent REST API host URL>
kafka_connect_replicator_erp_admin_user: <mds or your Kafka super user>
kafka_connect_replicator_erp_admin_password: <password>
kafka_connect_replicator_kafka_cluster_id: <destination cluster id>
kafka_connect_replicator_kafka_cluster_name: <destination cluster name>
kafka_connect_replicator_erp_pem_file: <path to oauth pem file>
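
The "cluster id or cluster name" requirement applies to each connection prefix in this section. As a rough pre-flight check, the following sketch tests whether at least one of the two variables is set for a given prefix. The function name has_cluster_identifier is hypothetical, and the sketch assumes the inventory variables are available as a Python dict.

```python
def has_cluster_identifier(inventory_vars, prefix="kafka_connect_replicator"):
    """True when the cluster id or the cluster name is set for the given prefix."""
    keys = (f"{prefix}_kafka_cluster_id", f"{prefix}_kafka_cluster_name")
    return any(inventory_vars.get(key) for key in keys)


# Placeholder values for illustration only.
example_vars = {
    "kafka_connect_replicator_rbac_enabled": True,
    "kafka_connect_replicator_kafka_cluster_name": "destination",
}

print(has_cluster_identifier(example_vars))
print(has_cluster_identifier(example_vars, prefix="kafka_connect_replicator_consumer"))
```

The same call works for the consumer, producer, and monitoring interceptor prefixes by changing the prefix argument.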

Set the CLASSPATH to the Replicator installation directory in kafka_connect_service_environment_overrides:
```
kafka_connect_service_environment_overrides:
  CLASSPATH: <path to replicator install>/*
```
For more information about setting required Confluent Platform environment variables using Ansible, see Set environment variables.

Configure consumer connection

Define the configuration for the consumer listener on the source cluster.

The following is an example with TLS and SASL/PLAIN enabled:

kafka_connect_replicator_consumer_listener:
  ssl_enabled: true
  ssl_mutual_auth_enabled: false
  sasl_protocol: plain

Define the basic configuration for the consumer client connection:

kafka_connect_replicator_consumer_bootstrap_servers: <source cluster hostname:port>

Define the security configuration for the consumer client connection:

kafka_connect_replicator_consumer_ssl_ca_cert_path: <path to your CA certificate>
kafka_connect_replicator_consumer_ssl_cert_path: <path to your signed certificate>
kafka_connect_replicator_consumer_ssl_key_path: <path to your SSL key>
kafka_connect_replicator_consumer_ssl_key_password: <SSL key password>

Define custom properties for each client connection:

kafka_connect_replicator_consumer_custom_properties:
  <custom property: value>

For RBAC-enabled deployment, define the additional client custom properties.

Specify either the Kafka cluster id (kafka_connect_replicator_consumer_kafka_cluster_id) or the cluster name (kafka_connect_replicator_consumer_kafka_cluster_name).

kafka_connect_replicator_consumer_erp_tls_enabled: <true if Confluent REST API has TLS enabled>
kafka_connect_replicator_consumer_erp_host: <Confluent REST API host URL>
kafka_connect_replicator_consumer_erp_admin_user: <mds or your Kafka super user>
kafka_connect_replicator_consumer_erp_admin_password: <password>
kafka_connect_replicator_consumer_kafka_cluster_id: <source cluster id>
kafka_connect_replicator_consumer_kafka_cluster_name: <source cluster name>
kafka_connect_replicator_consumer_erp_pem_file: <path to oauth pem file>

Configure producer connection

Define the listener configuration for the producer connection to the destination cluster.
The following is an example with TLS and Kerberos for authentication enabled:
```
kafka_connect_replicator_producer_listener:
  ssl_enabled: true
  ssl_mutual_auth_enabled: false
  sasl_protocol: kerberos
```

Define the basic producer configuration:

kafka_connect_replicator_producer_bootstrap_servers: <destination cluster hostname:port>

Define the security configuration for the producer connection:

kafka_connect_replicator_producer_kerberos_principal: <kafka principal primary>
kafka_connect_replicator_producer_kerberos_keytab_path: <path to your keytab>
kafka_connect_replicator_producer_ssl_ca_cert_path: <path to your CA cert>
kafka_connect_replicator_producer_ssl_cert_path: <path to your signed cert>
kafka_connect_replicator_producer_ssl_key_path: <path to your ssl key>
kafka_connect_replicator_producer_ssl_key_password: <ssl key password>

Define custom properties for each client connection:

kafka_connect_replicator_producer_custom_properties:
  <custom property: value>

For RBAC-enabled deployment, define the additional producer custom properties.

The kafka_connect_replicator_producer settings default to the corresponding kafka_connect_replicator settings. The following variables are required only if you produce to a different cluster than the one that stores your configuration.

Specify either the Kafka cluster id (kafka_connect_replicator_producer_kafka_cluster_id) or the cluster name (kafka_connect_replicator_producer_kafka_cluster_name).

kafka_connect_replicator_producer_rbac_enabled: true
kafka_connect_replicator_producer_erp_tls_enabled: <true if Confluent REST API has TLS enabled>
kafka_connect_replicator_producer_erp_host: <Confluent REST API host URL>
kafka_connect_replicator_producer_erp_admin_user: <mds or your Kafka super user>
kafka_connect_replicator_producer_erp_admin_password: <password>
kafka_connect_replicator_producer_kafka_cluster_id: <destination cluster id>
kafka_connect_replicator_producer_kafka_cluster_name: <destination cluster name>
kafka_connect_replicator_producer_erp_pem_file: <path to oauth pem file>

Configure monitoring connection

Define the listener configuration for the monitoring interceptors:

kafka_connect_replicator_monitoring_interceptor_listener:
  ssl_enabled: true
  ssl_mutual_auth_enabled: false
  sasl_protocol: kerberos

Define the basic monitoring configuration:

kafka_connect_replicator_monitoring_interceptor_bootstrap_servers: <monitoring cluster hostname:port>

Define the security configuration for the monitoring connection.

kafka_connect_replicator_monitoring_interceptor_kerberos_principal: <kafka principal primary>
kafka_connect_replicator_monitoring_interceptor_kerberos_keytab_path: <path to your keytab>
kafka_connect_replicator_monitoring_interceptor_ssl_ca_cert_path: <path to your CA cert>
kafka_connect_replicator_monitoring_interceptor_ssl_cert_path: <path to your signed cert>
kafka_connect_replicator_monitoring_interceptor_ssl_key_path: <path to your ssl key>
kafka_connect_replicator_monitoring_interceptor_ssl_key_password: <ssl key password>

For RBAC-enabled deployment, define additional custom properties for the monitoring connection.

The kafka_connect_replicator_monitoring_interceptor settings default to the corresponding kafka_connect_replicator settings. The following variables are required only if you produce metrics to a different cluster than the one that stores your configuration.

Specify either the Kafka cluster id (kafka_connect_replicator_monitoring_interceptor_kafka_cluster_id) or the cluster name (kafka_connect_replicator_monitoring_interceptor_kafka_cluster_name).

kafka_connect_replicator_monitoring_interceptor_rbac_enabled: true
kafka_connect_replicator_monitoring_interceptor_erp_tls_enabled: <true if Confluent REST API has TLS enabled>
kafka_connect_replicator_monitoring_interceptor_erp_host: <Confluent REST API host URL>
kafka_connect_replicator_monitoring_interceptor_erp_admin_user: <mds or your Kafka super user>
kafka_connect_replicator_monitoring_interceptor_erp_admin_password: <password>
kafka_connect_replicator_monitoring_interceptor_kafka_cluster_id: <destination cluster id>
kafka_connect_replicator_monitoring_interceptor_kafka_cluster_name: <destination cluster name>
kafka_connect_replicator_monitoring_interceptor_erp_pem_file: <path to oauth pem file>

Deploy Confluent Platform across multiple regions

To configure multi-region clusters, use the following properties in the hosts.yml inventory file:

Set replica.selector.class on the kafka_broker group.
Set broker.rack on each Kafka broker host to the rack or availability zone where that broker runs.

For example:

kafka_broker:
  vars:
    kafka_broker_custom_properties:
      replica.selector.class: org.apache.kafka.common.replica.RackAwareReplicaSelector

  hosts:
    ip-192-24-10-207.us-west.compute.internal:
      broker_id: 1
      kafka_broker_custom_properties:
        broker.rack: us-west-2a
    ip-192-24-5-30.us-west.compute.internal:
      broker_id: 2
      kafka_broker_custom_properties:
        broker.rack: us-west-2b
    ip-192-24-10-0.us-west.compute.internal:
      broker_id: 3
      kafka_broker_custom_properties:
        broker.rack: us-west-2a

You can apply the kafka_broker_custom_properties directly within the kafka_broker group as well.
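
To confirm that every broker host carries a broker.rack value and to see how brokers are spread across racks, a small script can group hosts by rack. This is a sketch under the assumption that the kafka_broker group from the inventory has been loaded into a Python dict. The function name racks_by_broker is hypothetical, and the sketch only inspects host-level properties.

```python
def racks_by_broker(kafka_broker_group):
    """Map each rack to its broker hosts and list hosts with no broker.rack."""
    racks, missing = {}, []
    for host, host_vars in kafka_broker_group.get("hosts", {}).items():
        props = (host_vars or {}).get("kafka_broker_custom_properties", {})
        rack = props.get("broker.rack")
        if rack is None:
            missing.append(host)
        else:
            racks.setdefault(rack, []).append(host)
    return racks, missing


# The kafka_broker hosts from the example above.
kafka_broker = {
    "hosts": {
        "ip-192-24-10-207.us-west.compute.internal": {
            "broker_id": 1,
            "kafka_broker_custom_properties": {"broker.rack": "us-west-2a"},
        },
        "ip-192-24-5-30.us-west.compute.internal": {
            "broker_id": 2,
            "kafka_broker_custom_properties": {"broker.rack": "us-west-2b"},
        },
        "ip-192-24-10-0.us-west.compute.internal": {
            "broker_id": 3,
            "kafka_broker_custom_properties": {"broker.rack": "us-west-2a"},
        },
    },
}

racks, missing = racks_by_broker(kafka_broker)
print(racks)
print(missing)  # prints []
```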

Configure ksqlDB log streaming

To configure ksqlDB log streaming, use the following properties in the hosts.yml inventory file.

Without RBAC enabled, set ksql_log_streaming_enabled to true:
```
all:
  vars:
    ksql_log_streaming_enabled: true
```
With RBAC and Kerberos enabled, set ksql_log_streaming_enabled to true, and also provide the keytab location (ksql_kerberos_keytab_path) and the keytab principal (ksql_kerberos_principal) for connecting to your internal listener:
```
all:
  vars:
    ksql_log_streaming_enabled: true
ksql:
  hosts:
    ip-192-24-34-224.us-west.compute.internal:
      ksql_kerberos_keytab_path: /tmp/keytabs/ksql-ip-192-24-34-224.us-west.compute.internal.keytab
      ksql_kerberos_principal: ksql/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM
```
When RBAC is enabled, you must also add your keytab principal to your LDAP server so that the client connection can authenticate.
To configure ksqlDB log streaming with RBAC and mTLS enabled, add the certificate’s CN, without any special formatting, to your LDAP server so that the client connection can authenticate. For example, specify the LDAP username as ksql1 instead of cn=ksql1.
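
As an illustration of the username format, the following sketch strips a leading cn= from a value so that only the bare name remains. The helper name ldap_username_from_cn is hypothetical, and it handles only the simple cn=<name> form shown in the example, not a full distinguished name.

```python
def ldap_username_from_cn(cn_value):
    """Strip a leading 'cn=' (any letter case) so only the bare name remains."""
    name = cn_value.strip()
    if name.lower().startswith("cn="):
        name = name[3:]
    return name


print(ldap_username_from_cn("cn=ksql1"))  # prints ksql1
```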

Configure multiple ksqlDB clusters

To configure multiple ksqlDB clusters, create a new group for each cluster and set it as a child of the ksql group.

None of the child groups can be named ksql.

The names of these groups determine how each cluster is named in Control Center (Legacy).

Each ksqlDB cluster needs a unique value for the ksql_service_id property. By convention, the service ID should end with an underscore.

For example:

ksql:
  children:
    ksql1:
    ksql2:

ksql1:
  vars:
    ksql_service_id: ksql1_
  hosts:
    ip-172-31-34-15.us-east-2.compute.internal:
    ip-172-31-37-16.us-east-2.compute.internal:

ksql2:
  vars:
    ksql_service_id: ksql2_
  hosts:
    ip-172-31-34-17.us-east-2.compute.internal:
    ip-172-31-37-18.us-east-2.compute.internal:
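
The naming rules above (no child group named ksql, a unique ksql_service_id per cluster, and the trailing-underscore convention) lend themselves to an automated check. The following is a minimal sketch that assumes the inventory has been loaded into a Python dict. The function name check_ksql_clusters is hypothetical.

```python
def check_ksql_clusters(inventory):
    """Return a list of rule violations for the child groups of ksql."""
    problems = []
    seen = {}  # service id -> first group that used it
    for group in (inventory.get("ksql", {}).get("children") or {}):
        if group == "ksql":
            problems.append("a child group must not be named ksql")
            continue
        group_vars = (inventory.get(group) or {}).get("vars", {})
        service_id = group_vars.get("ksql_service_id")
        if service_id is None:
            problems.append(f"{group}: ksql_service_id is not set")
            continue
        if not service_id.endswith("_"):
            problems.append(f"{group}: ksql_service_id should end with an underscore")
        if service_id in seen:
            problems.append(f"{group}: ksql_service_id duplicates {seen[service_id]}")
        seen.setdefault(service_id, group)
    return problems


# The two-cluster example from this section (hosts omitted for brevity).
inventory = {
    "ksql": {"children": {"ksql1": None, "ksql2": None}},
    "ksql1": {"vars": {"ksql_service_id": "ksql1_"}},
    "ksql2": {"vars": {"ksql_service_id": "ksql2_"}},
}

print(check_ksql_clusters(inventory))  # prints []
```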

To configure Control Center (Legacy) for multiple ksqlDB clusters, set the ksql_cluster_ansible_group_names property to a list of all ksqlDB children groups.

For example:

control_center:
  vars:
    ksql_cluster_ansible_group_names:
      - ksql1
      - ksql2

  hosts:
    ip-172-31-37-15.us-east-2.compute.internal:

Configure multiple Connect clusters

To configure multiple Connect clusters, create a new group for each cluster and set it as a child of the kafka_connect group.

None of the child groups can be named kafka_connect.

Each Connect cluster needs a unique value for the kafka_connect_group_id property. Control Center (Legacy) displays the value of kafka_connect_group_id as the name of the Connect cluster.

For example:

kafka_connect:
  children:
    syslog:
    elastic:

syslog:
  vars:
    kafka_connect_group_id: connect_syslog
  hosts:
    ip-172-31-34-246.us-east-2.compute.internal:

elastic:
  vars:
    kafka_connect_group_id: connect-elastic
  hosts:
    ip-172-31-34-247.us-east-2.compute.internal:

For an example inventory file that configures two Connect clusters on the same host, see:

https://github.com/confluentinc/cp-ansible/blob/7.2.16-post/docs/sample_inventories/multi_connect_workers_on_single_node.yml

To configure Control Center (Legacy) for multiple Connect clusters, set the kafka_connect_cluster_ansible_group_names property to a list of all kafka_connect children groups.

For example:

control_center:
  vars:
    kafka_connect_cluster_ansible_group_names:
      - syslog
      - elastic
  hosts:
    ip-172-31-37-15.us-east-2.compute.internal:
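
Because the Control Center (Legacy) list must name every child group, it can be derived from the parent group rather than maintained by hand. The sketch below assumes the inventory is available as a Python dict. The helper name child_group_names is hypothetical, and the same call works for the ksql group.

```python
def child_group_names(inventory, parent_group):
    """Return the names of the child groups of parent_group, in inventory order."""
    children = (inventory.get(parent_group) or {}).get("children") or {}
    return list(children)


# The Connect example from this section (hosts omitted for brevity).
inventory = {
    "kafka_connect": {"children": {"syslog": None, "elastic": None}},
    "syslog": {"vars": {"kafka_connect_group_id": "connect_syslog"}},
    "elastic": {"vars": {"kafka_connect_group_id": "connect-elastic"}},
}

control_center_vars = {
    "kafka_connect_cluster_ansible_group_names": child_group_names(inventory, "kafka_connect"),
}
print(control_center_vars)
```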

Configure multiple Connect services on one host

To configure multiple Connect services on the same host, give each Connect instance a unique name and enable hostname aliasing.

For example:

kafka_connect:
  vars:
    hostname_aliasing_enabled: true
  hosts:
    connect01:
      ansible_host: ec2-34-217-174-252.us-west-2.compute.amazonaws.com
      hostname: ip-172-31-40-189.us-west-2.compute.internal
    connect02:
      ansible_host: ec2-34-217-174-252.us-west-2.compute.amazonaws.com
      hostname: ip-172-31-40-189.us-west-2.compute.internal

When hostname aliasing is enabled, the hostname used in configuration files for a given host is taken from the first of the following variables that is set, in this order of precedence:

hostname
ansible_host
inventory_hostname

In the above example, two Connect services will be placed on the same server, and ip-172-31-40-189.us-west-2.compute.internal will be their hostname alias.
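
The precedence order can be expressed as a short lookup. The following sketch only illustrates the documented order and is not the playbooks' own implementation. The function name resolve_hostname is hypothetical.

```python
def resolve_hostname(inventory_hostname, host_vars):
    """Pick the alias by precedence: hostname, then ansible_host, then inventory_hostname."""
    for key in ("hostname", "ansible_host"):
        value = host_vars.get(key)
        if value:
            return value
    return inventory_hostname


connect01_vars = {
    "ansible_host": "ec2-34-217-174-252.us-west-2.compute.amazonaws.com",
    "hostname": "ip-172-31-40-189.us-west-2.compute.internal",
}

print(resolve_hostname("connect01", connect01_vars))
# prints ip-172-31-40-189.us-west-2.compute.internal
```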

Additionally, you must configure variables to make sure each Connect service gets its own configuration files and ports.

Using the previous example, on the connect02 instance, set the following variables:

connect02:
  kafka_connect_service_name: confluent-kafka-connect02
  kafka_connect_config_filename: connect-distributed02.properties
  kafka_connect_rest_port: 8084
  # If JMX Exporter is enabled:
  kafka_connect_jmxexporter_port: 8078
  # If Jolokia is enabled:
  kafka_connect_jolokia_port: 7774
  kafka_connect_jolokia_config: /etc/kafka/kafka_connect_jolokia02.properties

During provisioning, Ansible Playbooks for Confluent Platform will set up a systemd service called confluent-kafka-connect02 for the second Connect instance, colocated with connect01.
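
Colocated Connect instances fail to start if two of them claim the same port, so it can help to compare the port variables across instances before provisioning. The sketch below is hypothetical tooling, not part of the playbooks. The connect01 port values are assumptions made for illustration, and the connect02 values come from the example above.

```python
PORT_VARS = (
    "kafka_connect_rest_port",
    "kafka_connect_jmxexporter_port",
    "kafka_connect_jolokia_port",
)


def port_conflicts(instances):
    """Return {port: [instance.variable, ...]} for ports claimed more than once."""
    users = {}
    for name, instance_vars in instances.items():
        for var in PORT_VARS:
            port = instance_vars.get(var)
            if port is not None:
                users.setdefault(port, []).append(f"{name}.{var}")
    return {port: who for port, who in users.items() if len(who) > 1}


instances = {
    # Assumed values for the first instance, for illustration only.
    "connect01": {
        "kafka_connect_rest_port": 8083,
        "kafka_connect_jmxexporter_port": 8077,
        "kafka_connect_jolokia_port": 7773,
    },
    # Values from the connect02 example in this section.
    "connect02": {
        "kafka_connect_rest_port": 8084,
        "kafka_connect_jmxexporter_port": 8078,
        "kafka_connect_jolokia_port": 7774,
    },
}

print(port_conflicts(instances))  # prints {}
```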

Connect to Confluent Cloud

You can use Ansible Playbooks for Confluent Platform to configure and deploy on-premises Confluent Platform to connect to Kafka and Schema Registry running in Confluent Cloud.

Connect to Confluent Cloud Kafka

To enable Confluent Platform components to connect to Confluent Cloud Kafka, obtain the bootstrap servers, API key, and secret, and set the following variables in the hosts.yml file:

ccloud_kafka_enabled
ccloud_kafka_bootstrap_servers
ccloud_kafka_key
ccloud_kafka_secret

For example:

all:
  vars:
    ccloud_kafka_enabled: true
    ccloud_kafka_bootstrap_servers: pkc-xxxxx.europe-west1.gcp.confluent.cloud:9092,pkc-yyyy.europe-west1.gcp.confluent.cloud:9092,pkc-zzzz.europe-west1.gcp.confluent.cloud:9092
    ccloud_kafka_key: YYYYYYYYYYYYYY
    ccloud_kafka_secret: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz

Note

Do not include a zookeeper or kafka_broker group in your inventory file.
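
The Confluent Cloud Kafka requirements (four variables set, bootstrap servers in host:port form, and no zookeeper or kafka_broker group) can be verified with a short script. This is a minimal sketch that assumes the inventory has been loaded into a Python dict. The function name check_ccloud_kafka is hypothetical, and the key and secret below are placeholders.

```python
REQUIRED_VARS = (
    "ccloud_kafka_bootstrap_servers",
    "ccloud_kafka_key",
    "ccloud_kafka_secret",
)


def check_ccloud_kafka(inventory):
    """Return a list of problems with the Confluent Cloud Kafka settings."""
    all_vars = (inventory.get("all") or {}).get("vars") or {}
    if not all_vars.get("ccloud_kafka_enabled"):
        return []  # nothing to check when the feature is off
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not all_vars.get(name)]
    servers = all_vars.get("ccloud_kafka_bootstrap_servers") or ""
    for server in (s.strip() for s in str(servers).split(",")):
        host, sep, port = server.rpartition(":")
        if server and not (sep and host and port.isdigit()):
            problems.append(f"bootstrap server {server!r} is not in host:port form")
    for group in ("zookeeper", "kafka_broker"):
        if group in inventory:
            problems.append(f"remove the {group} group from the inventory")
    return problems


inventory = {
    "all": {
        "vars": {
            "ccloud_kafka_enabled": True,
            "ccloud_kafka_bootstrap_servers": "pkc-xxxxx.europe-west1.gcp.confluent.cloud:9092",
            "ccloud_kafka_key": "YYYYYYYYYYYYYY",
            "ccloud_kafka_secret": "zzzzzzzzzzzzzz",
        },
    },
}

print(check_ccloud_kafka(inventory))  # prints []
```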

Connect to Confluent Cloud Schema Registry

To enable components to connect to Confluent Cloud Schema Registry, obtain the Schema Registry URL, the API key, and the secret, and set the following variables in the hosts.yml file:

ccloud_schema_registry_enabled
ccloud_schema_registry_url
ccloud_schema_registry_key
ccloud_schema_registry_secret

For example:

all:
  vars:
    ccloud_schema_registry_enabled: true
    ccloud_schema_registry_url: https://psrc-zzzzz.europe-west3.gcp.confluent.cloud
    ccloud_schema_registry_key: AAAAAAAAAAAAAAAA
    ccloud_schema_registry_secret: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

See a sample inventory file for Confluent Cloud Kafka and Schema Registry configuration at the following location:

https://github.com/confluentinc/cp-ansible/blob/7.2.16-post/docs/sample_inventories/ccloud.yml

Next step

Install Confluent Platform with Ansible Playbooks.