Common Deployment Architectures with Confluent Operator

Install Confluent Operator and Confluent Platform walks through a basic Confluent Operator and Confluent Platform deployment. The following sections describe other common deployment configurations.

The examples in this guide make the following assumptions:

  • $VALUES_FILE refers to the configuration file you set up in Create the global configuration file.

  • To present simple and clear examples in the Operator documentation, all the configuration parameters are specified in the config file ($VALUES_FILE). However, in your production deployments, use the --set or --set-file option when applying sensitive data with Helm. For example:

    helm upgrade --install kafka \
      --set kafka.services.mds.ldap.authentication.simple.principal="cn=mds,dc=test,dc=com" \
      --set kafka.services.mds.ldap.authentication.simple.credentials="Developer!" \
      --set kafka.enabled=true
    
  • operator is the namespace that Confluent Platform is deployed in.

  • All commands are executed in the helm directory under the directory Confluent Operator was downloaded to.

Deploy multiple Confluent Platform clusters

To deploy multiple Confluent Platform clusters, deploy each additional cluster to a different namespace, and give each cluster a name that is different from every other cluster in the same Kubernetes cluster. Note that when running multiple clusters in a single Kubernetes cluster, you do not install additional Confluent Operator instances.

For example, if you have one Operator instance and want to deploy two Apache Kafka® clusters, you could name the first Kafka cluster kafka1 and the second kafka2, and then deploy each one in a different namespace.

Additionally, since you are not installing a second Operator instance, you need to make sure the Docker registry secret is installed in the new namespace. To do this with the Helm install command, you add global.injectPullSecret=true when you enable the component.

Note

The parameter global.injectPullSecret=true is only required if the Docker secret does not exist in the new namespace and a Docker secret is actually required to pull images. If you run an install with global.injectPullSecret=true in a namespace where the secret already exists, Helm returns an error saying that the resources already exist.

Using kafka2 in namespace operator2 as an example, the command would resemble the following:

helm upgrade --install \
  kafka2 \
  --values $VALUES_FILE \
  --namespace operator2 \
  --set kafka.enabled=true,global.injectPullSecret=true \
  ./confluent-operator/

After the Docker secret is created in the new namespace, patch the default service account in that namespace (operator2 in this example):

kubectl -n operator2 patch serviceaccount default -p '{"imagePullSecrets": [{"name": "confluent-docker-registry" }]}'

If you are using a private or local registry with basic authentication, use the following command:

kubectl -n operator2 patch serviceaccount default -p '{"imagePullSecrets": [{"name": "<your registry name here>" }]}'

Use multiple availability zones

To use multiple availability zones (AZs), you first need to configure the zones values block in your configuration file ($VALUES_FILE). The example below shows three zones (us-central1-a, -b, and -c):

provider:
  name: gcp
  region: us-central1
  kubernetes:
    deployment:
      zones:
        - us-central1-a
        - us-central1-b
        - us-central1-c

Important

If your Kubernetes cluster spans zones a, b, and c and you configure only zones a and b in the yaml block shown above, Confluent Operator still schedules pods across all three zones, not just a and b. The storage disks for the pods, however, are provisioned only in the configured zones a and b.

Note

Kubernetes nodes in public clouds are tagged with their AZs. Kubernetes automatically attempts to spread pods across these zones. For more information, see Running in multiple zones.
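Before configuring the zones block, you can check how the nodes in your cluster are labeled. A minimal check, assuming kubectl access to the cluster; on older Kubernetes versions the label key is failure-domain.beta.kubernetes.io/zone rather than topology.kubernetes.io/zone:

```shell
# Show the availability-zone label on each node; falls back to a notice when
# kubectl is not installed or no cluster is reachable
kubectl get nodes -L topology.kubernetes.io/zone 2>/dev/null \
  || echo "kubectl not available or no cluster context"
```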

Use Node Affinity and Pod Anti-Affinity

The following provides information about using Node Affinity and Pod Anti-Affinity.

Note

To use these features, you must add labels to resources in your Kubernetes environment. For more information, see Assigning Pods to Nodes.

  • Node Affinity: Use Node Affinity when you have special hardware that you want one or more pods to run on; it pins those pods to the nodes with that hardware. The values.yaml file for each component has the following section that you can use for this purpose. For more information, see Affinity and anti-affinity. For example:

    ## The node Affinity configuration uses preferredDuringSchedulingIgnoredDuringExecution
    ## https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    #nodeAffinity:
    #  key: components
    #  values:
    #  - connect
    #  - app
    nodeAffinity:
      key: type
      values:
        - kafka
    
  • Pod Anti-Affinity: ZooKeeper and Kafka should run on their own nodes. You can force the ZooKeeper and Kafka pods to never run on the same node by using the following section in values.yaml. Note that your Kubernetes cluster must have a rack topology domain label for this to work. For more information, see anti-affinity. For example:

    ## Pod Anti-Affinity
    ## It uses preferredDuringSchedulingIgnoredDuringExecution
    ## https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#interlude-built-in-node-labels
    ## Use this capability, if the cluster has some kind of concept of racks
    rack:
      topology: kubernetes.io/hostname
    
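After the pods are scheduled, you can confirm the spread. A quick check, assuming kubectl access and the operator namespace used in this guide:

```shell
# Show which node each pod was scheduled on; Kafka and ZooKeeper pods should
# land on different nodes when anti-affinity is in effect. Falls back to a
# notice when kubectl is not installed or no cluster is reachable.
kubectl -n operator get pods -o wide 2>/dev/null \
  || echo "kubectl not available or no cluster context"
```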

Use Replicator with multiple Kafka clusters

The following steps guide you through deploying Replicator on multiple Kafka clusters. This example is useful for testing and development purposes only.

Prerequisites
  • Make sure external DNS is configured and running on your development platform.
  • Make sure you are allowed to create DNS names.

The following steps:

  • Are based on a GCP environment.
  • Use the example mydevplatform.gcp.cloud DNS name.
  • Use two-way TLS security.
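You can confirm the DNS prerequisite before deploying. A minimal check using the example bootstrap name; getent is assumed to be available (use nslookup or dig where it is not):

```shell
# Resolve the externally advertised bootstrap name for the source cluster;
# prints a notice instead of failing when the record does not exist yet
getent hosts kafka-src.mydevplatform.gcp.cloud \
  || echo "DNS record for kafka-src.mydevplatform.gcp.cloud not found yet"
```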

Deploy clusters

Complete the following steps to deploy the clusters. Use the example deployment instructions.

  1. Deploy Operator in namespace operator.
  2. Deploy the destination Kafka and ZooKeeper clusters in namespace kafka-dest. Use the following names:
    • ZooKeeper cluster: zookeeper-dest
    • Kafka cluster: kafka-dest
  3. Deploy the source Kafka and ZooKeeper clusters in namespace kafka-src. Use the following names:
    • ZooKeeper cluster: zookeeper-src
    • Kafka cluster: kafka-src
  4. Deploy Replicator in namespace kafka-dest using the default name replicator. Set replicator.dependencies.kafka.bootstrapEndpoint to kafka-dest:9071, and configure the endpoint for one-way TLS security.
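The per-namespace deployments in steps 2 and 3 can be sketched with Helm as follows. This is a sketch only; it assumes the same chart directory and $VALUES_FILE conventions as the rest of this guide, uses the release names as the cluster names, and omits the per-cluster security configuration:

```shell
# Destination cluster in namespace kafka-dest
helm upgrade --install zookeeper-dest --values $VALUES_FILE \
  --namespace kafka-dest --set zookeeper.enabled=true ./confluent-operator/
helm upgrade --install kafka-dest --values $VALUES_FILE \
  --namespace kafka-dest --set kafka.enabled=true ./confluent-operator/

# Source cluster in namespace kafka-src
helm upgrade --install zookeeper-src --values $VALUES_FILE \
  --namespace kafka-src --set zookeeper.enabled=true ./confluent-operator/
helm upgrade --install kafka-src --values $VALUES_FILE \
  --namespace kafka-src --set kafka.enabled=true ./confluent-operator/
```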

Configure Replicator

Complete the following steps to configure Replicator.

  1. On your local machine, use kubectl exec to start a bash session on one of the pods in the cluster. The example uses the default pod name replicator-0 on a Kafka cluster using the name kafka-dest.

    kubectl -n kafka-dest exec -it replicator-0 bash
    
  2. On the Replicator pod, create and populate a file named test.json. There is no text editor installed in the containers, so use the cat command with a here-document, as shown below; the file is written when you enter the closing EOF line.

    cat <<EOF > test.json
    {
    "name": "test-replicator",
    "config": {
       "connector.class":
       "io.confluent.connect.replicator.ReplicatorSourceConnector",
       "tasks.max": "4",
       "topic.whitelist": "example",
       "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
       "value.converter":
       "io.confluent.connect.replicator.util.ByteArrayConverter",
       "src.kafka.bootstrap.servers": "kafka-src.mydevplatform.gcp.cloud:9092",
       "src.kafka.security.protocol": "SSL",
       "src.kafka.ssl.keystore.location": "/tmp/keystore.jks",
       "src.kafka.ssl.keystore.password": "mystorepassword",
       "src.kafka.ssl.key.password": "mykeypassword",
       "src.kafka.ssl.truststore.location": "/tmp/truststore.jks",
       "src.kafka.ssl.truststore.password": "mystorepassword",
       "dest.kafka.bootstrap.servers": "kafka-dest:9071",
       "dest.kafka.security.protocol": "PLAINTEXT",
       "confluent.license": "",
       "confluent.topic.replication.factor": "3"
      }
    }
    EOF
    
  3. From one of the Replicator pods, enter the command below to POST the Replicator configuration to the Connect REST API.

    curl -XPOST -H "Content-Type: application/json" --data @test.json https://localhost:8083/connectors -kv
    
  4. Verify that the Replicator connector is created. The command below should return ["test-replicator"].

    curl -XGET -H "Content-Type: application/json" https://localhost:8083/connectors -kv
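Once the connector exists, you can also check that its tasks are running; the status URL pattern below is the standard Kafka Connect REST API, queried against the same local endpoint as above:

```shell
# Show the connector and task state; prints a notice instead of failing when
# the endpoint is not reachable from where this is run
curl -s -XGET -H "Content-Type: application/json" \
  https://localhost:8083/connectors/test-replicator/status -k \
  || echo "Connect REST endpoint not reachable"
```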
    

Test Replicator

Complete the following steps to test Replicator.

  1. On your local machine, use kubectl exec to start a bash session on one of the pods in the cluster. The example uses the default pod name kafka-src-0 on a Kafka cluster using the name kafka-src.

    kubectl -n kafka-src exec -it kafka-src-0 bash
    
  2. On the pod, create and populate a file named kafka.properties. There is no text editor installed in the containers, so use the cat command with a here-document, as shown below; the file is written when you enter the closing EOF line.

    cat <<EOF > kafka.properties
    bootstrap.servers=kafka-src.mydevplatform.gcp.cloud:9092
    security.protocol=SSL
    ssl.keystore.location=/tmp/keystore.jks
    ssl.keystore.password=mystorepassword
    ssl.key.password=mystorepassword
    ssl.truststore.location=/tmp/truststore.jks
    ssl.truststore.password=mystorepassword
    EOF
    
  3. Validate the kafka.properties (make sure your DNS is configured correctly).

    kafka-broker-api-versions \
    --command-config kafka.properties \
    --bootstrap-server kafka-src.mydevplatform.gcp.cloud:9092
    
  4. Create a topic on the source Kafka cluster. Enter the following command on the Kafka pod.

    kafka-topics --create --topic test-topic \
    --replication-factor 1 --partitions 4 \
    --bootstrap-server kafka-src.mydevplatform.gcp.cloud:9092
    
  5. Produce messages to the source Kafka cluster kafka-src.

    seq 10000 | kafka-console-producer --topic test-topic \
    --broker-list kafka-src.mydevplatform.gcp.cloud:9092 \
    --producer.config kafka.properties
    
  6. From a new terminal, start a bash session on kafka-dest-0.

    kubectl -n kafka-dest exec -it kafka-dest-0 bash
    
  7. On the pod, create the following kafka.properties file.

    cat <<EOF > kafka.properties
    sasl.mechanism=PLAIN
    bootstrap.servers=kafka-dest:9071
    security.protocol=PLAINTEXT
    EOF
    
  8. Validate the kafka.properties (make sure your DNS is configured correctly).

    kafka-broker-api-versions --command-config kafka.properties \
    --bootstrap-server kafka-dest:9071
    
  9. Validate that the replicated topic test-topic.replica is created in kafka-dest.

    kafka-topics --describe --topic test-topic.replica \
    --bootstrap-server kafka-dest:9071
    
  10. Confirm delivery in the destination Kafka cluster kafka-dest.

    kafka-console-consumer --from-beginning --topic test-topic.replica \
    --bootstrap-server kafka-dest:9071 \
    --consumer.config kafka.properties