Deploying Confluent Operator and Confluent Platform

Tip

For background on both technologies, see Introduction to Kubernetes and Confluent Operator.

The following sections guide you through deploying one cluster containing the following components:

  • Confluent Operator (one instance)
  • Apache ZooKeeper™ (three replicas)
  • Apache Kafka® (three broker replicas)
  • Confluent Schema Registry (two replicas)
  • Kafka Connect (two replicas)
  • Confluent Replicator (two replicas)
  • Confluent Control Center (one instance)
  • Confluent KSQL (two replicas)

The procedure uses the Google Kubernetes Engine (GKE) as the example provider environment. You can also use it as a guide for deploying Operator and Confluent Platform in other supported provider environments.

Note

Confluent recommends Helm 3 for Confluent Operator and Confluent Platform 5.4 deployments. See the following documentation for Helm migration and Operator upgrade instructions.

Important

Red Hat OpenShift support: OpenShift does not currently support Helm 3. You need to use Helm 2 if you want to deploy Operator and Confluent Platform 5.4 on Red Hat OpenShift. After downloading the bundle, click the v5.3.2 tab at the bottom of this page and use the Helm 2 instructions for OpenShift deployments.

Unless specified, the deployment procedure shows examples based on the default configuration parameters in the Helm YAML files provided by Confluent. This includes default naming conventions for components and namespaces. Do not use the default YAML parameters for deploying into a production environment. However, using the default parameters allows you to quickly get up and running in a test environment.

Tip

For more information about how to use a shell script to automate deployment of Confluent Operator and Confluent Platform, see the Confluent Operator Quick Start.

General Prerequisites
  • A Kubernetes cluster conforming to one of the supported environments.
  • Cluster size based on the Sizing Recommendations.
  • The provider CLI or Cloud SDK is installed and initialized.
  • kubectl is installed and initialized, with the context set. You also must have the kubeconfig file configured for your cluster.
  • Access to the Confluent Operator bundle (Step 1 below).
  • Storage: StorageClass-based storage provisioner support. This is the default storage class used. Other external provisioners can be used. SSD or SSD-like disks are required for persistent storage.
  • Security: TLS certificates for each component required (if using TLS). Default SASL/PLAIN security is used in the example steps. See Configuring security for information about how to configure additional security.
  • DNS: DNS support on your platform environment is required for external access to Confluent Platform components after deployment. After deployment, you create DNS entries to enable external access to individual Confluent Platform components. See Configuring external load balancers for additional information. If your organization does not allow external access for development testing, see No-DNS development access.
  • Kubernetes Load Balancing:
    • Layer 4 load balancing with passthrough support (terminating at the application) is required for Kafka brokers with external access enabled.
    • Layer 7 load balancing can be used for Operator and all other Confluent Platform components.
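
The command-line prerequisites above can be checked with a short preflight script. This is a minimal sketch, not something shipped in the Operator bundle: the helper name require_cmd is hypothetical, and it only confirms that a binary is on your PATH, not that it is initialized or pointed at the right cluster.

```shell
# Preflight sketch (hypothetical helper, not part of the Confluent bundle).
# require_cmd returns 0 if the named command is on PATH, 1 otherwise.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing prerequisite: $1" >&2
  return 1
}

# Example usage (uncomment to run on your workstation):
# require_cmd kubectl && require_cmd helm && kubectl config current-context
```

The final kubectl config current-context call prints the active context, which is a quick way to confirm that the context is set before you continue.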

Prepare to deploy components

Complete the following steps to prepare for deployment. The procedure uses GCP and the Google Kubernetes Engine (GKE) as the example provider environment. You can use this procedure as a guide for deploying Operator and Confluent Platform in other supported provider environments.

Step 1. Download the Operator bundle for Confluent Platform 5.4

Confluent offers a bundle of Helm charts, templates, and scripts used to deploy Confluent Operator and Confluent Platform components to your Kubernetes cluster. Note that the bundle is extracted to a directory named helm. You should run Helm install commands from within this directory.

Bundle download link:
https://platform-ops-bin.s3-us-west-1.amazonaws.com/operator/confluent-operator-20200115-v0.142.1.tar.gz
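
As a rough sketch, the download and extraction can be scripted as follows. The URL is the one listed above; the helper name download_bundle is hypothetical, and the assumption that the archive unpacks into a helm directory comes from the description in this step.

```shell
# Sketch: download and unpack the Operator bundle (URL taken from this page).
BUNDLE_URL="https://platform-ops-bin.s3-us-west-1.amazonaws.com/operator/confluent-operator-20200115-v0.142.1.tar.gz"
# Strip everything up to the last slash to get the archive file name.
BUNDLE_FILE="${BUNDLE_URL##*/}"

# download_bundle is a hypothetical helper name.
download_bundle() {
  # -O keeps the remote file name; -f fails on HTTP errors; -L follows redirects.
  curl -fSLO "$BUNDLE_URL" || return 1
  # Unpack into the current directory.
  tar -xzf "$BUNDLE_FILE"
}

# Example usage (requires network access):
# download_bundle && cd helm
```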

Step 2. Install Helm 3

Use the following instructions to install Helm 3. Confluent recommends Helm 3 for use with Confluent Operator. If you need to migrate from Helm 2 to Helm 3, see Migrating from Helm 2 to Helm 3.

Note

For information about the differences between Helm 2 and Helm 3, see the Helm 2 to Helm 3 changes.

Complete the following steps to install Helm 3.

  1. Install Helm using the Helm installation instructions.

  2. Verify that your $PATH is pointing to the Helm 3 binary file. Enter the following command:

    helm version
    

    This command should return Helm 3 version output similar to the following:

    version.BuildInfo{Version:"v3.0.2", GitCommit:"19e47ee3283ae98139d98460de796c1be1e3975f", GitTreeState:"clean", GoVersion:"go1.13.5"}
    

    If you see Helm 2 version information in the output, you must update your $PATH so it points to the Helm 3 binary file.
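
The version check can also be scripted. The sketch below assumes that helm version --short prints a string beginning with v3. for Helm 3 (for example v3.0.2+g19e47ee); the helper name is_helm3 is hypothetical.

```shell
# is_helm3 (hypothetical helper) succeeds when the given version string
# reports a v3.x release, such as the output of `helm version --short`.
is_helm3() {
  case "$1" in
    v3.*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Example usage:
# is_helm3 "$(helm version --short)" || echo "helm on PATH is not Helm 3" >&2
```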

Step 3. Configure the default provider YAML file

The following are the default YAML configuration changes necessary for initial deployment. You can manually update the YAML configuration after deployment if necessary. For more information about how to manually update the YAML configuration, see Default Component Modifications.

  1. Go to the helm/providers directory on your local machine.

  2. Open the gcp.yaml file (or other public cloud provider YAML file).

  3. Validate or change your region and zone or zones (if your cluster spans multiple availability zones). The example below uses region: us-central1 and zones: - us-central1-a.

  4. Validate or change your storage provisioner. See Storage Class Provisioners for configuration examples. The example below uses GCE persistent disk storage (gce-pd) and solid-state drives (pd-ssd).

    global:
      provider:
        name: gcp
        region: us-central1
        kubernetes:
          deployment:
            ## If kubernetes is deployed in multi zone mode then specify availability-zones as appropriate
            ## If kubernetes is deployed in single availability zone then specify appropriate values
            zones:
              - us-central1-a
        storage:
          ## https://kubernetes.io/docs/concepts/storage/storage-classes/#gce
          ##
          provisioner: kubernetes.io/gce-pd
          reclaimPolicy: Delete
          parameters:
            type: pd-ssd
    
  5. Enter the image registry endpoint.

    The following example shows the default public image registry for container images. If you are installing from images downloaded and stored locally or located elsewhere, you must enter your unique endpoint. If the endpoint you use requires basic authentication, you need to change the credential parameter to required: true and enter a username and password.

    ## Docker registry endpoint where Confluent Images are available.
    ##
    registry:
      fqdn: docker.io
      credential:
        required: false
        username:
        password:
    
  6. Enable load balancing for external access to the Kafka cluster. The domain name is the domain name you use (or that you create) for your cloud project in the provider environment. See Configuring the network for more information.

    ## Kafka Cluster
    ##
    kafka:
      name: kafka
      replicas: 3
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
      loadBalancer:
        enabled: true
        domain: "<provider-domain>"
      tls:
        enabled: false
        fullchain: |-
        privkey: |-
        cacerts: |-
    

The deployment example steps use SASL/PLAIN security with TLS disabled. This level of security can typically be used for testing and development purposes. For production environments, see Configuring security for information about how to set up the component YAML files with TLS enabled.
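
Before deploying, it can help to confirm that the provider file no longer contains the placeholder domain from the example above. This is a minimal sketch that assumes you pasted the example value "<provider-domain>" into your file; the helper name has_placeholder is hypothetical.

```shell
# has_placeholder (hypothetical helper) succeeds if the file still contains
# the literal <provider-domain> placeholder shown in the example above.
has_placeholder() {
  grep -q '<provider-domain>' "$1"
}

# Example usage, run from the helm directory:
# if has_placeholder ./providers/gcp.yaml; then
#   echo "set kafka.loadBalancer.domain before deploying" >&2
# fi
```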

Deploy components

Complete the following steps to deploy Confluent Operator and Confluent Platform.

Important

  • Components must be installed in the order provided in the following steps.
  • Wait for all component services to start before installing the next component.
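
One way to wait for a component before installing the next one is kubectl wait. The sketch below assumes the default pod naming shown later on this page (for example zookeeper-0 through zookeeper-2) and the example operator namespace; the helper name wait_for_pods is hypothetical. Note that kubectl wait reports an error if a pod has not been created yet, so you may need to retry shortly after an install.

```shell
# wait_for_pods (hypothetical helper) blocks until pods <prefix>-0 through
# <prefix>-(n-1) report the Ready condition, or until the timeout expires.
# Usage: wait_for_pods <name-prefix> <replica-count> [namespace]
wait_for_pods() {
  wfp_prefix=$1
  wfp_count=$2
  wfp_ns=${3:-operator}
  wfp_i=0
  while [ "$wfp_i" -lt "$wfp_count" ]; do
    kubectl wait --for=condition=Ready "pod/${wfp_prefix}-${wfp_i}" \
      -n "$wfp_ns" --timeout=300s || return 1
    wfp_i=$((wfp_i + 1))
  done
}

# Example usage, after installing ZooKeeper with three replicas:
# wait_for_pods zookeeper 3
```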

Step 1. Create a Namespace and Install Confluent Operator

  1. Create a Kubernetes namespace.

    kubectl create namespace <namespace-name>
    

    For example:

    kubectl create namespace operator
    
  2. Go to the helm directory on your local machine.

  3. Enter the following command (using the example operator namespace):

    helm install \
    operator \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set operator.enabled=true
    

    You should see output similar to the following example:

    NAME: operator
    LAST DEPLOYED: Tue Jan  7 17:47:04 2020
    NAMESPACE: operator
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    The Confluent Operator
    
    The Confluent Operator interacts with kubernetes API to create statefulsets
    resources. The Confluent Operator runs three controllers, two component
    specific controllers for kubernetes by providing components specific Custom
    Resource Definition (CRD) (for Kafka and Zookeeper) and one controller for
    creating other statefulsets resources.
    
    1. Validate if Confluent Operator is running.
    
    kubectl get pods -n operator | grep cc-operator
    
    2. Validate if custom resource definition (CRD) is created.
    
    kubectl get crd | grep confluent
    

Step 2. Install ZooKeeper

  1. Enter the following command to verify that Operator is running:

    kubectl get pods -n operator
    
  2. After verifying that Operator is running, enter the following command:

    helm install \
    zookeeper \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set zookeeper.enabled=true
    

    You should see output similar to the following example:

    NAME: zookeeper
    LAST DEPLOYED: Wed Jan  8 14:51:26 2020
    NAMESPACE: operator
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    Zookeeper Cluster Deployment
    
    Zookeeper cluster is deployed through CR.
    
      1. Validate if Zookeeper Custom Resource (CR) is created
    
         kubectl get zookeeper -n operator | grep zookeeper
    
      2. Check the status/events of CR: zookeeper
    
         kubectl describe zookeeper zookeeper -n operator
    
      3. Check if Zookeeper cluster is Ready
    
         kubectl get zookeeper zookeeper -ojson -n operator
    
         kubectl get zookeeper zookeeper -ojsonpath='{.status.phase}' -n operator
    
      4. Update/Upgrade Zookeeper Cluster
    
         The upgrade can be done either through the helm upgrade or by editing the CR directly as below;
    
         kubectl edit zookeeper zookeeper  -n operator
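
The readiness check in the Helm notes above can be turned into a polling loop. This is a sketch under stated assumptions: the helper name wait_for_phase is hypothetical, and the phase string to wait for (RUNNING in the usage comment) is an assumption, so check what kubectl reports on your cluster first.

```shell
# wait_for_phase (hypothetical helper) polls the custom resource's
# .status.phase until it equals the expected value.
# Usage: wait_for_phase <kind> <name> <expected-phase> [attempts] [delay-seconds]
wait_for_phase() {
  wp_kind=$1
  wp_name=$2
  wp_want=$3
  wp_tries=${4:-30}
  wp_delay=${5:-10}
  while [ "$wp_tries" -gt 0 ]; do
    wp_phase=$(kubectl get "$wp_kind" "$wp_name" \
      -o jsonpath='{.status.phase}' -n operator)
    if [ "$wp_phase" = "$wp_want" ]; then
      return 0
    fi
    wp_tries=$((wp_tries - 1))
    sleep "$wp_delay"
  done
  return 1
}

# Example usage (RUNNING is an assumed phase value):
# wait_for_phase zookeeper zookeeper RUNNING
```

The same helper should work for the Kafka custom resource in the next step, since its Helm notes expose the same .status.phase field.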
    

Step 3. Install Kafka brokers

  1. Enter the following command to verify that all ZooKeeper services are running:

    kubectl get pods -n operator
    
  2. After verifying that all ZooKeeper services are running, enter the following command:

    helm install \
    kafka \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set kafka.enabled=true
    

    You should see output similar to the following example:

    NAME: kafka
    LAST DEPLOYED: Wed Jan  8 15:07:46 2020
    NAMESPACE: operator
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    Kafka Cluster Deployment
    
    Kafka Cluster is deployed to kubernetes through CR Object
    
    
      1. Validate if Kafka Custom Resource (CR) is created
    
         kubectl get kafka -n operator | grep kafka
    
      2. Check the status/events of CR: kafka
    
         kubectl describe kafka kafka -n operator
    
      3. Check if Kafka cluster is Ready
    
         kubectl get kafka kafka -ojson -n operator
    
         kubectl get kafka kafka -ojsonpath='{.status.phase}' -n operator
    
    ... output omitted
    

Step 4. Install Schema Registry

  1. Enter the following command to verify that all Kafka services are running:

    kubectl get pods -n operator
    
  2. After verifying that all Kafka services are running, enter the following command:

    helm install \
    schemaregistry \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set schemaregistry.enabled=true
    

    You should see output similar to the following example:

    NAME: schemaregistry
    LAST DEPLOYED: Thu Jan  9 15:51:21 2020
    NAMESPACE: operator
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    NOTES:
    Schema Registry is deployed through PSC. Configure Schema Registry through REST Endpoint
    
      1. Validate if schema registry cluster is running
    
         kubectl get pods -n operator | grep schemaregistry
    
      2. Access
    
        Internal REST Endpoint : http://schemaregistry:8081  (Inside kubernetes)
    
        OR
    
        http://localhost:8081 (Inside Pod)
    
        More information about schema registry REST API can be found here,
    
        https://docs.confluent.io/current/schema-registry/docs/api.html
    

Step 5. Install Kafka Connect

  1. Enter the following command to verify that all Schema Registry services are running:

    kubectl get pods -n operator
    
  2. After verifying that all Schema Registry services are running, enter the following command:

    helm install \
    connectors \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set connect.enabled=true
    

Step 6. Install Confluent Replicator

  1. Enter the following command to verify that all Connect services are running:

    kubectl get pods -n operator
    
  2. After verifying that all Connect services are running, enter the following command:

    helm install \
    replicator \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set replicator.enabled=true
    

Step 7. Install Confluent Control Center

  1. Enter the following command to verify that all Replicator services are running:

    kubectl get pods -n operator
    
  2. After verifying that all Replicator services are running, enter the following command:

    helm install \
    controlcenter \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set controlcenter.enabled=true
    

Step 8. Install Confluent KSQL

  1. Enter the following command to verify that all Control Center services are running:

    kubectl get pods -n operator
    
  2. After verifying that all Control Center services are running, enter the following command:

    helm install \
    ksql \
    ./confluent-operator -f \
    ./providers/gcp.yaml \
    --namespace operator \
    --set ksql.enabled=true
    

All components should be successfully installed and running.

kubectl get pods -n operator

NAME                           READY   STATUS    RESTARTS   AGE
cc-operator-54f54c694d-qjb7w   1/1     Running   0          4h31m
connectors-0                   1/1     Running   0          4h15m
connectors-1                   1/1     Running   0          4h15m
controlcenter-0                1/1     Running   0          4h18m
kafka-0                        1/1     Running   0          4h20m
kafka-1                        1/1     Running   0          4h20m
kafka-2                        1/1     Running   0          4h20m
ksql-0                         1/1     Running   0          21m
ksql-1                         1/1     Running   0          21m
replicator-0                   1/1     Running   0          4h18m
replicator-1                   1/1     Running   0          4h18m
schemaregistry-0               1/1     Running   0          4h18m
schemaregistry-1               1/1     Running   0          4h18m
zookeeper-0                    1/1     Running   0          4h30m
zookeeper-1                    1/1     Running   0          4h30m
zookeeper-2                    1/1     Running   0          4h30m
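
The eight install commands in Steps 1 through 8 differ only in the release name and the values key, so they can be wrapped in a small loop. This is a sketch, not a Confluent-provided script: install_component, install_all, COMPONENTS, and AFTER_INSTALL are hypothetical names, and the script does not wait for pods on its own. Set AFTER_INSTALL to the name of your own readiness check command so that each component is running before the next install starts.

```shell
# install_component (hypothetical helper) wraps the helm install command
# used in Steps 1 through 8. Run it from the helm directory of the bundle.
# Usage: install_component <release-name> <values-key>
install_component() {
  helm install "$1" ./confluent-operator \
    -f ./providers/gcp.yaml \
    --namespace operator \
    --set "$2.enabled=true"
}

# Release name and values key for each component, in the required order.
COMPONENTS="operator:operator zookeeper:zookeeper kafka:kafka
schemaregistry:schemaregistry connectors:connect replicator:replicator
controlcenter:controlcenter ksql:ksql"

# install_all stops at the first failure. After each install it runs the
# command named in AFTER_INSTALL (default: true, a no-op) with the release
# name as its argument.
install_all() {
  for ia_pair in $COMPONENTS; do
    install_component "${ia_pair%%:*}" "${ia_pair#*:}" || return 1
    ${AFTER_INSTALL:-true} "${ia_pair%%:*}" || return 1
  done
}

# Example usage:
# install_all
```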

Note

Deleting components: If you are installing components for testing purposes, you may want to delete components soon after deploying them. See Deleting a Cluster for instructions; otherwise, continue to the next steps to test the deployment.

Step 9. Test the deployment

Complete the following steps to test and validate your deployment.

Internal validation

  1. On your local machine, enter the following command to display cluster namespace information (using the example namespace operator). This information contains the bootstrap endpoint you need to complete internal validation.

    kubectl get kafka -n operator -oyaml
    

    The bootstrap endpoint is shown on the bootstrap.servers line.

    ... omitted
    
       internalClient: |-
          bootstrap.servers=kafka:9071
    
  2. On your local machine, use kubectl exec to start a bash session on one of the pods in the cluster. The example uses the default pod name kafka-0 on a Kafka cluster using the default name kafka.

    kubectl -n operator exec -it kafka-0 -- bash
    
  3. On the pod, create and populate a file named kafka.properties. No text editor is installed in the containers, so use the cat command with a here-document as shown below. The shell writes the file when you enter the closing EOF line.

    Note

    The example shows default SASL/PLAIN security parameters. A production environment requires additional security. See Configuring security for additional information.

    cat << EOF > kafka.properties
    bootstrap.servers=kafka:9071
    sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
    sasl.mechanism=PLAIN
    security.protocol=SASL_PLAINTEXT
    EOF
    
  4. On the pod, query the bootstrap server using the following command:

    kafka-broker-api-versions --command-config kafka.properties --bootstrap-server kafka:9071
    

    You should see output for each of the three Kafka brokers that resembles the following:

    kafka-1.kafka.operator.svc.cluster.local:9071 (id: 1 rack: 0) -> (
       Produce(0): 0 to 7 [usable: 7],
       Fetch(1): 0 to 10 [usable: 10],
       ListOffsets(2): 0 to 4 [usable: 4],
       Metadata(3): 0 to 7 [usable: 7],
       LeaderAndIsr(4): 0 to 1 [usable: 1],
       StopReplica(5): 0 [usable: 0],
       UpdateMetadata(6): 0 to 4 [usable: 4],
       ControlledShutdown(7): 0 to 1 [usable: 1],
       OffsetCommit(8): 0 to 6 [usable: 6],
       OffsetFetch(9): 0 to 5 [usable: 5],
       FindCoordinator(10): 0 to 2 [usable: 2],
       JoinGroup(11): 0 to 3 [usable: 3],
       Heartbeat(12): 0 to 2 [usable: 2],
    
    ... omitted
    

    This output validates internal communication within your cluster.
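
Two parts of the internal validation lend themselves to scripting: pulling the bootstrap endpoint out of the YAML, and confirming that all three brokers answered. The helpers below are a sketch with hypothetical names (bootstrap_for, count_brokers). They parse text in the shape shown in the examples above, so treat the patterns as assumptions if your output differs.

```shell
# bootstrap_for (hypothetical helper) reads `kubectl get kafka -oyaml`
# output on stdin and prints the bootstrap.servers value that follows the
# named section (internalClient or externalClient).
bootstrap_for() {
  awk -v key="$1:" '
    $1 == key { found = 1; next }
    found && /bootstrap\.servers=/ {
      sub(/^.*bootstrap\.servers=/, "")
      print
      exit
    }
  '
}

# count_brokers (hypothetical helper) reads kafka-broker-api-versions output
# on stdin and prints the number of broker header lines it contains.
count_brokers() {
  grep -c '(id: [0-9][0-9]* rack: '
}

# Example usage:
# kubectl get kafka -n operator -oyaml | bootstrap_for internalClient
# kafka-broker-api-versions --command-config kafka.properties \
#   --bootstrap-server kafka:9071 | count_brokers
```

With the default configuration on this page, the second command should print 3, one for each broker replica.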

External validation

Complete the following steps to validate external communication.

Prerequisites:

Note

The examples use default component names.

  1. You use the Confluent CLI running on your local machine to complete external validation. The Confluent CLI is included with the Confluent Platform. On your local machine, download and start the Confluent Platform.

  2. On your local machine, use the kubectl get kafka -n operator -oyaml command to get the bootstrap servers endpoint for external clients. In the example below, the bootstrap servers endpoint is kafka.<providerdomain>:9092.

    ... omitted
    
    externalClient: |-
       bootstrap.servers=kafka.<providerdomain>:9092
       sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
       sasl.mechanism=PLAIN
       security.protocol=SASL_PLAINTEXT
    
  3. On your local machine where you have the Confluent Platform running locally, create and populate a file named kafka.properties based on the example used in the previous step.

    bootstrap.servers=kafka.<providerdomain>:9092
    sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="test" password="test123";
    sasl.mechanism=PLAIN
    security.protocol=SASL_PLAINTEXT
    

    Note

    The example shows default SASL/PLAIN security parameters. A production environment requires additional security. See Configuring security for additional information.

  4. Using the Confluent CLI on your local machine, create a topic using the bootstrap endpoint kafka.<providerdomain>:9092. The example below creates a topic with 1 partition and 3 replicas.

    kafka-topics --bootstrap-server kafka.<providerdomain>:9092 \
    --command-config kafka.properties \
    --create --replication-factor 3 \
    --partitions 1 --topic example
    
  5. Using the Confluent CLI on your local machine, produce to the new topic using the bootstrap endpoint kafka.<providerdomain>:9092. Note that the bootstrap server load balancer is the only Kafka broker endpoint required because it provides gateway access to the load balancers for all Kafka brokers.

    seq 10000 | kafka-console-producer \
    --topic example --broker-list kafka.<providerdomain>:9092 \
    --producer.config kafka.properties
    
  6. In a new terminal on your local machine, use the Confluent CLI to consume from the new topic.

    kafka-console-consumer --from-beginning \
    --topic example --bootstrap-server kafka.<providerdomain>:9092 \
    --consumer.config kafka.properties
    

Successful completion of these steps validates external communication with your cluster.
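
To check the round trip automatically rather than by eye, you can compare what the consumer returns against the sequence that was produced. The sketch below uses a hypothetical helper, verify_sequence, and relies on the --max-messages option of kafka-console-consumer so that the consumer exits after reading the expected number of records.

```shell
# verify_sequence (hypothetical helper) reads consumed messages on stdin and
# succeeds only if, after removing duplicates, they are exactly the integers
# 1..N. Duplicates are tolerated because a retried send can repeat a message.
verify_sequence() {
  [ "$(sort -n -u)" = "$(seq "$1")" ]
}

# Example usage (replace <providerdomain> with your domain):
# kafka-console-consumer --from-beginning --max-messages 10000 \
#   --topic example --bootstrap-server "kafka.<providerdomain>:9092" \
#   --consumer.config kafka.properties | verify_sequence 10000 \
#   && echo "all 10000 messages received"
```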

Note

Deleting components: If you are installing components for testing purposes, you may want to delete components soon after deploying them. See Deleting a Cluster for instructions; otherwise, continue to the next step to configure external access to Confluent Control Center.

The following step is an optional activity you can complete to gain additional knowledge of how upgrades work and how to access your environment using Control Center.

Step 10. Configure external access to Confluent Control Center

Complete the following steps to perform a rolling upgrade to your configuration, enable external access, and launch Control Center.

Upgrade the configuration

  1. Enter the following Helm upgrade command to add an external load balancer for the Control Center instance. Replace <provider-domain> with your platform environment domain. This upgrades your cluster configuration and adds a bootstrap load balancer for Control Center.

    helm upgrade -f ./providers/gcp.yaml \
     --namespace operator \
     --set controlcenter.enabled=true \
     --set controlcenter.loadBalancer.enabled=true \
     --set controlcenter.loadBalancer.domain=<provider-domain> controlcenter \
     ./confluent-operator
    
  2. Get the Control Center bootstrap load balancer public IP. In the example, namespace operator is used. Change this to the namespace for your cluster.

    kubectl get services -n operator
    
  3. Add the bootstrap load balancer DNS entry and public IP to the DNS table for your platform environment.
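
To pick the load balancer addresses out of the service listing, you can filter the table that kubectl prints. This sketch assumes the default column layout of kubectl get services (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE); the helper name list_load_balancers is hypothetical.

```shell
# list_load_balancers (hypothetical helper) reads `kubectl get services`
# table output on stdin and prints "<service-name> <external-ip>" for each
# service of type LoadBalancer.
list_load_balancers() {
  awk '$2 == "LoadBalancer" { print $1, $4 }'
}

# Example usage:
# kubectl get services -n operator | list_load_balancers
```

If the external IP column shows a pending value, the provider has not finished allocating the address; rerun the command after a short wait before creating the DNS entry.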

Launch Confluent Control Center

Complete the following steps to launch Confluent Control Center in your cluster.

  1. Start a new terminal session on your local machine. Enter the following command to set up port forwarding to the default Confluent Control Center endpoint. In the example, namespace operator is used.

    kubectl port-forward svc/controlcenter 9021:9021 -n operator
    
  2. Connect to Control Center using http://localhost:9021/.

  3. Log in to Control Center. Basic authorization credentials are configured in the default <provider.yaml> file. In the example below, the userID is admin and the password is Developer1.

    ##
    ## C3 authentication
    ##
    auth:
      basic:
        enabled: true
        ##
        ## map with key as user and value as password and role
        property:
          admin: Developer1,Administrators
          disallowed: no_access
    

    Important

    Basic authentication to Confluent Control Center can be used for development testing. Typically, this authentication type is disabled for production environments and LDAP is configured for user access. LDAP parameters are provided in the Control Center YAML file.

Note

Deleting components: If you are installing components for testing purposes, you may want to delete components soon after deploying them. See Deleting a Cluster for instructions.

See also

To get started with Confluent Operator on Kubernetes, try out the Kubernetes Demos.