Manual Install using Systemd on RHEL and CentOS

This topic provides instructions for installing a production-ready Confluent Platform configuration in a multi-node RHEL or CentOS environment with a replicated ZooKeeper ensemble.

Important

You must complete these steps for each node in your cluster.

Prerequisites
Before installing Confluent Platform, verify that your environment meets the software and hardware requirements for your deployment.

Get the Software

The YUM repositories provide packages for RHEL, CentOS, and Fedora-based distributions. You can install individual Confluent Platform packages or the entire platform. For a list of available packages, see the documentation or you can search the repository (yum search <package-name>).

  1. Install the curl and which tools.

    sudo yum install curl which
    
  2. Install the Confluent Platform public key. This key is used to sign packages in the YUM repository.

    sudo rpm --import https://packages.confluent.io/rpm/5.5/archive.key
    
  3. Navigate to /etc/yum.repos.d/ and create a file named confluent.repo with the following contents. This adds the Confluent repositories. The file must contain entries for both repositories, [Confluent.dist] and [Confluent], as shown below.

    [Confluent.dist]
    name=Confluent repository (dist)
    baseurl=https://packages.confluent.io/rpm/5.5/7
    gpgcheck=1
    gpgkey=https://packages.confluent.io/rpm/5.5/archive.key
    enabled=1
    
    [Confluent]
    name=Confluent repository
    baseurl=https://packages.confluent.io/rpm/5.5
    gpgcheck=1
    gpgkey=https://packages.confluent.io/rpm/5.5/archive.key
    enabled=1
    
  4. Clear the YUM caches and install Confluent Platform.

    • Confluent Platform:

      With this Confluent Platform package, confluent-kafka is installed by default. If you want confluent-server instead, uninstall confluent-kafka and install confluent-server as described in Migrate to Confluent Server.

      Enterprise features, such as Replicator or RBAC, require confluent-server.

      sudo yum clean all && sudo yum install confluent-platform-2.12
      
    • Confluent Platform with Confluent Server and RBAC:

      sudo yum clean all && sudo yum install confluent-platform-2.12 && \
      sudo yum remove confluent-kafka-2.12 && sudo yum install confluent-server
      
    • Confluent Platform using only Confluent Community components:

      sudo yum clean all && sudo yum install confluent-community-2.12
      

    Note

    The installation package names end with the Scala version that Kafka is built on. For example, the confluent-platform-2.12 package is for Confluent Platform 5.5.15 and is based on Scala 2.12.

    The ZIP and TAR packages contain the Confluent Platform version followed by the Scala version. For example, the ZIP package confluent-5.5.15-2.12.zip denotes Confluent Platform version 5.5.15 and Scala version 2.12.

    For Confluent Platform, your output should resemble:

    Installed:
      confluent-platform-2.12-5.5.15-1.noarch                         confluent-common-5.5.15-1.noarch
      confluent-control-center-5.5.15-1.noarch                        confluent-control-center-fe-5.5.15-1.noarch
      confluent-hub-client-5.5.15-1.noarch                            confluent-kafka-2.12-5.5.15-1.noarch
      confluent-kafka-connect-elasticsearch-5.5.15-1.noarch           confluent-kafka-connect-jdbc-5.5.15-1.noarch
      confluent-kafka-connect-jms-5.5.15-1.noarch                     confluent-kafka-connect-replicator-5.5.15-1.noarch
      confluent-kafka-connect-s3-5.5.15-1.noarch                      confluent-kafka-connect-storage-common-5.5.15-1.noarch
      confluent-kafka-mqtt-5.5.15-1.noarch                            confluent-kafka-rest-5.5.15-1.noarch
      confluent-ksqldb-5.5.15-1.noarch                                confluent-rebalancer-5.5.15-1.noarch
      confluent-rest-utils-5.5.15-1.noarch                            confluent-schema-registry-5.5.15-1.noarch
    
    Complete!
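
    To verify which Confluent packages and versions were installed on a node, you can query the RPM database:

    rpm -qa | grep ^confluent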
    

Configure Confluent Platform

Tip

You can store passwords and other configuration data securely by using the confluent secret commands. For more information, see Secrets.

Configure Confluent Platform with the individual component properties files. By default, these are located in <path-to-confluent>/etc/. At minimum, you must configure the following components.

ZooKeeper

These instructions assume you are running ZooKeeper in replicated mode. Replicated mode requires a minimum of three servers, and you must have an odd number of servers for failover. For more information, see the ZooKeeper documentation.

  1. Navigate to the ZooKeeper properties file (/etc/kafka/zookeeper.properties) and modify it as shown.

    tickTime=2000
    dataDir=/var/lib/zookeeper/
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=zoo1:2888:3888
    server.2=zoo2:2888:3888
    server.3=zoo3:2888:3888
    autopurge.snapRetainCount=3
    autopurge.purgeInterval=24
    

    This configuration is for a three-node ensemble. The configuration file should be identical across all nodes in the ensemble. tickTime, dataDir, and clientPort are set to typical single-server values. The initLimit and syncLimit settings govern how long following ZooKeeper servers can take to initialize with the current leader and how long they can be out of sync with the leader. With tickTime set to 2000 ms, a follower can take up to 10000 ms (initLimit × tickTime) to initialize and can be out of sync with the leader for up to 4000 ms (syncLimit × tickTime).

    The server.* properties set the ensemble membership. The format is

    server.<myid>=<hostname>:<leaderport>:<electionport>
    
    • myid is the server identification number. In this example, there are three servers, with myid values of 1, 2, and 3 respectively. The myid is set by creating a file named myid in the dataDir that contains a single integer in human-readable ASCII text. This value must match one of the myid values in the configuration file. You will see an error if another ensemble member has already started with a conflicting myid value.
    • leaderport is used by followers to connect to the active leader. This port should be open between all ZooKeeper ensemble members.
    • electionport is used to perform leader elections between ensemble members. This port should be open between all ZooKeeper ensemble members.

    The autopurge.snapRetainCount and autopurge.purgeInterval have been set to purge all but three snapshots every 24 hours.

  2. Navigate to the ZooKeeper data directory (dataDir, e.g., /var/lib/zookeeper/) and create a file named myid. The myid file consists of a single line that contains the machine ID. When the ZooKeeper server starts up, it determines which server it is by referencing the myid file. For example, server 1 has a myid value of 1.
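
    For example, on the node configured as server.1 (zoo1 in the example configuration above), you could create the file as follows, assuming the dataDir of /var/lib/zookeeper/ shown earlier:

    echo 1 | sudo tee /var/lib/zookeeper/myid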

Kafka

In a production environment, multiple brokers are required. During startup, brokers register themselves in ZooKeeper to become members of the cluster.

Navigate to the Apache Kafka® properties file (/etc/kafka/server.properties) and customize the following:

  • Connect to the same ZooKeeper ensemble by setting zookeeper.connect to the same value on all nodes. Replace all instances of localhost with the hostname or FQDN (fully qualified domain name) of your node. For example, if your hostname is zookeeper:

    zookeeper.connect=zookeeper:2181
    
  • Configure the broker IDs for each node in your cluster using one of these methods.

    • Dynamically generate the broker IDs: add broker.id.generation.enable=true and comment out broker.id. For example:

      ############################# Server Basics #############################
      
      # The ID of the broker. This must be set to a unique integer for each broker.
      #broker.id=0
      broker.id.generation.enable=true
      
    • Manually set the broker IDs: set a unique value for broker.id on each node.

  • Configure how other brokers and clients communicate with the broker using listeners, and optionally advertised.listeners.

    • listeners: Comma-separated list of URIs and listener names to listen on.
    • advertised.listeners: Comma-separated list of URIs and listener names for other brokers and clients to use. The advertised.listeners parameter ensures that the broker advertises an address that is accessible from both local and external hosts (see the example after this list).

    For more information, see Production Configuration Options.

  • Configure security for your environment.
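
As an illustration of the listeners settings above, a broker that listens on all network interfaces but is advertised to clients under a public hostname might be configured as follows. The hostname kafka1.example.com is a placeholder for this sketch, not a value from your environment:

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka1.example.com:9092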

Control Center

  1. Navigate to the Control Center properties file (/etc/confluent-control-center/control-center-production.properties) and customize the following:

    # host/port pairs to use for establishing the initial connection to the Kafka cluster
    bootstrap.servers=<hostname1:port1,hostname2:port2,hostname3:port3,...>
    # location for Control Center data
    confluent.controlcenter.data.dir=/var/lib/confluent/control-center
    # the Confluent license
    confluent.license=<your-confluent-license>
    # ZooKeeper connection string with host and port of the ZooKeeper servers
    zookeeper.connect=<hostname1:port1,hostname2:port2,hostname3:port3,...>
    

    This configuration is for a three-node cluster. For more information, see Control Center configuration details. For information about Confluent Platform licenses, see Managing Confluent Platform Licenses.

  2. Navigate to the Kafka server configuration file (/etc/kafka/server.properties) and enable Confluent Metrics Reporter.

    ##################### Confluent Metrics Reporter #######################
    # Confluent Control Center and Confluent Auto Data Balancer integration
    #
    # Uncomment the following lines to publish monitoring data for
    # Confluent Control Center and Confluent Auto Data Balancer
    # If you are using a dedicated metrics cluster, also adjust the settings
    # to point to your metrics Kafka cluster.
    metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter
    confluent.metrics.reporter.bootstrap.servers=localhost:9092
    #
    # Uncomment the following line if the metrics cluster has a single broker
    confluent.metrics.reporter.topic.replicas=1
    
  3. Add these lines to the Kafka Connect properties file (/etc/kafka/connect-distributed.properties) to add support for the interceptors.

    # Interceptor setup
    consumer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor
    producer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor
    

Confluent REST Proxy

Navigate to the Confluent REST Proxy properties file (/etc/kafka-rest/kafka-rest.properties) and customize the following:

  • Optionally configure zookeeper.connect. ZooKeeper connectivity is required only for the legacy v1 consumer endpoints. Change localhost to the hostname or FQDN (fully qualified domain name) of your node. For example, if your hostname is zookeeper:

    zookeeper.connect=zookeeper:2181
    

Schema Registry

Navigate to the Schema Registry properties file (/etc/schema-registry/schema-registry.properties) and specify the following properties:

# Specify the address the socket server listens on, e.g. listeners = http://your.host.name:8081
listeners=http://0.0.0.0:8081

# The host name advertised in ZooKeeper. This must be specified if you are running Schema Registry
# with multiple nodes.
host.name=192.168.50.1

# List of Kafka brokers to connect to, e.g. PLAINTEXT://hostname:9092,SSL://hostname2:9092
kafkastore.bootstrap.servers=PLAINTEXT://hostname:9092,SSL://hostname2:9092

This configuration is for a multi-node cluster. For more information, see Running Schema Registry in Production.

Start Confluent Platform

Start Confluent Platform and its components using the systemd service unit files. You can start services immediately with the systemctl start command, or enable them to start automatically at boot with the systemctl enable command. These instructions use the syntax for immediate startup.
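
For example, to have ZooKeeper and the Kafka broker start automatically at boot, enable their units; the unit names are the same ones used with systemctl start below:

sudo systemctl enable confluent-zookeeper
sudo systemctl enable confluent-server

If you installed only Confluent Community components, enable confluent-kafka instead of confluent-server.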

Tip

ZooKeeper, Kafka, and Schema Registry must be started in this specific order, and must be started before any other components.

  1. Start ZooKeeper.

    sudo systemctl start confluent-zookeeper
    
  2. Start Kafka.

    • Confluent Platform:

      sudo systemctl start confluent-server
      
    • Confluent Platform using only Confluent Community components:

      sudo systemctl start confluent-kafka
      
  3. Start Schema Registry.

    sudo systemctl start confluent-schema-registry
    
  4. Start other Confluent Platform components as desired.

    • Control Center

      sudo systemctl start confluent-control-center
      
    • Kafka Connect

      sudo systemctl start confluent-kafka-connect
      
    • Confluent REST Proxy

      sudo systemctl start confluent-kafka-rest
      
    • ksqlDB

      sudo systemctl start confluent-ksqldb
      

Tip

You can check service status with this command: systemctl status confluent*. For more information about the systemd service unit files, see Using Confluent Platform systemd Service Unit Files.

Uninstall

Run this command to remove Confluent Platform, where <component-name> is either confluent-platform-2.12 (Confluent Platform) or confluent-community-2.12 (Confluent Platform using only Confluent Community components).

sudo yum autoremove <component-name>

For example, run this command to remove Confluent Platform:

sudo yum autoremove confluent-platform-2.12

Your output should resemble:

Loaded plugins: fastestmirror, langpacks
Resolving Dependencies
--> Running transaction check
---> Package confluent-platform-2.12.noarch 0:5.5.15-0.1.cp2 will be erased
...
Removed:
confluent-platform-2.12.noarch 0:5.5.15-0.1.cp2

Dependency Removed:
confluent-common.noarch 0:5.5.15-0.1.cp2 confluent-control-center.noarch 0:5.5.15-0.1.cp2
confluent-control-center-fe.noarch 0:5.5.15-0.1.cp2 confluent-hub-client.noarch 0:5.5.15-0.1.cp2
confluent-kafka-2.12.noarch 0:5.5.15-0.1.cp2 confluent-kafka-connect-elasticsearch.noarch 0:5.5.15-0.1.cp2
confluent-kafka-connect-jdbc.noarch 0:5.5.15-0.1.cp2 confluent-kafka-connect-jms.noarch 0:5.5.15-0.1.cp2
confluent-kafka-connect-replicator.noarch 0:5.5.15-0.1.cp2 confluent-kafka-connect-s3.noarch 0:5.5.15-0.1.cp2
confluent-kafka-connect-storage-common.noarch 0:5.5.15-0.1.cp2 confluent-kafka-mqtt.noarch 0:5.5.15-0.1.cp2
confluent-kafka-rest.noarch 0:5.5.15-0.1.cp2 confluent-ksql.noarch 0:5.5.15-0.1.cp2
confluent-rebalancer.noarch 0:5.5.15-0.1.cp2 confluent-rest-utils.noarch 0:5.5.15-0.1.cp2
confluent-schema-registry.noarch 0:5.5.15-0.1.cp2

Complete!

Next Steps

Try out the Apache Kafka Quick Start.