Quickstart

In this quickstart, you will set up and configure ZooKeeper, Kafka, Kafka Connect, and Control Center. You will then write data to Kafka and read it back.

Prerequisites:

Configure Kafka

Before starting Control Center, you must configure Metrics Reporter, Kafka, and Kafka Connect.

  1. Configure Confluent Metrics Reporter.

    1. Optional: Copy the default Kafka server configuration (/etc/kafka/server.properties).

      $ cp <path-to-confluent>/etc/kafka/server.properties <path-to-file>/server.properties
      
    2. Uncomment these Metrics Reporter values:

      ##################### Confluent Metrics Reporter #######################
      # Confluent Control Center and Confluent Auto Data Balancer integration
      #
      # Uncomment the following lines to publish monitoring data for
      # Confluent Control Center and Confluent Auto Data Balancer
      # If you are using a dedicated metrics cluster, also adjust the settings
      # to point to your metrics kafka cluster.
      metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter
      confluent.metrics.reporter.bootstrap.servers=localhost:9092
      #
      # Uncomment the following line if the metrics cluster has a single broker
      confluent.metrics.reporter.topic.replicas=1
      

      Tip: You can do this using sed commands. This example assumes that your copy of the file is /tmp/server.properties:

      $ sed -i 's/#metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter/metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter/g' /tmp/server.properties && \
      sed -i 's/#confluent.metrics.reporter.bootstrap.servers=localhost:9092/confluent.metrics.reporter.bootstrap.servers=localhost:9092/g' /tmp/server.properties && \
      sed -i 's/#confluent.metrics.reporter.zookeeper.connect=localhost:2181/confluent.metrics.reporter.zookeeper.connect=localhost:2181/g' /tmp/server.properties && \
      sed -i 's/#confluent.metrics.reporter.topic.replicas=1/confluent.metrics.reporter.topic.replicas=1/g' /tmp/server.properties
      
  2. Optional: Copy the settings for Kafka Connect (connect-distributed.properties).

    $ cp <path-to-confluent>/etc/kafka/connect-distributed.properties <path-to-file>/connect-distributed.properties
    
  3. Add support for the interceptors to the Connect properties (connect-distributed.properties) file.

    $ cat <<EOF >> connect-distributed.properties
    
    # Interceptor setup
    consumer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor
    producer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor
    EOF
    
  4. Start Confluent Platform.

    $ confluent start
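
If metrics do not appear later on, one thing worth checking is whether the settings from the steps above are actually uncommented in the files you edited. The following is a minimal sketch of such a check, not part of Confluent Platform: has_property and report_missing are hypothetical helper names, and the /tmp/server.properties path in the usage comment is an assumption about where you saved your copy.

```shell
#!/usr/bin/env bash

# has_property FILE KEY (hypothetical helper)
# Succeeds when FILE contains an uncommented line that starts with "KEY=".
has_property() {
    awk -v k="$2" 'index($0, k "=") == 1 { found = 1 } END { exit !found }' "$1"
}

# report_missing FILE KEY... (hypothetical helper)
# Prints "missing: KEY" for every KEY that is not active in FILE.
report_missing() {
    local file=$1 key
    shift
    for key in "$@"; do
        has_property "$file" "$key" || echo "missing: $key"
    done
}

# Example usage (the path is an assumption, adjust it to your copy):
#   report_missing /tmp/server.properties \
#       metric.reporters \
#       confluent.metrics.reporter.bootstrap.servers \
#       confluent.metrics.reporter.topic.replicas
```

No output from report_missing means every listed key has an active line. The same function can be pointed at connect-distributed.properties with the two interceptor keys.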
    

Configure and start Control Center

  1. Start Control Center in its own terminal (set to run with one replica):

    1. Optional: Copy the control-center.properties file and save the original.

      $ cp <path-to-confluent>/etc/confluent-control-center/control-center.properties <path-to-file>/control-center.properties
      
    2. Define these values in your properties file.

      $ cat <<EOF >> control-center.properties
      
      # Quickstart partition and replication values
      confluent.controlcenter.internal.topics.partitions=1
      confluent.controlcenter.internal.topics.replication=1
      confluent.controlcenter.command.topic.replication=1
      confluent.monitoring.interceptor.topic.partitions=1
      confluent.monitoring.interceptor.topic.replication=1
      confluent.metrics.topic.partitions=1
      confluent.metrics.topic.replication=1
      EOF
      
    3. Start Control Center.

      $ <path-to-confluent>/bin/control-center-start <path-to-file>/control-center.properties
      

      You can navigate to the Control Center web interface at http://localhost:9021/.
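
The cat <<EOF >> command above appends its block every time it runs, so running it twice leaves duplicate entries in the properties file. As a sketch of one alternative, the hypothetical helper below (set_property, not a Confluent tool) removes any existing line for a key, whether commented out or active, and then appends the new value, so it can be rerun safely.

```shell
#!/usr/bin/env bash

# set_property FILE KEY VALUE (hypothetical helper)
# Removes any line of the form "KEY=..." or "#KEY=..." from FILE,
# then appends "KEY=VALUE". Running it again leaves FILE unchanged.
set_property() {
    local file=$1 key=$2 value=$3 tmp
    tmp=$(mktemp)
    awk -v k="$key" '{
        line = $0
        sub(/^#/, "", line)                 # treat "#KEY=" like "KEY="
        if (index(line, k "=") == 1) next   # drop the old setting
        print
    }' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"                    # keep the original file permissions
    rm -f "$tmp"
}

# Example usage (the file name is taken from the step above):
#   set_property control-center.properties confluent.controlcenter.internal.topics.partitions 1
#   set_property control-center.properties confluent.controlcenter.internal.topics.replication 1
```

Rewriting through a temporary file avoids relying on sed -i, whose syntax differs between GNU and BSD sed.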

Set up stream monitoring

Now that all of the services are running, you can start building a data pipeline. As an example, you can create a small script that generates data.

  1. Run the following command to create a script named totail.sh (our apologies to William Carlos Williams).

    $ cat <<EOF > totail.sh
    #!/usr/bin/env bash
    
    file=totail.txt
    
    while true; do
        echo This is just to say >> \${file}
        echo >> \${file}
        echo I have eaten >> \${file}
        echo the plums >> \${file}
        echo that were in >> \${file}
        echo the icebox >> \${file}
        echo >> \${file}
        echo and which >> \${file}
        echo you were probably >> \${file}
        echo saving >> \${file}
        echo for breakfast >> \${file}
        echo >> \${file}
        echo Forgive me >> \${file}
        echo they were delicious >> \${file}
        echo so sweet >> \${file}
        echo and so cold >> \${file}
        sleep 1
    done
    EOF
    
  2. Start this script. Once per second it appends the poem to totail.txt in the directory it runs from (/tmp/totail.txt if you run it from /tmp). Kafka Connect is then used to load that file into a Kafka topic.

    1. Run chmod to grant user execution permissions.

      $ chmod u+x totail.sh
      
    2. Run the script. It loops until you stop it with Ctrl+C, so run it in its own terminal.

      $ ./totail.sh
      
  3. Use the Kafka Topics tool to create a new topic:

    $ <path-to-confluent>/bin/kafka-topics --zookeeper localhost:2181 --create --topic poem \
       --partitions 1 --replication-factor 1
    
  4. From the Control Center web interface at http://localhost:9021/, click the Kafka Connect button on the left side. This page lists the sources that have been configured; by default the list is empty. Click the New source button.

    ../../_images/c3newsource.png
  5. From the Connection Class drop-down menu, select FileStreamSourceConnector. Specify the Connection Name as Poem File Source. After you specify a name for the connection, a set of additional configuration options appears.

    ../../_images/c3filestreamsourceconnector.png
  6. In the General section, specify the file as the full path to totail.txt (for example, /tmp/totail.txt) and the topic as poem.

    ../../_images/c3specifypoem.png
  7. Click Continue, verify your settings, and then Save & Finish to apply the new configuration.

    ../../_images/c3verifyfinal.png
  8. Create a new sink.

    1. From the Kafka Connect tab, click the Sinks tab and then New sink.

      ../../_images/c3newsink.png
    2. From the Topics drop-down list, choose poem and click Continue.

      ../../_images/c3poemsink.png
    3. In the Sinks tab, set the Connection Class to FileStreamSinkConnector, specify the Connection Name as Poem File Sink, and in the General section specify the file as sunk.txt.

      ../../_images/filestreamsinkconnector.png
    4. Click Continue, verify your settings, and then Save & Finish to apply the new configuration.

      ../../_images/c3poemfilesink.png

      Now that you have data flowing into and out of Kafka, let’s monitor what’s going on!

  9. Click the Data Streams tab to see a chart of the total number of messages produced and consumed on the cluster. Scroll down for more details on the consumer group for your sink. Depending on your machine, this chart may take a few minutes to populate.

    ../../_images/c3qsfinalview.png
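
Besides the chart, a rough command-line check is to compare how many lines the script has written with how many the sink has written out. The sketch below assumes that each line of totail.txt becomes one record and one line in sunk.txt, and that you know where both files live (a relative sink path such as sunk.txt is typically resolved against the Connect worker's working directory). line_count and pipeline_lag are hypothetical helper names.

```shell
#!/usr/bin/env bash

# line_count FILE (hypothetical helper)
# Prints the number of lines in FILE, or 0 if FILE does not exist yet.
line_count() {
    if [ -f "$1" ]; then
        wc -l < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# pipeline_lag SOURCE SINK (hypothetical helper)
# Prints how many lines SOURCE is ahead of SINK at the moment of the check.
pipeline_lag() {
    echo $(( $(line_count "$1") - $(line_count "$2") ))
}

# Example usage (both paths are assumptions, adjust them to your setup):
#   pipeline_lag /tmp/totail.txt /tmp/sunk.txt
```

Because the script keeps writing, the number changes between calls; a value that stays small suggests the sink is keeping up, while a value that only grows suggests the source or sink connector is not running.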

This quickstart described Kafka, Kafka Connect, and Control Center. For component-specific quickstart guides, see the documentation: