Create an Apache Kafka Client App for Scala
In this tutorial, you will run a Scala client application that produces messages to and consumes messages from an Apache Kafka® cluster.
After you run the tutorial, use the provided source code as a reference to develop your own Kafka client application.
Prerequisites
Client
- Scala with sbt
Kafka Cluster
The easiest way to follow this tutorial is with Confluent Cloud because you don’t have to run a local Kafka cluster.
From the Console, click LEARN to provision a cluster, then click Clients to get the cluster-specific configurations and credentials to set for your client application.
Alternatively, use the supported CLI or REST API, or the community-supported ccloud-stack utility for Confluent Cloud.
If you don’t want to use Confluent Cloud, you can also use this tutorial with a Kafka cluster running on your local host or any other remote server.
Setup
Clone the confluentinc/examples GitHub repository and check out the 7.0.16-post branch.
git clone https://github.com/confluentinc/examples
cd examples
git checkout 7.0.16-post
Change directory to the example for Scala.
cd clients/cloud/scala/
Create a local file (for example, at $HOME/.confluent/java.config) with configuration parameters to connect to your Kafka cluster. Starting with one of the templates below, customize the file with connection information to your cluster. Substitute your values for {{ BROKER_ENDPOINT }}, {{ CLUSTER_API_KEY }}, and {{ CLUSTER_API_SECRET }} (see Configure Confluent Cloud Clients for instructions on how to manually find these values, or use the ccloud-stack Utility for Confluent Cloud to automatically create them).
Template configuration file for Confluent Cloud
# Required connection configs for Kafka producer, consumer, and admin
bootstrap.servers={{ BROKER_ENDPOINT }}
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}';
sasl.mechanism=PLAIN
# Required for correctness in Apache Kafka clients prior to 2.6
client.dns.lookup=use_all_dns_ips
# Best practice for higher availability in Apache Kafka clients prior to 3.0
session.timeout.ms=45000
# Best practice for Kafka producer to prevent data loss
acks=all
Template configuration file for local host
# Kafka
bootstrap.servers=localhost:9092
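The templates use the key=value format of Java properties files, so a Scala client can read the file with java.util.Properties before passing the settings to a producer or consumer. The following is a minimal sketch of that loading step, not the code from the examples repository; the ClientConfig object and its load method are hypothetical names chosen for illustration.

```scala
import java.io.FileInputStream
import java.util.Properties

// Hypothetical helper (not part of the examples repository).
object ClientConfig {
  // Reads a properties file such as $HOME/.confluent/java.config.
  // Lines starting with '#' are treated as comments by Properties.load.
  def load(path: String): Properties = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in)
    finally in.close() // always release the file handle
    props
  }
}
```

A caller could then write, for example, ClientConfig.load(sys.env("HOME") + "/.confluent/java.config") and read individual settings with getProperty("bootstrap.servers").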
Basic Producer and Consumer
In this example, the producer application writes records to a topic in your Kafka cluster.
If the topic does not already exist in your Kafka cluster, the producer application will use the Kafka Admin Client API to create the topic.
Each record written to Kafka has a key representing a username (for example, alice) and a value of a count, formatted as JSON (for example, {"count": 0}).
The consumer application reads the same Kafka topic and keeps a rolling sum of the count as it processes each record.
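To make the record format and the rolling sum concrete, here is a small sketch that uses only the Scala standard library. It assumes each value is a JSON object with a single integer count field and extracts it with a regular expression; the RollingSum object and its methods are hypothetical, and the example application itself may parse JSON differently.

```scala
// Hypothetical illustration (not the tutorial's consumer code).
object RollingSum {
  // Matches values such as {"count": 0} or {"count":10}.
  private val CountPattern = """\{\s*"count"\s*:\s*(-?\d+)\s*\}""".r

  // Extracts the count from one record value, or None if the value
  // does not have the assumed shape.
  def parseCount(json: String): Option[Long] = json.trim match {
    case CountPattern(n) => Some(n.toLong)
    case _               => None
  }

  // Returns the total after each successfully parsed value, in order.
  def runningTotals(values: Seq[String]): Seq[Long] =
    values.flatMap(parseCount).scanLeft(0L)(_ + _).tail
}
```

Values that do not match the assumed shape are skipped rather than treated as errors, which keeps the sketch short; a real consumer would likely want to log or reject them.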
Produce Records
Compile the Scala code:
sbt clean compile
Run the producer, passing in arguments for:
- the local file with configuration parameters to connect to your Kafka cluster
- the topic name
sbt "runMain io.confluent.examples.clients.scala.Producer $HOME/.confluent/java.config test1"
Verify that the producer sent all the messages. Each line reports where a record landed, in the form topic-partition@offset; the offsets you see depend on how many records the topic already holds. You should see:
<snipped>
Produced record at test1-0@120
Produced record at test1-0@121
Produced record at test1-0@122
Produced record at test1-0@123
Produced record at test1-0@124
Produced record at test1-0@125
Produced record at test1-0@126
Produced record at test1-0@127
Produced record at test1-0@128
Produced record at test1-0@129
Wrote ten records to test1
[success] Total time: 6 s, completed 10-Dec-2018 16:50:13
View the producer code.
Consume Records
Run the consumer, passing in arguments for:
- the local file with configuration parameters to connect to your Kafka cluster
- the topic name you used earlier
sbt "runMain io.confluent.examples.clients.scala.Consumer $HOME/.confluent/java.config test1"
Verify the consumer received all the messages. You should see:
...
Consumed record with key alice and value {"count":1}, and updated total count to 1
Consumed record with key alice and value {"count":2}, and updated total count to 3
Consumed record with key alice and value {"count":3}, and updated total count to 6
Consumed record with key alice and value {"count":4}, and updated total count to 10
Consumed record with key alice and value {"count":5}, and updated total count to 15
Consumed record with key alice and value {"count":6}, and updated total count to 21
Consumed record with key alice and value {"count":7}, and updated total count to 28
Consumed record with key alice and value {"count":8}, and updated total count to 36
Consumed record with key alice and value {"count":9}, and updated total count to 45
Consumed record with key alice and value {"count":10}, and updated total count to 55
...
When you are done, press Ctrl-C.
View the consumer code.
Kafka Streams
Run the Kafka Streams application, passing in arguments for:
- the local file with configuration parameters to connect to your Kafka cluster
- the topic name you used earlier
sbt "runMain io.confluent.examples.clients.scala.Streams $HOME/.confluent/java.config test1"
Verify that the Kafka Streams application prints the following output:
...
[Consumed record]: alice, 1
[Consumed record]: alice, 2
[Consumed record]: alice, 3
[Consumed record]: alice, 4
[Consumed record]: alice, 5
[Consumed record]: alice, 6
[Consumed record]: alice, 7
[Consumed record]: alice, 8
[Consumed record]: alice, 9
[Consumed record]: alice, 10
[Running count]: alice, 1
[Running count]: alice, 3
[Running count]: alice, 6
[Running count]: alice, 10
[Running count]: alice, 15
[Running count]: alice, 21
[Running count]: alice, 28
[Running count]: alice, 36
[Running count]: alice, 45
[Running count]: alice, 55
...
When you are done, press Ctrl-C.
View the Kafka Streams code.
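The [Running count] lines in the Streams output are per-key totals: each consumed count is added to the total already held for that key. The sketch below reproduces that aggregation with plain Scala collections so the logic can be read in isolation. It does not use the Kafka Streams API and is not the tutorial's code; the KeyedRunningCount object is a hypothetical name, and the second key (bob) in the usage note is invented for illustration.

```scala
// Hypothetical illustration of a keyed running total (no Kafka dependency).
object KeyedRunningCount {
  // For each input record, in order, emits the key and that key's total so far.
  def apply(records: Seq[(String, Long)]): Seq[(String, Long)] = {
    val start = (Map.empty[String, Long], Vector.empty[(String, Long)])
    val (_, emitted) = records.foldLeft(start) {
      case ((totals, out), (key, count)) =>
        // Add this record's count to whatever total the key already has.
        val updated = totals.getOrElse(key, 0L) + count
        (totals.updated(key, updated), out :+ (key -> updated))
    }
    emitted
  }
}
```

For example, KeyedRunningCount(Seq("alice" -> 1L, "bob" -> 5L, "alice" -> 2L)) yields totals of 1 for alice, 5 for bob, and then 3 for alice, because each key accumulates independently.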