KSQL Datagen¶
In this tutorial, you will run a KSQL Datagen client application using the KSQL Datagen command-line tool that produces messages to and consumes messages from an Apache Kafka® cluster.
Note
Use KSQL Datagen for development purposes only. It isn’t suitable for a production environment.
After you run the tutorial, view the provided source code and use it as a reference to develop your own Kafka client application.
Prerequisites¶
Kafka Cluster¶
- You can use this tutorial with a Kafka cluster in any environment:
- In Confluent Cloud
- On your local host
- Any remote Kafka cluster
- If you are running on Confluent Cloud, you must have access to a
Confluent Cloud cluster
with an API key and secret.
- The first 20 users to sign up for Confluent Cloud and use promo code
C50INTEG
will receive an additional $50 free usage (details) - For an automated way to create a Kafka cluster, credentials, and ACLs in Confluent Cloud, see ccloud-stack Utility for Confluent Cloud.
- The first 20 users to sign up for Confluent Cloud and use promo code
Setup¶
Clone the confluentinc/examples GitHub repository and check out the
6.0.0-post
branch.git clone https://github.com/confluentinc/examples cd examples git checkout 6.0.0-post
Change directory to the example for KSQL Datagen.
cd clients/cloud/ksql-datagen/
Create a local file (for example, at
$HOME/.confluent/java.config
) with configuration parameters to connect to your Kafka cluster. Starting with one of the templates below, customize the file with connection information to your cluster. Substitute your values for{{ BROKER_ENDPOINT }}
,{{CLUSTER_API_KEY }}
, and{{ CLUSTER_API_SECRET }}
(see Configure Confluent Cloud Clients for instructions on how to manually find these values, or use the ccloud-stack Utility for Confluent Cloud to automatically create them).Template configuration file for Confluent Cloud
# Required connection configs for Kafka producer, consumer, and admin bootstrap.servers={{ BROKER_ENDPOINT }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for Kafka producer to prevent data loss acks=all
Template configuration file for local host
# Kafka bootstrap.servers=localhost:9092
Basic Producer and Consumer¶
In this example, the producer application writes Kafka data to a topic in your Kafka cluster.
If the topic does not already exist in your Kafka cluster, the producer application will use the Kafka Admin Client API to create the topic.
Each record written to Kafka has a key representing a username (for example, alice
) and a value of a count, formatted as json (for example, {"count": 0}
).
The consumer application reads the same Kafka topic and keeps a rolling sum of the count as it processes each record.
Produce Records¶
Create the Kafka topic.
kafka-topics --bootstrap-server `grep "^\s*bootstrap.server" $HOME/.confluent/java.config | tail -1` --command-config $HOME/.confluent/java.config --topic test1 --create --replication-factor 3 --partitions 6
Generate a file of ENV variables used by Docker to set the bootstrap servers and security configuration.
../../../ccloud/ccloud-generate-cp-configs.sh $HOME/.confluent/java.config
Source the generated file of
ENV
variables.source ./delta_configs/env.delta
Start Docker by running the following command:
docker-compose up -d
View the ksql-datagen code.
Consume Records¶
Consume from topic
test1
by doing the following:Referencing a properties file
docker-compose exec connect bash -c 'kafka-console-consumer --topic test1 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer.config /tmp/ak-tools-ccloud.delta --max-messages 5'
Referencing individual properties
docker-compose exec connect bash -c 'kafka-console-consumer --topic test1 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer-property sasl.mechanism=PLAIN --consumer-property security.protocol=SASL_SSL --consumer-property sasl.jaas.config="$SASL_JAAS_CONFIG_PROPERTY_FORMAT" --max-messages 5'
You should see messages similar to the following:
{"ordertime":1489322485717,"orderid":15,"itemid":"Item_352","orderunits":9.703502112840228,"address":{"city":"City_48","state":"State_21","zipcode":32731}}
When you are done, press
CTRL-C
.View the consumer code.
Avro and Confluent Cloud Schema Registry¶
This example is similar to the previous example, except the value is formatted as Avro and integrates with the Confluent Cloud Schema Registry. Before using Confluent Cloud Schema Registry, check its availability and limits.
As described in the Quick Start for Schema Management on Confluent Cloud in the Confluent Cloud GUI, enable Confluent Cloud Schema Registry and create an API key and secret to connect to it.
Verify that your VPC can connect to the Confluent Cloud Schema Registry public internet endpoint.
Update your local configuration file (for example, at
$HOME/.confluent/java.config
) with parameters to connect to Schema Registry.Template configuration file for Confluent Cloud
# Required connection configs for Kafka producer, consumer, and admin bootstrap.servers={{ BROKER_ENDPOINT }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for Kafka producer to prevent data loss acks=all # Required connection configs for Confluent Cloud Schema Registry schema.registry.url=https://{{ SR_ENDPOINT }} basic.auth.credentials.source=USER_INFO schema.registry.basic.auth.user.info={{ SR_API_KEY }}:{{ SR_API_SECRET }}
Template configuration file for local host
# Kafka bootstrap.servers=localhost:9092 # Confluent Schema Registry schema.registry.url=http://localhost:8081
Verify your Confluent Cloud Schema Registry credentials work from your host. In the following example, substitute your values for
{{ SR_API_KEY}}
,{{SR_API_SECRET }}
, and{{ SR_ENDPOINT }}
.# View the list of registered subjects $ curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ SR_ENDPOINT }}/subjects # Same as above, as a single bash command to parse the values out of $HOME/.confluent/java.config $ curl -u $(grep "^schema.registry.basic.auth.user.info" $HOME/.confluent/java.config | cut -d'=' -f2) $(grep "^schema.registry.url" $HOME/.confluent/java.config | cut -d'=' -f2)/subjects
Produce Avro Records¶
Create the topic in Confluent Cloud.
kafka-topics --bootstrap-server `grep "^\s*bootstrap.server" $HOME/.confluent/java.config | tail -1` --command-config $HOME/.confluent/java.config --topic test2 --create --replication-factor 3 --partitions 6
Generate a file of
ENV
variables used by Docker to set the bootstrap servers and security configuration.../../../ccloud/ccloud-generate-cp-configs.sh $HOME/.confluent/java.config
Source the generated file of
ENV
variables.source ./delta_configs/env.delta
Start Docker by running the following command:
docker-compose up -d
View the ksql-datagen Avro code.
Consume Avro Records¶
Consume from topic
test2
by doing the following:Referencing a properties file
docker-compose exec connect bash -c 'kafka-avro-console-consumer --topic test2 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer.config /tmp/ak-tools-ccloud.delta --property basic.auth.credentials.source=$CONNECT_VALUE_CONVERTER_BASIC_AUTH_CREDENTIALS_SOURCE --property schema.registry.basic.auth.user.info=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO --property schema.registry.url=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL --max-messages 5'
Referencing individual properties
docker-compose exec connect bash -c 'kafka-avro-console-consumer --topic test2 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer-property sasl.mechanism=PLAIN --consumer-property security.protocol=SASL_SSL --consumer-property sasl.jaas.config="$SASL_JAAS_CONFIG_PROPERTY_FORMAT" --property basic.auth.credentials.source=$CONNECT_VALUE_CONVERTER_BASIC_AUTH_CREDENTIALS_SOURCE --property schema.registry.basic.auth.user.info=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO --property schema.registry.url=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL --max-messages 5'
You should see messages similar to the following:
{"ordertime":{"long":1494153923330},"orderid":{"int":25},"itemid":{"string":"Item_441"},"orderunits":{"double":0.9910185646928878},"address":{"io.confluent.ksql.avro_schemas.KsqlDataSourceSchema_address":{"city":{"string":"City_61"},"state":{"string":"State_41"},"zipcode":{"long":60468}}}}
When you are done, press
CTRL-C
.View the consumer Avro code.
Confluent Cloud Schema Registry¶
View the schema information registered in Confluent Cloud Schema Registry. In the following output, substitute values for
<SR API KEY>
,<SR API SECRET>
, and<SR ENDPOINT>
.curl -u <SR API KEY>:<SR API SECRET> https://<SR ENDPOINT>/subjects
Verify the subject
test2-value
exists.["test2-value"]
View the schema information for subject
test2-value
. In the following output, substitute values for<SR API KEY>
,<SR API SECRET>
, and<SR ENDPOINT>
.curl -u <SR API KEY>:<SR API SECRET> https://<SR ENDPOINT>/subjects/test2-value/versions/1
Verify the schema information for subject
test2-value
.{"subject":"test2-value","version":1,"id":100001,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"ordertime\",\"type\":[\"null\",\"long\"],\"default\":null},{\"name\":\"orderid\",\"type\":[\"null\",\"int\"],\"default\":null},{\"name\":\"itemid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"orderunits\",\"type\":[\"null\",\"double\"],\"default\":null},{\"name\":\"address\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema_address\",\"fields\":[{\"name\":\"city\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"state\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"zipcode\",\"type\":[\"null\",\"long\"],\"default\":null}]}],\"default\":null}]}"}