Scripted Confluent Platform Demo

The scripted Confluent Platform demo (cp-demo) builds a full Confluent Platform deployment with an Apache Kafka® event streaming application that uses ksqlDB and Kafka Streams for stream processing, with all components secured end-to-end. The tutorial includes a module that extends it into a hybrid deployment, running Cluster Linking and Schema Linking to copy data and schemas from a local on-premises Kafka cluster to Confluent Cloud, a fully managed service for Kafka.

Follow the accompanying guided tutorial to learn how Kafka and Confluent Cloud work with Connect, Confluent Schema Registry, Confluent Control Center, and Cluster Linking with security enabled end-to-end.

Looking for a fully managed cloud-native service for Apache Kafka®?

Sign up for Confluent Cloud and get started for free using the Cloud quick start.

Use case

The use case for this application is a Kafka event streaming application that processes real-time edits to real Wikipedia pages. The following image shows the application topology:

[Image: cp-demo application topology]

The full event streaming platform based on Confluent Platform is described as follows:

  1. Wikimedia’s EventStreams publishes a continuous stream of real-time edits happening to real wiki pages.
  2. A Kafka source connector, kafka-connect-sse, streams the server-sent events (SSE) from https://stream.wikimedia.org/v2/stream/recentchange, and a custom Connect transform, kafka-connect-json-schema, extracts the JSON from these messages, which is then written to a Kafka topic.
  3. Data is processed with ksqlDB and a Kafka Streams application (see the ksqlDB sketch after this list).
  4. A Kafka sink connector, kafka-connect-elasticsearch, streams the data out of Kafka, materializing it into Elasticsearch for analysis by Kibana.
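
As a rough sketch of the processing in step 3 (illustrative only; the actual statements ship with cp-demo, and column names such as BOT are assumptions about the record schema), the ksqlDB side registers a stream over the connector's output topic and derives filtered streams from it:

```sql
-- Register a stream over the source connector's output topic.
-- VALUE_FORMAT = 'AVRO' tells ksqlDB to fetch the schema from
-- Confluent Schema Registry rather than declaring columns inline.
CREATE STREAM WIKIPEDIA WITH (
    KAFKA_TOPIC = 'wikipedia.parsed',
    VALUE_FORMAT = 'AVRO'
);

-- Derive a stream of bot edits; in the demo, a stream like this
-- (WIKIPEDIABOT) is what the Elasticsearch sink connector consumes.
CREATE STREAM WIKIPEDIABOT AS
    SELECT *
    FROM WIKIPEDIA
    WHERE BOT = TRUE;
```

Because the derived stream is a persistent query, ksqlDB continuously writes its results to a backing Kafka topic, which is what allows a downstream sink connector to pick them up.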

All data is in Avro format and registered with Confluent Schema Registry, and Confluent Control Center manages and monitors the deployment.

Data pattern

The data pattern for this application is as follows:

| Component | Consumes from | Produces to |
|-----------|---------------|-------------|
| SSE source connector | Wikipedia | wikipedia.parsed |
| ksqlDB | wikipedia.parsed | ksqlDB streams and tables |
| Kafka Streams application | wikipedia.parsed | wikipedia.parsed.count-by-domain |
| Elasticsearch sink connector | WIKIPEDIABOT (from ksqlDB) | Elasticsearch/Kibana |
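
The demo implements the count-by-domain aggregation as a Kafka Streams application written in Java; purely as an illustration of the same logic in ksqlDB terms (the DOMAIN column is an assumption about the record schema), it corresponds to a grouped count like this:

```sql
-- Illustrative ksqlDB analogue of the demo's Kafka Streams app:
-- count edits per wiki domain over the parsed stream. The real
-- application uses the Kafka Streams API, not ksqlDB.
CREATE TABLE EDIT_COUNT_BY_DOMAIN AS
    SELECT DOMAIN,
           COUNT(*) AS EDIT_COUNT
    FROM WIKIPEDIA
    GROUP BY DOMAIN;
```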

How to use this tutorial

We suggest following the cp-demo tutorial in order:

  1. Module 1: Deploy the Confluent Platform Demo Environment: bring up the on-premises Kafka cluster and explore the different technical areas of Confluent Platform
  2. Module 2: Deploy Hybrid Confluent Platform and Confluent Cloud Environment: create a cluster link to copy data from a local on-premises Kafka cluster to Confluent Cloud, and use the Metrics API to monitor both
  3. Troubleshooting and Stopping the Demo: troubleshoot issues with the demo and clean up your on-premises and Confluent Cloud environments