Confluent Platform Demo (cp-demo)¶
The cp-demo example builds a full Confluent Platform deployment with an Apache Kafka® event streaming application that uses ksqlDB and Kafka Streams for stream processing, with security enabled end-to-end across all components.
The tutorial includes a module to extend it into a hybrid deployment that runs Cluster Linking and Schema Linking to copy data and schemas from a local on-prem Kafka cluster to Confluent Cloud, a fully-managed service for Apache Kafka®.
Follow the accompanying step-by-step guided tutorial to learn how Kafka and Confluent Cloud work with Connect, Confluent Schema Registry, Confluent Control Center, and Cluster Linking, with security enabled end-to-end.
Overview¶
Use Case¶
The use case is an Apache Kafka® event streaming application that processes real-time edits to real Wikipedia pages.
The full event streaming platform based on Confluent Platform works as follows. Wikimedia's EventStreams service publishes a continuous stream of real-time edits happening to real wiki pages. A Kafka source connector, kafka-connect-sse, streams the server-sent events (SSE) from https://stream.wikimedia.org/v2/stream/recentchange, and a custom Connect transform, kafka-connect-json-schema, extracts the JSON from these messages before they are written to a Kafka cluster. The example uses ksqlDB and a Kafka Streams application for data processing. A Kafka sink connector, kafka-connect-elasticsearch, then streams the data out of Kafka, and it is materialized into Elasticsearch for analysis by Kibana. All data uses Confluent Schema Registry and Avro, and Confluent Control Center manages and monitors the deployment.
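To make the Kafka Streams step concrete, here is a minimal Java sketch of a topology like the one described above: it reads edit events from wikipedia.parsed, counts them by wiki domain, and writes the counts to wikipedia.parsed.count-by-domain. This is not the actual cp-demo application: it assumes plain string payloads (cp-demo uses Avro with Schema Registry), omits the security configuration, and extractDomain is a hypothetical helper.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Produced;

public class WikipediaCountByDomain {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wikipedia-count-by-domain");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        StreamsBuilder builder = new StreamsBuilder();
        // Read raw edit events; values are treated as plain strings here,
        // whereas cp-demo uses Avro with Schema Registry.
        builder.stream("wikipedia.parsed",
                       Consumed.with(Serdes.String(), Serdes.String()))
               // Re-key each event by its wiki domain, then count per domain.
               .groupBy((key, value) -> extractDomain(value),
                        Grouped.with(Serdes.String(), Serdes.String()))
               .count()
               .toStream()
               .to("wikipedia.parsed.count-by-domain",
                   Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Hypothetical helper: naive string scan for illustration only;
    // a real application would parse the JSON payload properly.
    private static String extractDomain(String json) {
        int i = json.indexOf("\"domain\":\"");
        if (i < 0) return "unknown";
        int start = i + "\"domain\":\"".length();
        return json.substring(start, json.indexOf('"', start));
    }
}
```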
Data Pattern¶
The data pattern is as follows:
| Component | Consumes From | Produces To |
|---|---|---|
| SSE source connector | Wikipedia | wikipedia.parsed |
| ksqlDB | wikipedia.parsed | ksqlDB streams and tables |
| Kafka Streams application | wikipedia.parsed | wikipedia.parsed.count-by-domain |
| Elasticsearch sink connector | WIKIPEDIABOT (from ksqlDB) | Elasticsearch/Kibana |
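As a small illustration of the Schema Registry and Avro integration described above, the following Java consumer reads Avro records from wikipedia.parsed using Confluent's KafkaAvroDeserializer. The bootstrap server and Schema Registry URL are placeholders, and cp-demo's TLS and authentication settings are omitted for brevity.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class WikipediaParsedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // placeholder broker
        props.put("schema.registry.url", "http://localhost:8081"); // placeholder Schema Registry
        props.put("group.id", "wikipedia-parsed-inspector");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", KafkaAvroDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("wikipedia.parsed"));
            while (true) {
                // Deserialized against the schema registered for the topic.
                ConsumerRecords<String, GenericRecord> records =
                    consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, GenericRecord> record : records) {
                    System.out.println(record.value());
                }
            }
        }
    }
}
```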
How to use this tutorial¶
We suggest following the cp-demo tutorial in order:
- Module 1: On-Prem Tutorial: bring up the on-prem Kafka cluster and explore the different technical areas of Confluent Platform
- Module 2: Hybrid Deployment to Confluent Cloud Tutorial: create a cluster link to copy data from the local on-prem Kafka cluster to Confluent Cloud, and use the Metrics API to monitor both (see the sketch after this list)
- Teardown: clean up your on-prem and Confluent Cloud environment
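To give a feel for the Metrics API step in Module 2, here is a minimal Java sketch that queries the Confluent Cloud Metrics API for bytes received by a cluster. The API key, secret, cluster ID, metric name, and time interval are all placeholder assumptions; substitute values from your own Confluent Cloud environment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class MetricsApiQuery {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials and cluster ID; substitute your own.
        String apiKey = "MY_CLOUD_API_KEY";
        String apiSecret = "MY_CLOUD_API_SECRET";
        String clusterId = "lkc-xxxxxx";

        // Illustrative query: received bytes per minute over a one-hour window.
        String body = """
            {
              "aggregations": [{"metric": "io.confluent.kafka.server/received_bytes"}],
              "filter": {"field": "resource.kafka.id", "op": "EQ", "value": "%s"},
              "granularity": "PT1M",
              "intervals": ["2024-01-01T00:00:00Z/PT1H"]
            }
            """.formatted(clusterId);

        String auth = Base64.getEncoder()
            .encodeToString((apiKey + ":" + apiSecret).getBytes());

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"))
            .header("Authorization", "Basic " + auth)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```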