KSQL Tutorials and Examples
This tutorial demonstrates a simple workflow using KSQL to write streaming queries against messages in Kafka.
Get started with these instructions:
Stream Processing Cookbook
The Stream Processing Cookbook contains KSQL recipes that provide in-depth tutorials and recommended deployment scenarios.
Clickstream Data Analysis Pipeline
Clickstream analysis is the process of collecting, analyzing, and reporting aggregate data about which pages a website visitor visits and in what order. The path the visitor takes through a website is called the clickstream.
This tutorial focuses on building real-time analytics of users to determine:
- General website analytics, such as hit count and visitors
- Bandwidth use
- Mapping user-IP addresses to actual users and their location
- Detection of high-bandwidth user sessions
- Error-code occurrence and enrichment
- Sessionization to track user sessions and understand behavior (such as per-user-session bandwidth and per-user-session hits)
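To make the sessionization item above concrete, here is a minimal sketch in plain Python (not KSQL, and not the tutorial's own code). It groups each user's clicks into sessions separated by a period of inactivity and reports per-session hits and bandwidth. The `Click` and `Session` types, the `sessionize` function, the 30-second gap, and the sample data are all hypothetical names and values chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Click(NamedTuple):
    user: str
    ts: int          # event time in seconds (assumed unit)
    bytes_sent: int  # response size for this request


class Session(NamedTuple):
    user: str
    start: int
    end: int
    hits: int
    bytes_sent: int


def sessionize(clicks: List[Click], gap: int = 30) -> List[Session]:
    """Group each user's clicks into sessions.

    A click joins the current session if it arrives within `gap`
    seconds of that session's last click; otherwise it starts a new one.
    """
    by_user: Dict[str, List[Click]] = defaultdict(list)
    for click in clicks:
        by_user[click.user].append(click)

    sessions: List[Session] = []
    for user, events in by_user.items():
        events.sort(key=lambda c: c.ts)
        current: Optional[Session] = None
        for click in events:
            if current is not None and click.ts - current.end <= gap:
                # Extend the open session with this click.
                current = current._replace(
                    end=click.ts,
                    hits=current.hits + 1,
                    bytes_sent=current.bytes_sent + click.bytes_sent,
                )
            else:
                # Close the previous session (if any) and open a new one.
                if current is not None:
                    sessions.append(current)
                current = Session(user, click.ts, click.ts, 1, click.bytes_sent)
        if current is not None:
            sessions.append(current)
    return sessions


clicks = [
    Click("alice", 0, 500),
    Click("alice", 10, 1500),
    Click("bob", 5, 200),
    Click("alice", 100, 300),
]
result = sessionize(clicks)
```

With this sample data, the two early clicks from `alice` fall into one session and her click at `ts=100` opens a second one, because it arrives more than 30 seconds after the previous click. A streaming engine performs the same grouping incrementally as events arrive, rather than over a complete list.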
The tutorial uses standard streaming functions (such as min and max) and enrichment using child tables, stream-table joins, and different types of windowing.
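As a rough illustration of two of these ideas, the following plain-Python sketch (not KSQL) enriches a stream of error events by looking each one up in a table, then counts the enriched events in fixed, non-overlapping (tumbling) windows. The `ERROR_CODES` table, the function names, the 30-second window size, and the sample events are assumptions made for this example only.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical lookup table (the "table" side of the join):
# HTTP status code -> human-readable description.
ERROR_CODES: Dict[int, str] = {
    404: "Not Found",
    500: "Internal Server Error",
}


def enrich(
    events: List[Tuple[int, int]], table: Dict[int, str]
) -> List[Tuple[int, int, str]]:
    """Stream-table join: attach the table value for each event's key.

    Events whose code has no entry in the table are dropped,
    which corresponds to an inner join.
    """
    return [(ts, code, table[code]) for ts, code in events if code in table]


def tumbling_counts(
    enriched: List[Tuple[int, int, str]], size: int
) -> Dict[Tuple[int, str], int]:
    """Count events per (window_start, description) in fixed windows of `size` seconds."""
    counts: Counter = Counter()
    for ts, _code, description in enriched:
        window_start = ts - ts % size  # align the timestamp to its window
        counts[(window_start, description)] += 1
    return dict(counts)


# (timestamp in seconds, HTTP status code)
events = [(1, 404), (12, 200), (25, 404), (31, 500), (59, 404)]
enriched = enrich(events, ERROR_CODES)
counts = tumbling_counts(enriched, size=30)
```

Here the event with status 200 is dropped by the join because it has no row in the table, and the remaining events are split between the windows starting at 0 and 30 seconds.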
Get started now with these instructions:
If you do not have Docker, you can run an automated version of the Clickstream tutorial designed for local Confluent Platform installs. Running the demo this way requires a local installation of Confluent Platform, along with Elasticsearch and Grafana.
These examples demonstrate common KSQL operations.
You can configure Java streams applications to deserialize and ingest data in multiple ways, including Kafka console producers, JDBC source connectors, and Java client producers. For full code examples, see connect-streams-pipeline.
KSQL in a Kafka Streaming ETL
To learn how to deploy a Kafka streaming ETL using KSQL for stream processing, you can run the Confluent Platform demo. All components in the Confluent Platform demo have encryption, authentication, and authorization configured end-to-end.
Level Up Your KSQL Videos
| Video | Description |
|---|---|
| KSQL Introduction | Intro to Kafka stream processing, with a focus on KSQL. |
| KSQL Use Cases | Describes several KSQL use cases, like data exploration, arbitrary filtering, streaming ETL, anomaly detection, and real-time monitoring. |
| KSQL and Core Kafka | Describes KSQL's dependency on core Kafka, relates KSQL to clients, and describes how KSQL uses Kafka topics. |
| Installing and Running KSQL | How to get KSQL, configure and start the KSQL server, and syntax basics. |
| KSQL Streams and Tables | Explains the difference between a STREAM and a TABLE, shows a detailed example, and explains how streaming queries are unbounded. |
| Reading Kafka Data from KSQL | How to explore Kafka topic data, create a STREAM or TABLE from a Kafka topic, and identify fields. Also explains metadata like ROWTIME and TIMESTAMP, and covers different formats like Avro, JSON, and Delimited. |
| Streaming and Unbounded Data in KSQL | More detail on streaming queries, how to read topics from the beginning, the differences between persistent and non-persistent queries, and how streaming queries end. |
| Enriching data with KSQL | Scalar functions, changing field types, filtering data, merging data with JOIN, and rekeying streams. |
| Aggregations in KSQL | How to aggregate data with KSQL, different types of aggregate functions such as COUNT, SUM, MAX, MIN, and TOPK, and windowing and late-arriving data. |
| Taking KSQL to Production | How to use KSQL in streaming ETL pipelines, scale query processing, isolate workloads, and secure your entire deployment. |
| Monitoring KSQL in Confluent Control Center | Monitor performance and end-to-end message delivery of your KSQL queries. |