KSQL Tutorials and Examples
This tutorial demonstrates a simple workflow using KSQL to write streaming queries against messages in Apache Kafka®.
Get started with these instructions:
Clickstream Data Analysis Pipeline
Clickstream analysis is the process of collecting, analyzing, and reporting aggregate data about which pages a website visitor visits and in what order. The path the visitor takes through a website is called the clickstream.
This tutorial focuses on building real-time analytics of users to determine:
- General website analytics, such as hit count and visitors
- Bandwidth use
- Mapping of user IP addresses to actual users and their locations
- Detection of high-bandwidth user sessions
- Error-code occurrence and enrichment
- Sessionization to track user sessions and understand behavior (such as per-user-session bandwidth and per-user-session hits)
The tutorial uses standard streaming functions (such as min and max), enrichment using child tables and stream-table joins, and different types of windowing functionality.
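To make sessionization and stream-table enrichment concrete, here is a minimal sketch in plain Python rather than KSQL. It is an illustration under stated assumptions, not the tutorial's implementation: the event layout, the sample data, the 30-second inactivity gap, and the names `CLICKS`, `STATUS_TEXT`, `sessionize`, and `enrich_errors` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical click events: (user_id, timestamp_seconds, bytes_sent, http_status)
CLICKS = [
    ("alice", 0, 500, 200),
    ("alice", 10, 1500, 200),
    ("bob", 12, 300, 404),
    ("alice", 100, 700, 500),
    ("bob", 20, 200, 200),
]

# Hypothetical lookup table, playing the role of the table side of a
# stream-table join (error code -> description).
STATUS_TEXT = {200: "OK", 404: "Not Found", 500: "Internal Server Error"}


def sessionize(clicks, gap_seconds):
    """Group each user's clicks into sessions.

    A new session starts whenever a user has been inactive for more than
    gap_seconds. Each session tracks its hit count and total bytes.
    """
    per_user = defaultdict(list)
    # Process events in timestamp order, as a stream would deliver them.
    for user, ts, nbytes, _status in sorted(clicks, key=lambda c: c[1]):
        sessions = per_user[user]
        if sessions and ts - sessions[-1]["end"] <= gap_seconds:
            # Still within the inactivity gap: extend the current session.
            current = sessions[-1]
            current["end"] = ts
            current["hits"] += 1
            current["bytes"] += nbytes
        else:
            # First event for this user, or the gap was exceeded.
            sessions.append({"start": ts, "end": ts, "hits": 1, "bytes": nbytes})
    return dict(per_user)


def enrich_errors(clicks, status_table):
    """Keep error events (status >= 400) and attach a description from the table."""
    return [
        (user, status, status_table.get(status, "Unknown"))
        for user, _ts, _nbytes, status in clicks
        if status >= 400
    ]


if __name__ == "__main__":
    for user, sessions in sessionize(CLICKS, gap_seconds=30).items():
        print(user, sessions)
    print(enrich_errors(CLICKS, STATUS_TEXT))
```

In this sample, alice's click at 100 seconds arrives 90 seconds after her previous one, so it opens a second session, while bob's two clicks fall into a single session. A streaming engine maintains the same per-key state incrementally and continuously instead of over a finite list.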
Get started now with these instructions:
If you do not have Docker, you can also run an automated version of the Clickstream tutorial designed for local Confluent Platform installs. Running the Clickstream demo without Docker requires a local Confluent Platform installation, along with Elasticsearch and Grafana.
These examples demonstrate common KSQL operations.
You can configure Java streams applications to deserialize and ingest data in multiple ways, including Kafka console producers, JDBC source connectors, and Java client producers. For full code examples, see connect-streams-pipeline.