Tutorials and Resources

Demos

Using Confluent Platform to monitor Kafka streaming ETL deployments
This Docker-based demo shows you how to monitor Kafka streaming ETL deployments using Control Center. You can follow along with the playbook and watch the video tutorials.
Confluent Platform Demos and Code Example Repository
Demo applications and code examples for Confluent Platform and Apache Kafka.
Hybrid Kafka Clusters from Self-Hosted to Confluent Cloud
This Confluent Cloud demo is the automated version of the KSQL Tutorial, but instead of running KSQL stream processing on your local install, it runs on your Confluent Cloud cluster. You can follow along with the playbook.
Kafka Streams Example Repository
Demo applications and code examples for Apache Kafka’s Streams API.
Streaming ETLs using KSQL and Confluent Platform
End-to-end demo applications using KSQL, Confluent Replicator, Confluent Control Center, and more. Requires Confluent Platform local install.
Docker images for Confluent Platform
Docker images for deploying and running the Confluent Platform.
Ansible playbooks for Confluent Platform
Ansible playbooks for deploying the Confluent Platform. These playbooks have community support only and are not supported by Confluent.
Security Tutorial
This tutorial is a step-by-step guide to configuring the Confluent Platform with SSL encryption, SASL authentication, and authorization.
Designing Event Driven Systems (Microservices) Demo
This project goes hand in hand with the book ‘Designing Event Driven Systems’, demonstrating how you can build a small microservices application with Kafka and Kafka Streams.

Technical Blogs

The Confluent blog contains many technical blog posts, including:

Introducing Kafka Streams: Stream Processing Made Simple
Describes the major new feature in Apache Kafka v0.10: Kafka’s Streams API. The Streams API, available as a Java library that is part of the official Kafka project, is the easiest way to write mission-critical real-time applications and microservices with all the benefits of Kafka’s server-side cluster technology.
Putting Apache Kafka To Use: A Practical Guide to Building a Streaming Platform
A two-part post that discusses our experience with real-time data streams and what is required to be successful with this new area of technology. All of this is based on real experience: we spent the last five years building Apache Kafka, transitioning LinkedIn to a fully stream-based architecture, and helping a number of Silicon Valley tech companies do the same thing.
Introducing the Kafka Consumer: Getting Started with the New Apache Kafka 0.9 Consumer Client
This blog describes how we redesigned the producer and consumer clients to support many use cases that were hard or impossible with the old clients and establish a set of APIs we could support over the long haul.
How to choose the number of topics/partitions in a Kafka cluster?
This blog explains a few important determining factors and provides simple formulas to use when choosing the number of topics and partitions in a Kafka cluster.
Exactly-once Semantics are Possible: Here’s How Kafka Does it
This blog discusses what exactly-once semantics mean in Apache Kafka, why it is a hard problem, and how the new idempotence and transactions features in Kafka enable correct exactly-once stream processing using Kafka’s Streams API.
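
As a rough illustration of the kind of sizing formula discussed in the partition-count post above, the sketch below assumes the commonly cited rule of thumb that a topic needs at least max(t/p, t/c) partitions, where t is the target throughput, p is the measured producer throughput on a single partition, and c is the measured consumer throughput on a single partition. The function name, parameter names, and sample numbers are hypothetical and chosen only for this example; measure your own workload before relying on any result.

```python
import math


def min_partitions(target_mb_s: float, producer_mb_s: float, consumer_mb_s: float) -> int:
    """Rough lower bound on partition count: max(t/p, t/c), rounded up.

    target_mb_s   -- desired total throughput for the topic (t)
    producer_mb_s -- measured producer throughput on one partition (p)
    consumer_mb_s -- measured consumer throughput on one partition (c)
    All three names are assumptions made for this sketch.
    """
    if min(target_mb_s, producer_mb_s, consumer_mb_s) <= 0:
        raise ValueError("all throughput values must be positive")
    return math.ceil(max(target_mb_s / producer_mb_s, target_mb_s / consumer_mb_s))


# Hypothetical measurements: 100 MB/s target, 10 MB/s per partition when
# producing, 20 MB/s per partition when consuming.
print(min_partitions(100, 10, 20))  # prints 10
```

The result is only a throughput-based lower bound. Other factors, such as open file handles, end-to-end latency, and recovery time after a broker failure, can argue for fewer partitions.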

Video Tutorials and Screencasts

The Confluent YouTube channel is a great resource that includes tutorials, how-tos, customer use cases, and general Confluent education. The content is updated frequently, so subscribing is recommended.

Intro to Streams | Apache Kafka Streams API
The Streams API of Apache Kafka is the easiest way to write mission-critical real-time applications and microservices with all the benefits of Kafka’s server-side cluster technology. It allows you to build standard Java or Scala applications.
Intro to KSQL | Streaming SQL for Apache Kafka
KSQL is an open source streaming SQL engine that implements continuous, interactive queries against Apache Kafka. KSQL makes it easy to read, write, and process streaming data in real time, at scale, using SQL-like semantics.
Neha Narkhede | Kafka Summit 2018 Keynote (The Present and Future of the Streaming Platform)
Neha Narkhede is co-founder and CTO at Confluent. Prior to founding Confluent, Neha led streams infrastructure at LinkedIn, where she was responsible for the streaming infrastructure built on top of Apache Kafka® and Apache Samza.
Microservices Explained by Confluent
Microservices architectures enable organizations to evolve their systems away from the slow and unresponsive shared-state architectures of the past. Confluent provides a streaming platform for incorporating data in flight into a lightweight, efficient, and responsive microservices architecture.
Introducing the Confluent CLI
The Confluent Command Line Interface (CLI) helps you speed up development by making it easy to get up and running. You will be able to iterate more quickly when implementing your apps and interact with the entire Confluent ecosystem.

White Papers

You can find the Confluent white papers at https://www.confluent.io/resources/, including:

Confluent Enterprise Reference Architecture
This white paper describes the reference architecture of Confluent Platform, which is the most complete platform to build enterprise-scale streaming pipelines using Apache Kafka and to simplify the development of stream processing applications.
Optimizing Your Apache Kafka Deployment
This white paper discusses how to optimize your Apache Kafka deployment for various service goals, including throughput, latency, durability, and availability. It is intended for Kafka administrators and developers planning to deploy Kafka in production.
Microservices in the Apache Kafka Ecosystem
This white paper provides a brief overview of how microservices can be built in the Apache Kafka ecosystem.
Disaster Recovery for Multi-Datacenter Kafka Deployments
This white paper provides a practical guide to configuring multiple Apache Kafka clusters so that if a disaster scenario strikes, you have a plan for failover, failback, and ultimately successful recovery.

KSQL

Introducing KSQL: Open Source Streaming SQL for Apache Kafka
Learn about KSQL, the Confluent Platform’s streaming SQL engine for Apache Kafka. KSQL runs continuous queries, which are transformations that run persistently as new data passes through them, on streams of data in Kafka topics.
Real-Time Streaming ETL from Oracle Transactional Data
Replace batch extracts with event streams, and batch transformations with in-flight transformation of event streams. Take a stream of data from a transactional system built on Oracle, transform it, and stream it into Elasticsearch. Use KSQL to filter streams of events from a database in real time and to join events from two database tables to create rolling aggregates on the data.
Write a User Defined Function (UDF) for KSQL
Build, deploy, and test a user-defined function (UDF) to extend the set of available functions in your KSQL code. Write Java code within the UDF to convert a timestamp from String to BigInt.
Monitoring Kafka in Confluent Control Center
Use the KSQL CLI and Confluent Control Center to view streams and throughput of incoming records for persistent KSQL queries.