Confluent Cloud ksqlDB Quick Start

Easily build stream processing applications with simple, lightweight SQL syntax. Continuously transform, enrich, join, and aggregate your Kafka events without writing complex application code. As a fully managed service with a 99.9% uptime SLA, Confluent Cloud ksqlDB eliminates the operational overhead of provisioning and managing infrastructure, so you can focus on application development.

This quick start gets you up and running with Confluent Cloud ksqlDB. It shows how to create streams and tables, and how to write streaming queries on cloud-hosted ksqlDB.

Important

This quick start assumes that you have completed Quick Start for Apache Kafka using Confluent Cloud.

Tip

In Quick Start for Apache Kafka using Confluent Cloud, you installed a Datagen connector to produce data to the users topic in your Confluent Cloud cluster.

In this quick start, you perform the following steps.

  1. Create a ksqlDB application in Confluent Cloud.
  2. Create the pageviews topic.
  3. Produce pageview data to Confluent Cloud.
  4. Create a stream in the ksqlDB editor.
  5. Create a table in the ksqlDB editor.
  6. Write a persistent query.
  7. Monitor persistent queries.

Note

This quick start uses the Cloud Console to create a ksqlDB application. For an introduction that uses the ksqlDB CLI exclusively, see ksqlDB Quickstart for Confluent Cloud.

Create a ksqlDB application in Confluent Cloud

To write queries against streams and tables, create a new ksqlDB application in Confluent Cloud.

Note

You can also create a ksqlDB application by using the Confluent CLI. For more information, see the CLI installation guide.

  1. In the navigation menu, click ksqlDB.

    Screenshot of Confluent Cloud showing the ksqlDB Add Application page
  2. Click Create application myself. On the New application page, ensure that Global access is selected, and click Continue.

    Screenshot of Confluent Cloud showing the ksqlDB Add Application wizard

    Note

    To enable stricter access control for the new ksqlDB application, click Granular access instead and follow the on-screen steps.

  3. On the Configuration page, enter ksqldb-app1 for the Application name and click Launch application.

    Screenshot of Confluent Cloud showing the ksqlDB Add Application wizard

    The All ksqlDB Applications page opens, and the new ksqlDB application appears in the applications list with a Provisioning status.

    Note

    It may take a few minutes to provision the ksqlDB application. When ksqlDB is ready, its Status changes from Provisioning to Up.

  4. When the status changes to Up, the new ksqlDB application is ready to use.

    Screenshot of Confluent Cloud showing the ksqlDB Applications page

Create the pageviews topic

In Quick Start for Apache Kafka using Confluent Cloud, you created the users topic by using the Confluent Cloud UI. In this step, you create the pageviews topic the same way.

  1. In the navigation bar, click Topics, and in the Topics page, click Add a topic.

    Create topic page Confluent Cloud
  2. In the Topic name field, type “pageviews”. Click Create with defaults.

    Topic page in Confluent Cloud showing a newly created topic

The pageviews topic is created on the Kafka cluster and is available for use by producers and consumers.

Produce pageview data to Confluent Cloud

In this step, you create a Datagen connector for the pageviews topic, using the same procedure that you used to create DatagenSourceConnector_users.

  1. In the navigation bar, click Data integration > Connectors, and click Add connector.

  2. The connector requires an API key and secret to access your cluster. In the Kafka Cluster credentials section, click Generate Kafka API key & secret.

    Screenshot of Confluent Cloud showing the Create API key page

    Copy the key and secret to a local file and check I have saved my API key and secret and am ready to continue. The key and secret are also required for the Confluent CLI to access your cluster.

  3. Click the Datagen Source Connector tile and fill in the form with the following values.

    Name: enter “DatagenSourceConnector_pageviews”
    Which topic do you want to send data to?: select pageviews
    Output Messages: select JSON
    Quickstart: select PAGEVIEWS
    Max interval between messages: enter “100” for a 0.1-second interval
    Number of tasks for this connector: enter “1”

    When the form is filled in, it should resemble the following.

    Screenshot of Confluent Cloud showing the Add Datagen Source Connector page
  4. At the bottom of the form, click Next to review the details for your connector, and click Launch to start it.

    Screenshot of Confluent Cloud showing two running Datagen Source Connectors

    When the status of the new connector changes from Provisioning to Running, you have two producers sending event streams to topics in your Confluent Cloud cluster.

Create a stream and a table

To write streaming queries against the pageviews and users topics, register the topics with ksqlDB as a stream and a table. You can use the CREATE STREAM and CREATE TABLE statements in the ksqlDB Editor.

These examples query records from the pageviews and users topics using the following schema.

ER diagram showing a pageviews stream and a users table with a common userid column

Create a stream in the ksqlDB editor

You can create a stream or table by using the CREATE STREAM and CREATE TABLE statements in the ksqlDB Editor, similar to how you use them in the ksqlDB CLI.

  1. In the navigation bar, click ksqlDB.

  2. In the ksqlDB applications list, click ksqldb-app1.

  3. Copy the following code into the editor window and click Run query.

    CREATE STREAM pageviews_original (viewtime BIGINT, userid VARCHAR, pageid VARCHAR)
      WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE STREAM statement in Confluent Cloud
  4. In the editor window, use a SELECT query to inspect records in the pageviews stream.

    SELECT * FROM PAGEVIEWS_ORIGINAL EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query in Confluent Cloud
  5. The query continues until you end it explicitly. Click Stop to end the query.
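
Push queries like this one can also filter and limit their output, which is useful for spot-checking data without leaving a query running. The following sketch returns only events for a single page and stops automatically after three rows; the pageid value 'Page_10' is illustrative, so substitute a value that appears in your generated data.

    -- Return pageview events for one page only, then stop after three rows.
    -- 'Page_10' is an illustrative value; pick one from your own data.
    SELECT viewtime, userid, pageid
      FROM pageviews_original
      WHERE pageid = 'Page_10'
      EMIT CHANGES
      LIMIT 3;

Because of the LIMIT clause, this query ends on its own, so you don't need to click Stop.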

Create a table in the ksqlDB editor

Use the CREATE TABLE statement to register a table on a topic.

  1. Copy the following code into the editor window and click Run query.

    CREATE TABLE users (userid VARCHAR PRIMARY KEY, registertime BIGINT, gender VARCHAR, regionid VARCHAR) WITH
    (KAFKA_TOPIC='users', VALUE_FORMAT='JSON');
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE TABLE statement in Confluent Cloud
  2. In the editor window, use a SELECT query to inspect records in the users table.

    SELECT * FROM users EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query on a table in Confluent Cloud
  3. The query continues until you end it explicitly. Click Stop to end the query.

  4. Click Tables, and in the list, click USERS to open the details page.

    Screenshot of the ksqlDB Table summary page in Confluent Cloud
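
The same details are also available from the editor by using ksqlDB's metadata statements. As a sketch, SHOW TABLES lists the registered tables with their backing topics and value formats, and DESCRIBE shows the column names and types of a specific table.

    -- List all registered tables, including the backing topic and value format.
    SHOW TABLES;

    -- Show the column names and types of the users table.
    DESCRIBE users;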

Write a persistent query

With the pageviews topic registered as a stream, and the users topic registered as a table, you can write a streaming join query that runs until you end it with the TERMINATE statement.

  1. Copy the following code into the editor and click Run query.

    CREATE STREAM pageviews_enriched AS
    SELECT users.userid AS userid, pageid, regionid, gender
    FROM pageviews_original
    LEFT JOIN users
      ON pageviews_original.userid = users.userid
    EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE STREAM AS SELECT statement in Confluent Cloud
  2. To inspect your persistent queries, navigate to the Persistent queries page, which shows details about the pageviews_enriched stream that you created in the previous query.

    Screenshot of the ksqlDB Persistent Queries page in Confluent Cloud
  3. Click Explain query to see the schema and query properties for the persistent query.

    Screenshot of the ksqlDB Explain Query dialog in Confluent Cloud
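
You can also manage persistent queries from the editor. As a sketch, SHOW QUERIES lists the running persistent queries along with their generated IDs, and TERMINATE stops one by ID. The query ID below is illustrative; copy the actual ID from your SHOW QUERIES output, and run TERMINATE only when you are finished with the query, because the remaining steps monitor it.

    -- List persistent queries and their generated IDs.
    SHOW QUERIES;

    -- Stop a persistent query by ID (the ID shown is illustrative).
    TERMINATE CSAS_PAGEVIEWS_ENRICHED_5;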

Use Flow View to inspect data

To visualize data flow in your ksqlDB application, click the Flow tab to open the Flow View page.

Use Flow View to:

  • View the topology of your ksqlDB application.
  • Inspect the details of streams, tables, and the SQL statements that create them.
  • View events as they flow through your application.
ksqlDB application topology on the Flow View page in Confluent Cloud

Click the CREATE-STREAM node to see the query that you used to create the PAGEVIEWS_ENRICHED stream.

Details of a CREATE STREAM statement in the ksqlDB Flow View in Confluent Cloud

Click the PAGEVIEWS_ENRICHED node to see the stream’s events and schema.

Details of a stream in the ksqlDB Flow page in Confluent Cloud

Monitor persistent queries

You can monitor your persistent queries visually by using Confluent Cloud.

In the Data integration menu, choose Clients and click the Consumer lag tab. Find the consumer group that corresponds to your pageviews_enriched stream, for example, _confluent-ksql-pksqlc-lgwpnquery_CSAS_PAGEVIEWS_ENRICHED_5. This view shows how well your persistent query is keeping up with the incoming data.

Screenshot of the Consumer Lag page in Confluent Cloud