Quick Start for Confluent Cloud

Confluent Cloud is a resilient, scalable, streaming data service based on Apache Kafka®, delivered as a fully managed service. Confluent Cloud has a web interface called the Confluent Cloud Console, a local command line interface, and REST APIs. You can manage cluster resources, settings, and billing with the Cloud Console. You can use the Confluent CLI and REST APIs to create and manage Kafka topics and more.

This quick start gets you up and running with Confluent Cloud using a Basic Kafka cluster. The first section shows how to use Confluent Cloud to create topics, and produce and consume data to and from the cluster. The second section walks you through how to add ksqlDB to the cluster and perform queries on the data using a SQL-like syntax.

Section 1: Create a cluster, add a topic

Follow the steps in this section to set up a Kafka cluster on Confluent Cloud and produce data to Kafka topics on the cluster.

Note

Confluent Cloud Console includes an in-product tutorial that guides you through the basic steps for setting up your environment. To start the tutorial, log in and choose Learn.

Prerequisites

  • Access to a Confluent Cloud account. If you don’t have one, sign up at https://confluent.cloud.

Step 1: Create a Kafka cluster in Confluent Cloud

In this step, you create an environment, select a cloud provider, and then create and launch a basic Kafka cluster inside your new environment.

  1. Sign in to Confluent Cloud at https://confluent.cloud.

  2. Click Add cluster.

  3. On the Create cluster page, for the Basic cluster, select Begin configuration.

    Screenshot of Confluent Cloud showing the Create cluster page

    This example creates a Basic Kafka cluster, which supports single zone availability. For information about other cluster types, see Kafka Cluster Types in Confluent Cloud.

  4. On the Region/zones page, choose a cloud provider and a region, and select single availability zone.

    Screenshot of Confluent Cloud showing the Create Cluster workflow
  5. Select Continue.

    Note

    If you haven’t set up a payment method, the Set payment page appears. Enter a payment method and select Review, or select Skip payment.

  6. Specify a cluster name, review the configuration and cost information, and then select Launch cluster.

    Screenshot of Confluent Cloud showing the Create Cluster workflow

Depending on the cloud provider and other settings, it may take a few minutes to provision your cluster. When provisioning is complete, the Cluster Overview page appears.

Screenshot of Confluent Cloud showing Cluster Overview page

Now you can get started configuring apps and data on your new cluster.

Step 2: Create a Kafka topic

In this step, you create a users topic by using the Cloud Console. A Kafka topic is a unit of organization for a cluster, and is essentially an append-only log. For more about topics, see What is Apache Kafka.

  1. From the navigation menu, click Topics, and then click Create topic.

    Create topic page Confluent Cloud
  2. In the Topic name field, type “users” and then select Create with defaults.

    Topic page in Confluent Cloud showing a newly created topic

The users topic is created on the Kafka cluster and is available for use by producers and consumers.

The success message may prompt you to take an action, but you should continue with Step 3: Create a sample producer.

Step 3: Create a sample producer

You can produce example data to your Kafka cluster by using the hosted Datagen Source Connector for Confluent Cloud.

  1. From the navigation menu, select Connectors.

    To go directly to the Connectors page in Confluent Cloud, open https://confluent.cloud/go/connectors.

  2. In the Search box, type “datagen”.

    Screenshot that shows searching for the datagen connector
  3. From the search results, select the Datagen Source connector.

  4. On the Topic selection pane, select the users topic you created in the previous section and then select Continue.

  5. In the Kafka credentials pane, leave Global access selected, and click Generate API key & download. This creates an API key and secret that allows the connector to access your cluster, and downloads the key and secret to your computer.

    The key and secret are required for the connector and also for the Confluent CLI and ksqlDB CLI to access your cluster.

    Note

    An API key and associated secret apply to the active Kafka cluster. If you add a new cluster, you must create a new API key for producers and consumers on the new Kafka cluster. For more information, see Use API Keys to Control Access in Confluent Cloud.

  6. Enter “users” as the description for the key, and click Continue.

  7. On the Configuration page, select JSON for the output record value format, Users for template, and then click Continue.

  8. For Connector sizing, leave the slider at the default of 1 task and click Continue.

  9. On the Review and launch page, select the text in the Connector name box and replace it with “DatagenSourceConnector_users”.

  10. Click Continue to start the connector.

    The status of your new connector should read Provisioning, which lasts for a few seconds. When the status changes to Running, your connector is producing data to the users topic.

    Screenshot of Confluent Cloud showing a running Datagen Source Connector

Step 4: Consume messages

The Datagen connector is now producing messages to your users topic. Use the Confluent Cloud Console to see the data.

  1. From the navigation menu, select Topics to show the list of topics in your cluster.

    Screenshot of Confluent Cloud showing the Topics page
  2. Select the users topic.

  3. In the users topic detail page, select the Messages tab to view the messages being produced to the topic. The message viewer shows messages produced since the page was loaded, but it doesn’t show a historical view.

    Animated image of Confluent Cloud showing the Messages page
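
The message viewer is the quickest way to confirm that records are arriving. As an alternative, after you add ksqlDB to the cluster in Section 2, you can inspect the raw topic from the ksqlDB editor with a PRINT statement. A minimal sketch, assuming the users topic exists and a ksqlDB cluster is available:

    -- Show records from the users topic, starting with the earliest offset.
    -- LIMIT ends the statement after three records; omit it to keep printing.
    PRINT 'users' FROM BEGINNING LIMIT 3;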

Step 5: Enable a Stream Governance package

Now you will enable a Stream Governance package so that you can track the movement of data through your cluster. This enables you to see sources, sinks, and topics and monitor messages as they move from one to another.

If your organization already has Stream Governance enabled, you can skip to Step 6: Inspect the data stream.

  1. Click Stream Lineage in the navigation menu (it appears grayed out), and in the popup, click Stream Governance Package to go to the Stream Governance Packages page.

    To go directly to the Stream Governance Packages page in Confluent Cloud, open https://confluent.cloud/go/schema-registry.

  2. Under the Essentials package, choose Begin configuration.

  3. On the Enable Stream Governance Essentials page, choose a cloud provider and a free region for that provider, and then click Enable.

    For example, choose AWS and Ohio (us-east-2) for $0/hr.

  4. Now, click the cluster you created for the Quick Start.

  5. From the navigation menu in the cluster, click Topics and then click the users topic. In the users topic, click See in Stream Lineage at the top right. The stream lineage for the users topic is shown.

    Screenshot of Confluent Cloud showing the data streams page

Step 6: Inspect the data stream

Use Stream Lineage to track data movement through your cluster.

  1. Click Stream Lineage in the navigation menu.

  2. Click the node labeled DatagenSourceConnector_users, which is the connector that you created in Step 3. The details view opens, showing graphs for total production and other data.

    Screenshot of Confluent Cloud showing details for a source connector
  3. Dismiss the details view and select the topic labeled users. The details view opens, showing graphs for total throughput and other data.

    Screenshot of Confluent Cloud showing details for a topic
  4. Click the arrow on the left border of the canvas to open the navigation menu.

(optional) Step 7: Delete the connector and topic

Skip this step if you plan to move on to Section 2: Add ksqlDB to the cluster and learn how to use SQL statements to query your data.

If you don’t plan to complete Section 2 and you’re ready to quit the Quick Start, delete the resources you created to avoid unexpected charges to your account.

  • Delete the connector:
  1. From the navigation menu, select Connectors.
  2. Click DatagenSourceConnector_users and choose the Settings tab.
  3. Click Delete connector, enter the connector name (DatagenSourceConnector_users), and click Confirm.
  • Delete the topic:
  1. From the navigation menu, click Topics, select the users topic, and then choose the Configuration tab.
  2. Click Delete topic, enter the topic name (users), and select Continue.

Section 2: Add ksqlDB to the cluster

In Section 1, you installed a Datagen connector to produce data to the users topic in your Confluent Cloud cluster.

In this section, you create a ksqlDB cluster, create a stream and a table in that cluster, and write queries against them.

Note

This section uses the Cloud Console to create a ksqlDB cluster. For an introduction that uses the Confluent CLI exclusively, see ksqlDB Quickstart for Confluent Cloud.

Step 1: Create a ksqlDB cluster in Confluent Cloud

To write queries against streams and tables, create a new ksqlDB cluster in Confluent Cloud.

  1. Select the cluster you created in Section 1, and in the navigation menu, click ksqlDB.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster page
  2. Click Create cluster myself.

  3. On the New cluster page, ensure that Global access is selected, and click Continue.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster wizard
  4. On the Configuration page, enter ksqldb-app1 for the Cluster name. In the Cluster size dropdown, select 1, and keep the default configuration options. For more information on cluster sizes and configuration options, see Manage Billing in Confluent Cloud and Configuration Options.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster wizard
  5. Click Launch cluster. The ksqlDB clusters page opens, and the new cluster appears in the list with a Provisioning status. It may take a few minutes to provision the ksqlDB cluster. When the cluster is ready, its status changes from Provisioning to Up.

  6. The new ksqlDB cluster appears in the clusters list.

    Screenshot of the Confluent Cloud console showing the ksqlDB Clusters page

Step 2: Create the pageviews topic

In Section 1, you created the users topic by using the Cloud Console. In this step, you create the pageviews topic the same way.

  1. In the navigation menu, click Topics, and in the Topics page, click Add topic.

    Create topic page Confluent Cloud
  2. In the Topic name field, type “pageviews”. Click Create with defaults.

    Topic page in Confluent Cloud showing a newly created topic

The pageviews topic is created on the Kafka cluster and is available for use by producers and consumers.

Step 3: Produce pageview data to Confluent Cloud

In this step, you create a Datagen connector for the pageviews topic, using the same procedure that you used to create DatagenSourceConnector_users.

  1. In the navigation menu, select Connectors.

  2. In the Search connectors box, enter “datagen”.

  3. From the search results, select the Datagen Source connector.

    Screenshot that shows search results for the datagen connector
  4. On the Topic selection pane, select the pageviews topic you created in the previous section and click Continue.

  5. In the API credentials pane, leave Global access selected, and click Generate API key & download. This creates an API key and secret that allows the connector to access your cluster, and downloads the key and secret to your computer.

    The key and secret are required for the connector and also for the Confluent CLI and ksqlDB CLI to access your cluster.

    Note

    An API key and associated secret apply to the active Kafka cluster. If you add a new cluster, you must create a new API key for producers and consumers on the new Kafka cluster. For more information, see Use API Keys to Control Access in Confluent Cloud.

  6. Enter “pageviews” as the description for the key, and click Continue.

  7. On the Configuration page, select JSON_SR for the output record value format, Pageviews for template, and then click Continue.

    Selecting JSON_SR configures the connector to associate a schema with the pageviews topic and register it with Schema Registry. Currently, importing a topic as a stream works only for the JSON_SR format.

  8. For Connector sizing, leave the slider at the default of 1 task and click Continue.

  9. On the Review and launch page, select the text in the Connector name box and replace it with “DatagenSourceConnector_pageviews”.

  10. Click Continue to start the connector.

    The status of your new connector should read Provisioning, which lasts for a few seconds. When the status of the new connector changes from Provisioning to Running, you have two producers sending event streams to topics in your Confluent Cloud cluster.

    Screenshot of Confluent Cloud showing two running Datagen Source Connectors

Step 4: Create tables and streams

In the next two steps, you create a table for the users topic and a stream for the pageviews topic by using familiar SQL syntax. When you register a stream or a table on a topic, you can use the stream/table in SQL statements.

  • A table is a mutable collection that models change over time. Tables work by leveraging the keys of each row. If a sequence of rows shares a key, the last row for a given key represents the most up-to-date information for that key’s identity. A background process periodically runs and deletes all but the newest rows for each key.
  • A stream is an immutable append-only collection that represents a series of historical facts, or events. Once a row is inserted into a stream, the row can never change. You can append new rows at the end of the stream, but you can’t update or delete existing rows.

Together, tables and streams comprise a fully realized database. For more information, see Stream processing.
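
To make the difference concrete, here is a minimal sketch using a hypothetical clicks stream and a hypothetical profiles table (neither is part of this Quick Start). Inserting two rows with the same key appends two facts to a stream, but leaves only the newer row in a table:

    -- Hypothetical stream: every insert is a new, immutable event.
    INSERT INTO clicks (userid, pageid) VALUES ('User_1', 'Page_10');
    INSERT INTO clicks (userid, pageid) VALUES ('User_1', 'Page_20');
    -- A query over the stream sees both rows.

    -- Hypothetical table: the key determines row identity, so the second
    -- insert replaces the first row for key 'User_1'.
    INSERT INTO profiles (userid, regionid) VALUES ('User_1', 'Region_1');
    INSERT INTO profiles (userid, regionid) VALUES ('User_1', 'Region_9');
    -- The table's current state has a single row for 'User_1', with Region_9.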

These examples query records from the pageviews and users topics using the following schema.

ER diagram showing a pageviews stream and a users table with a common userid column

Step 5: Create a table in the ksqlDB editor

You can create a stream or table by using the CREATE STREAM and CREATE TABLE statements in the ksqlDB Editor, similar to how you use them in the ksqlDB CLI.

Use the CREATE TABLE statement to register a table on a topic.

  1. In the navigation menu, click ksqlDB.

  2. In the ksqlDB clusters list, click ksqldb-app1.

  3. Make sure the Editor tab is selected, copy the following code into the editor window, and click Run query.

    CREATE TABLE users (userid VARCHAR PRIMARY KEY, registertime BIGINT, gender VARCHAR, regionid VARCHAR) WITH
    (KAFKA_TOPIC='users', VALUE_FORMAT='JSON');
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE TABLE statement in Confluent Cloud

    To create a tab in the editor window on a Mac, press Option+Tab.

  4. Clear the editor window, and use the following SELECT query to inspect records in the users table. Click Run query.

    SELECT * FROM users EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query on a table in Confluent Cloud
  5. The query continues until you end it explicitly. Click Stop to end the query. (For a way to end a push query automatically, see the sketch after this procedure.)

  6. Click Tables, and in the list, click USERS to open the details page.

    Screenshot of the ksqlDB Table summary page in Confluent Cloud
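
The SELECT statement you ran in this procedure is a push query: EMIT CHANGES streams new rows to you as they arrive, which is why the query runs until you click Stop. If you want a query that ends on its own, add a LIMIT clause. A minimal sketch:

    -- Stream rows from the users table and stop automatically after five rows.
    SELECT * FROM users EMIT CHANGES LIMIT 5;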

Step 6: Create a stream in the ksqlDB editor

The Cloud Console automates registering a stream on a topic.

  1. Click Streams to view the currently registered streams.

  2. Click Import topics as streams.

    The Import dialog opens.

  3. Ensure that pageviews is selected, and then click Import to register a stream on the pageviews topic.

    The PAGEVIEWS stream is created and appears in the STREAM list.

    Note

    You can use the CREATE STREAM statement in the editor window to register a stream on a topic manually and specify the stream’s name.

    CREATE STREAM pageviews (viewtime bigint, userid varchar, pageid varchar) WITH
    (kafka_topic='pageviews', value_format='JSON_SR');
    
  4. Click Editor, and clear the editor window.

    To create a tab in the editor window on a Mac, press Option+Tab.

  5. Copy the following SELECT query into the editor to inspect records in the PAGEVIEWS stream, and click Run query.

    SELECT * FROM PAGEVIEWS EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query in Confluent Cloud
  6. The query continues until you end it explicitly. Click Stop to end the query.
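
Push queries aren't limited to SELECT *; standard clauses such as column projections and WHERE filters work as well. A minimal sketch, assuming pageid values like 'Page_10' from the Datagen Pageviews template (adjust the value to match the records you see):

    -- Project two columns and filter on a specific page.
    SELECT userid, pageid
    FROM PAGEVIEWS
    WHERE pageid = 'Page_10'
    EMIT CHANGES;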

Step 7: Write a persistent query

With the pageviews topic registered as a stream, and the users topic registered as a table, you can write a streaming join query that runs until you end it with the TERMINATE statement.

  1. Click Editor, clear the previous contents of the editor window, copy the following code into it, and click Run query.

    CREATE STREAM pageviews_enriched AS
    SELECT users.userid AS userid, pageid, regionid, gender
    FROM PAGEVIEWS
    LEFT JOIN users
      ON PAGEVIEWS.userid = users.userid
    EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE STREAM AS SELECT statement in Confluent Cloud

    To create a tab in the editor window on a Mac, press Option+Tab.

  2. To inspect your persistent queries, click the Persistent queries tab, which shows details about the pageviews_enriched stream that you created in the previous query.

    Screenshot of the ksqlDB Persistent Queries page in Confluent Cloud
  3. Click Explain query to see the schema and query properties for the persistent query.

    Screenshot of the ksqlDB Explain Query dialog in Confluent Cloud
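
You can also inspect and manage persistent queries from the Editor tab with SQL statements. A minimal sketch, assuming a query ID like CSAS_PAGEVIEWS_ENRICHED_5; the ID is assigned when the query is created, so list the actual IDs first with SHOW QUERIES:

    -- List all persistent queries and their IDs.
    SHOW QUERIES;

    -- Show the schema and query properties for one query.
    EXPLAIN CSAS_PAGEVIEWS_ENRICHED_5;

    -- Stop the persistent query. The pageviews_enriched stream and its
    -- underlying topic are not deleted; only the query ends.
    TERMINATE CSAS_PAGEVIEWS_ENRICHED_5;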

Step 8: Inspect data with the Flow view

Use the Flow view to monitor the topology of your application and inspect the details of streams, tables, and the SQL statements that create them. You can also track the flow of events through your application in real time, which makes it easier to identify potential issues and optimize your application’s performance.

  1. To visualize data flow in your ksqlDB application, click the Flow tab.

    ksqlDB application topology on the Flow View page in Confluent Cloud
  2. Click the CREATE-STREAM node to see the query that you used to create the PAGEVIEWS_ENRICHED stream.

    Details of a CREATE STREAM statement in the ksqlDB Flow View in Confluent Cloud
  3. Click the PAGEVIEWS_ENRICHED node to see the stream’s events and schema.

    Details of a stream in the ksqlDB Flow page in Confluent Cloud

Step 9: Monitor persistent queries

You can monitor your persistent queries visually using the Cloud Console.

  1. In the navigation menu, select Clients and click the Consumer lag tab.

  2. Find the group that corresponds with your pageviews_enriched stream, for example _confluent-ksql-pksqlc-lgwpnquery_CSAS_PAGEVIEWS_ENRICHED_5. This view shows how well your persistent query is keeping up with the incoming data.

    Screenshot of the Consumer Lag page in Confluent Cloud

Step 10: Delete the connectors and topics

When you are finished with the Quick Start, delete the resources you created to avoid unexpected charges to your account.

  • Delete the connectors:
  1. From the navigation menu, select Connectors.
  2. Click DatagenSourceConnector_users and choose the Settings tab.
  3. Click Delete connector, enter the connector name (DatagenSourceConnector_users), and click Confirm.
  4. Repeat these steps with the DatagenSourceConnector_pageviews connector.
  • Delete the topics:
  1. From the navigation menu, click Topics, select the users topic, and choose the Configuration tab.
  2. Click Delete topic, enter the topic name (users), and click Continue.
  3. Repeat these steps with the pageviews topic.