Quick Start for Confluent Cloud

This quick start gets you up and running with Confluent Cloud using a basic cluster. The first section shows how to use Confluent Cloud to create topics, how to produce data to the Confluent Cloud cluster, and how to consume data from the cluster. The second section walks you through how to add ksqlDB to the cluster and perform queries on the data using a SQL-like syntax.

Section 1: Create a cluster, add a topic

Follow the steps in this section to set up a Kafka cluster on Confluent Cloud and produce data to Kafka topics on the cluster.

Confluent Cloud is a resilient, scalable streaming data service based on Apache Kafka®, delivered as a fully managed service. Confluent Cloud has a web interface and a local command-line interface. You can manage cluster resources, settings, and billing with the web interface, and you can use the Confluent CLI to create and manage Kafka topics. Sign up for Confluent Cloud to get started.

For a list of interfaces and features of Confluent Cloud, see the Confluent Cloud documentation.

Note

Confluent Cloud Console includes an in-product tutorial that guides you through the basic steps for setting up your environment. This tutorial enables you to practice configuring Confluent Cloud components directly within the console. Log in to Confluent Cloud and follow the tutorial link, or click the LEARN button in the console to start the tutorial.

Prerequisites

  • Access to a Confluent Cloud account. If you don't have one, sign up for Confluent Cloud before you begin.

Step 1: Create a Kafka cluster in Confluent Cloud

In this step, you create an environment, select a cloud provider, and then create and launch a basic Kafka cluster inside your new environment.

  1. Sign in to Confluent Cloud at https://confluent.cloud.

  2. Click Add cluster.

  3. On the Create cluster page, for the Basic cluster, select Begin configuration.

    Screenshot of Confluent Cloud showing the Create cluster page

    This example creates a Basic cluster, which supports single zone availability. For information about Standard and Dedicated cluster types, see Confluent Cloud Features and Limits by Cluster Type.

  4. On the Region/zones page, choose a cloud provider and region, and then select single zone availability.

    Screenshot of Confluent Cloud showing the Create Cluster workflow
  5. Select Continue.

  6. On the Set payment page, take one of the following actions:

    • Enter payment card information and select Review
    • Select Skip payment
  7. Specify a cluster name, review the configuration and cost information, and then select Launch cluster.

    Screenshot of Confluent Cloud showing the Create Cluster workflow

Depending on the cloud provider and other settings, it may take a few minutes to provision your cluster. After the cluster is provisioned, the Cluster Overview page displays.

Now you can get started configuring apps and data on your new cluster.

Screenshot of Confluent Cloud showing Cluster Overview page

Tip

You can also create a cluster by using the REST API or the Confluent CLI. To create a cluster with the REST API, see Create a cluster. To create one with the Confluent CLI, use the confluent kafka cluster commands.

Step 2: Create a Kafka topic

In this step, you create a users topic by using the Cloud Console. A Kafka topic is a unit of organization for a cluster, and is essentially an append-only log. For more about topics, see What is Apache Kafka.

Tip

You can also create topics by using the Confluent CLI or the REST API. See Create a Topic.

  1. From the navigation menu, click Topics, and then click Create topic.

    Create topic page Confluent Cloud
  2. In the Topic name field, type “users” and then click Create with defaults.

    Topic page in Confluent Cloud showing a newly created topic

The users topic is created on the Kafka cluster and is available for use by producers and consumers.

Step 3: Create a sample producer

You can produce example data to your Kafka cluster by using the hosted Datagen Source Connector for Confluent Cloud.

  1. From the navigation menu, select Connectors.

  2. In the Search box, type “datagen”.

    Screenshot that shows searching for the datagen connector
  3. From the search results, select the Datagen Source connector.

  4. On the Topic selection pane, select the users topic you created in the previous section and then click Continue.

  5. In the Kafka credentials pane, leave Global access selected, and click Generate API key & download. This creates an API key and secret that allow the connector to access your cluster, and downloads the key and secret to your computer.

    The key and secret are required for the connector and also for the Confluent CLI and ksqlDB CLI to access your cluster.

    Note

    An API key and associated secret apply to the active Kafka cluster. If you add a new cluster, you must create a new API key for producers and consumers on the new Kafka cluster. For more information, see Use API Keys to Control Access in Confluent Cloud.

  6. Enter “users” as the description for the key, and click Continue.

  7. On the Configuration page, select JSON for the output record value format and Users for the template, and then click Continue.

  8. For Connector sizing, leave the slider at the default of 1 task and click Continue.

  9. On the Review and launch page, select the text in the Connector name box and replace it with “DatagenSourceConnector_users”.

  10. Click Continue to start the connector.

    The status of your new connector should read Provisioning, which lasts for a few seconds. When the status changes to Running, your connector is producing data to the users topic.

    Screenshot of Confluent Cloud showing a running Datagen Source Connector

Step 4: Consume messages

Your new users topic is now receiving messages from the connector. Use the Confluent Cloud Console to see the data.

  1. From the navigation menu, select Topics to show the list of topics in your cluster.

    Screenshot of Confluent Cloud showing the Topics page
  2. Select the users topic.

  3. In the users topic detail page, select the Messages tab to view the messages being produced to the topic. The message viewer shows messages produced since the page was loaded, but it doesn’t show a historical view.

    Animated image of Confluent Cloud showing the Messages page

Step 5: Inspect the data stream

Now you will enable a Stream Governance package so that you can track the movement of data through your cluster. This enables you to see sources, sinks, and topics and monitor messages as they move from one to another.

  1. Click Stream Lineage in the navigation menu (it will be grayed out), and click Stream Governance Package in the popup to go to the Stream Governance Packages page.

  2. Under the Essentials package, choose Begin configuration.

  3. On the Enable Stream Governance Essentials page, choose a cloud provider and a free region for that provider, and then click Enable.

    For example, choose AWS and Ohio (us-east-2) for $0/hr.

  4. Now, navigate back to the users topic page in your cluster and click See in Stream Lineage at the top right. The stream lineage for the users topic is shown.

    Screenshot of Confluent Cloud showing the data streams page
  5. Click the node labeled DatagenSourceConnector_users, which is the connector that you created in Step 3. The details view opens, showing graphs for total production and other data.

    Screenshot of Confluent Cloud showing details for a source connector
  6. Dismiss the details view and select the topic labeled users. The details view opens, showing graphs for total throughput and other data.

    Screenshot of Confluent Cloud showing details for a topic
  7. Click the arrow on the left border of the canvas to open the navigation menu.

(optional) Step 6: Delete the connector and topic

Skip this step if you plan to move on to Section 2: Add ksqlDB to the cluster and learn how to use SQL statements to query your data.

If you don’t plan to complete Section 2 and you’re ready to quit the Quick Start, delete the resources you created to avoid unexpected charges to your account.

  • Delete the connector:
    1. From the navigation menu, select Connectors.
    2. Click DatagenSourceConnector_users and choose the Settings tab.
    3. Click Delete connector, enter the connector name (DatagenSourceConnector_users), and click Confirm.
  • Delete the topic:
    1. From the navigation menu, click Topics, select the users topic, and then choose the Configuration tab.
    2. Click Delete topic, enter the topic name (users), and select Continue.

Section 2: Add ksqlDB to the cluster

In Section 1, you installed a Datagen connector to produce data to the users topic in your Confluent Cloud cluster.

In this section, you create a ksqlDB cluster, create a stream and a table in that cluster, and write queries against them.

Note

This section uses the Cloud Console to create a ksqlDB cluster. For an introduction that uses the Confluent CLI exclusively, see ksqlDB Quickstart for Confluent Cloud.

Step 1: Create a ksqlDB cluster in Confluent Cloud

To write queries against streams and tables, create a new ksqlDB cluster in Confluent Cloud.

  1. Select the cluster you created in Section 1, and in the navigation menu, click ksqlDB.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster page
  2. Click Create cluster myself.

  3. On the New cluster page, ensure that Global access is selected, and click Continue.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster wizard

    Note

    • To enable stricter access control for the new ksqlDB cluster, click Granular access and follow these steps.
    • If you see an alert that reads, “IMPORTANT: Confirm that the user or service account has the required privileges to access Schema Registry”, follow the steps in Enable ksqlDB integration with Schema Registry to configure your ksqlDB cluster to access Schema Registry.
  4. On the Configuration page, enter ksqldb-app1 for the Cluster name. In the Cluster size dropdown, select 1 and keep the default configuration options. For more information on cluster sizes and configuration options, see Confluent Cloud Billing and Configuration Options.

    Screenshot of Confluent Cloud showing the ksqlDB Add Cluster wizard
  5. Click Launch cluster. The ksqlDB clusters page opens, and the new cluster appears in the list with a Provisioning status. It may take a few minutes to provision the ksqlDB cluster. When the cluster is ready, its status changes from Provisioning to Up.

  6. The new ksqlDB cluster appears in the clusters list.

    Screenshot of the Confluent Cloud console showing the ksqlDB Clusters page

Step 2: Create the pageviews topic

In Section 1, you created the users topic by using the Cloud Console. In this step, you create the pageviews topic the same way.

  1. In the navigation menu, click Topics, and in the Topics page, click Add topic.

    Create topic page Confluent Cloud
  2. In the Topic name field, type “pageviews”. Click Create with defaults.

    Topic page in Confluent Cloud showing a newly created topic

The pageviews topic is created on the Kafka cluster and is available for use by producers and consumers.

Step 3: Produce pageview data to Confluent Cloud

In this step, you create a Datagen connector for the pageviews topic, using the same procedure that you used to create DatagenSourceConnector_users.

  1. In the navigation menu, select Connectors, and click Add connector.

  2. In the Search connectors box, enter “datagen”.

    Screenshot that shows searching for the datagen connector
  3. From the search results, select the Datagen Source connector.

  4. On the Topic selection pane, select the pageviews topic you created in the previous section and click Continue.

  5. In the API credentials pane, leave Global access selected, and click Generate API key & download. This creates an API key and secret that allow the connector to access your cluster, and downloads the key and secret to your computer.

    The key and secret are required for the connector and also for the Confluent CLI and ksqlDB CLI to access your cluster.

    Note

    An API key and associated secret apply to the active Kafka cluster. If you add a new cluster, you must create a new API key for producers and consumers on the new Kafka cluster. For more information, see Use API Keys to Control Access in Confluent Cloud.

  6. Enter “pageviews” as the description for the key, and click Continue.

  7. On the Configuration page, select JSON_SR for the output record value format and Pageviews for the template, and then click Continue.

    Note

    Selecting JSON_SR configures the connector to associate a schema with the pageviews topic and register it with Schema Registry. Currently, importing a topic as a stream works only for the JSON_SR format.

  8. For Connector sizing, leave the slider at the default of 1 task and click Continue.

  9. On the Review and launch page, select the text in the Connector name box and replace it with “DatagenSourceConnector_pageviews”.

  10. Click Continue to start the connector.

    The status of your new connector should read Provisioning, which lasts for a few seconds. When the status of the new connector changes from Provisioning to Running, you have two producers sending event streams to topics in your Confluent Cloud cluster.

    Screenshot of Confluent Cloud showing two running Datagen Source Connectors

Step 4: Create a table and a stream

In this step, you create a stream for the pageviews topic and a table for the users topic by using familiar SQL syntax. When you register a stream or a table on a topic, you can use the stream or table in SQL statements.

  • A stream is an immutable append-only collection that represents a series of historical facts, or events. Once a row is inserted into a stream, the row can never change. You can append new rows at the end of the stream, but you can’t update or delete existing rows.
  • A table is a mutable collection that models change over time. Tables work by leveraging the keys of each row. If a sequence of rows shares a key, the last row for a given key represents the most up-to-date information for that key’s identity. A background process periodically runs and deletes all but the newest rows for each key.

Together, streams and tables comprise a fully realized database. For more information, see Stream processing.
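
For example, once you have registered the PAGEVIEWS stream (Step 6 below), you could derive a table that keeps only the most recent page viewed by each user. The following is a hedged sketch for illustration only, not a quick start step; the table name latest_page_per_user is made up:

    -- Hypothetical example: derive a table from a stream so that only the
    -- newest row per key (userid) is kept.
    CREATE TABLE latest_page_per_user AS
      SELECT userid,
             LATEST_BY_OFFSET(pageid) AS last_pageid
      FROM PAGEVIEWS
      GROUP BY userid
      EMIT CHANGES;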

These examples query records from the pageviews and users topics using the following schema.

ER diagram showing a pageviews stream and a users table with a common userid column

Step 5: Create a table in the ksqlDB editor

You can create a stream or table by using the CREATE STREAM and CREATE TABLE statements in the ksqlDB Editor, similar to how you use them in the ksqlDB CLI.

Use the CREATE TABLE statement to register a table on a topic.

  1. In the navigation menu, click ksqlDB.

  2. In the ksqlDB clusters list, click ksqldb-app1.

  3. Make sure the Editor tab is selected, copy the following code into the editor window, and click Run query.

    CREATE TABLE users (userid VARCHAR PRIMARY KEY, registertime BIGINT, gender VARCHAR, regionid VARCHAR) WITH
    (KAFKA_TOPIC='users', VALUE_FORMAT='JSON');
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE TABLE statement in Confluent Cloud

    Note

    To create a tab in the editor window on a Mac, press Alt+Tab.

  4. Clear the editor window, and use the following SELECT query to inspect records in the users table. Click Run query.

    SELECT * FROM users EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query on a table in Confluent Cloud
  5. The query continues until you end it explicitly. Click Stop to end the query.

  6. Click Tables, and in the list, click USERS to open the details page.

    Screenshot of the ksqlDB Table summary page in Confluent Cloud
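
If you prefer to stay in the editor, you can also inspect the new table with metadata statements instead of the Tables page. This is a minimal sketch using standard ksqlDB statements:

    -- List all registered tables in the ksqlDB cluster.
    SHOW TABLES;
    -- Show the table's column names and types (add EXTENDED for topic and format details).
    DESCRIBE users;
    -- A push query with a LIMIT clause stops on its own after five rows.
    SELECT * FROM users EMIT CHANGES LIMIT 5;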

Step 6: Create a stream in the ksqlDB editor

The Cloud Console automates registering a stream on a topic.

  1. Click Streams to view the currently registered streams.

  2. Click Import topics as streams.

    The Import dialog opens.

  3. Ensure that pageviews is selected, and then click Import to register a stream on the pageviews topic.

    The PAGEVIEWS stream is created and appears in the Streams list.

    Note

    You can use the CREATE STREAM statement in the editor window to register a stream on a topic manually and specify the stream’s name.

    CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH
    (kafka_topic='pageviews', value_format='JSON_SR');
    
  4. Click Editor, and clear the editor window.

    Note

    To create a tab in the editor window on a Mac, press Alt+Tab.

  5. Copy the following SELECT query to inspect records in the PAGEVIEWS stream, and click Run query.

    SELECT * FROM PAGEVIEWS EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of a ksqlDB SELECT query in Confluent Cloud
  6. The query continues until you end it explicitly. Click Stop to end the query.
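
Because the Datagen connector registered a JSON_SR schema for the pageviews topic in Step 3, ksqlDB can also infer a stream's columns from Schema Registry instead of requiring you to list them. The following is a hedged alternative sketch, not an extra quick start step; the name pageviews_inferred is made up so that it doesn't collide with the imported PAGEVIEWS stream:

    -- Hypothetical alternative: omit the column list and let ksqlDB read the
    -- schema for the pageviews topic from Schema Registry.
    CREATE STREAM pageviews_inferred WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON_SR');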

Step 7: Write a persistent query

With the pageviews topic registered as a stream, and the users topic registered as a table, you can write a streaming join query that runs until you end it with the TERMINATE statement.

  1. Click Editor, clear the editor's previous contents, copy the following code into it, and click Run query.

    CREATE STREAM pageviews_enriched AS
    SELECT users.userid AS userid, pageid, regionid, gender
    FROM PAGEVIEWS
    LEFT JOIN users
      ON PAGEVIEWS.userid = users.userid
    EMIT CHANGES;
    

    Your output should resemble:

    Screenshot of the ksqlDB CREATE STREAM AS SELECT statement in Confluent Cloud

    Note

    To create a tab in the editor window on a Mac, press Alt+Tab.

  2. To inspect your persistent queries, click the Persistent queries tab, which shows details about the pageviews_enriched stream that you created in the previous query.

    Screenshot of the ksqlDB Persistent Queries page in Confluent Cloud
  3. Click Explain query to see the schema and query properties for the persistent query.

    Screenshot of the ksqlDB Explain Query dialog in Confluent Cloud
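
As noted at the start of this step, a persistent query runs until you terminate it. If you want to stop it from the editor rather than from the Persistent queries tab, the following sketch shows the statements involved; the query ID shown is a made-up example, so list the real IDs with SHOW QUERIES first:

    -- List running persistent queries and their IDs.
    SHOW QUERIES;
    -- Stop a persistent query by ID (this ID is hypothetical; use one from SHOW QUERIES).
    TERMINATE CSAS_PAGEVIEWS_ENRICHED_5;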

Step 8: Use Flow view to inspect data

To visualize data flow in your ksqlDB application, click the Flow tab.

Use the Flow view to:

  • View the topology of your ksqlDB application.
  • Inspect the details of streams, tables, and the SQL statements that create them.
  • View events as they flow through your application.
ksqlDB application topology on the Flow View page in Confluent Cloud

Click the CREATE-STREAM node to see the query that you used to create the PAGEVIEWS_ENRICHED stream.

Details of a CREATE STREAM statement in the ksqlDB Flow View in Confluent Cloud

Click the PAGEVIEWS_ENRICHED node to see the stream’s events and schema.

Details of a stream in the ksqlDB Flow page in Confluent Cloud

Step 9: Monitor persistent queries

You can monitor your persistent queries visually using the Cloud Console.

In the navigation menu, select Clients and click the Consumer lag tab.

Find the group that corresponds with your pageviews_enriched stream, for example _confluent-ksql-pksqlc-lgwpnquery_CSAS_PAGEVIEWS_ENRICHED_5. This view shows how well your persistent query is keeping up with the incoming data.

Screenshot of the Consumer Lag page in Confluent Cloud

Step 10: Delete the connectors and topics

When you are finished with the Quick Start, delete the resources you created to avoid unexpected charges to your account.

  • Delete the connectors:
    1. From the navigation menu, select Connectors.
    2. Click DatagenSourceConnector_users and choose the Settings tab.
    3. Click Delete connector, enter the connector name (DatagenSourceConnector_users), and click Confirm.
    4. Repeat these steps with the DatagenSourceConnector_pageviews connector.
  • Delete the topics:
    1. From the navigation menu, click Topics, select the users topic, and choose the Configuration tab.
    2. Click Delete topic, enter the topic name (users), and click Continue.
    3. Repeat these steps with the pageviews topic.
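
Deleting the connectors and topics doesn't remove the stream, table, and persistent query you created in the ksqlDB cluster, and the ksqlDB cluster continues to incur charges while it runs. If you also want to clean up the ksqlDB resources, the following sketch shows statements you can run in the ksqlDB editor (the TERMINATE ID is hypothetical; look up the real one with SHOW QUERIES):

    -- Stop the persistent query before dropping the stream it writes to.
    TERMINATE CSAS_PAGEVIEWS_ENRICHED_5;
    -- Drop the derived stream, the imported stream, and the table.
    DROP STREAM pageviews_enriched;
    DROP STREAM pageviews;
    DROP TABLE users;

You can then delete the ksqldb-app1 cluster from the ksqlDB page in the Cloud Console.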

Next steps