Create a Join Pipeline for Stream Designer on Confluent Cloud

The following steps show how to create a pipeline in Stream Designer that joins data from a stream and a table.

Step 1: Create a pipeline project

A Stream Designer pipeline project defines all the components that are deployed for an application. In this step, you create a pipeline project and a canvas for designing the component graph.

  1. Log in to the Confluent Cloud Console and open the Cluster Overview page for the cluster you want to use for creating pipelines.

  2. In the navigation menu, click Stream Designer.

  3. Click Create pipeline.

    The Create a new pipeline page opens.

    Stream Designer Create New Pipeline page in Confluent Cloud Console

Step 2: Create a pageviews connector definition

Your pipeline starts with data produced by the Datagen source connector. In this step, you create a pipeline definition for a connector that produces mock pageview data to a Kafka topic.

  1. Click Start with Connector and then click Start building.

    The Stream Designer canvas opens, with the Source Connector details view visible.

    Stream Designer initial view in Confluent Cloud Console
  2. In the Source Connector page, click the Datagen Source tile to open the Configuration page.

  3. In the Topic textbox, type “pageviews_topic”.

    Stream Designer Datagen source connector configuration in Confluent Cloud Console
  4. Click Continue to open the Kafka credentials page.

    Stream Designer Datagen source connector configuration in Confluent Cloud Console
  5. Ensure that the Global access tile is selected and click Generate API key & download to create the API key for the Datagen connector.

    A text file containing the newly generated API key and secret is downloaded to your local machine.

  6. Click Continue to configure the connector’s output.

  7. In the Select output record value format section, click JSON_SR, and in the Select a template section, click Pageviews.

    Stream Designer Datagen Source Connector configuration for mock pageviews data in Confluent Cloud Console
  8. Click Continue to open the Sizing page.

  9. In the Connector sizing section, leave the minimum number of tasks at 1.

  10. Click Continue to open the Review and launch page.

  11. In the Connector name textbox, enter “Datagen_pageviews” and click Continue.

    The Datagen source connector is configured and appears on the canvas with a corresponding topic component. The topic component is configured with the name you provided during connector configuration. Also, a stream is registered on the topic.

    Stream Designer Datagen Source Connector and topic in Confluent Cloud Console

Step 3: Register a stream on the pageviews topic

Stream Designer enables registering a stream on an underlying topic.

  1. In the Stream component, click Configure to open the stream configuration dialog.

  2. In the Name textbox, enter “pageviews_stream”.

  3. In the Value Format dropdown, select JSON_SR.

    Stream Designer and stream config dialog in Confluent Cloud Console
  4. Click Save.

    The Topic component updates with the stream name.

    Stream Designer and Datagen Source connector with topic and stream components in Confluent Cloud Console

Step 4: Create a users connector definition

In this step, you create a pipeline definition for a connector that produces mock user data to a Kafka topic.

In the Components menu, click Source Connector.

  1. In the Source Connector component, click Configure.

    The Source Connector page opens.

  2. In the search box, enter “datagen”.

    Stream Designer Datagen source connector search in Confluent Cloud Console
  3. Click the Datagen Source tile to open the Configuration page.

  4. In the Topic textbox, type “users_topic”.

    Stream Designer Datagen source connector configuration in Confluent Cloud Console
  5. Click Continue.

    The Kafka credentials page opens.

    Stream Designer Datagen source connector configuration in Confluent Cloud Console
  6. Ensure that the Global access tile is selected and click Generate API key & download to create the API key for the Datagen connector.

    A text file containing the newly generated API key and secret is downloaded to your local machine.

  7. Click Continue.

  8. In the Select output record value format section, click JSON, and in the Select a template section, click Users.

    Stream Designer Datagen Source Connector configuration for mock users data in Confluent Cloud Console
  9. Click Continue.

  10. In the Connector sizing section, leave the minimum number of tasks at 1 and click Continue.

  11. In the Connector name textbox, enter “Datagen_users” and click Continue.

    The Datagen source connector is configured and appears on the canvas with a corresponding topic component. The topic component is configured with the name you provided during connector configuration.
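
    The two connector definitions differ in only a few settings. As a quick cross-check, the sketch below records the choices made in Steps 2 and 4, keyed by the UI labels used in this tutorial. It is a plain summary structure, not a connector configuration file, and the helper name differing_settings is hypothetical.

```python
from typing import Dict, List

# Settings chosen in Steps 2 and 4, keyed by the UI labels used in this tutorial.
# This is a summary for comparison only, not a real connector configuration.
DATAGEN_CONNECTORS: Dict[str, Dict[str, object]] = {
    "Datagen_pageviews": {
        "Topic": "pageviews_topic",
        "Output record value format": "JSON_SR",
        "Template": "Pageviews",
        "Tasks": 1,
    },
    "Datagen_users": {
        "Topic": "users_topic",
        "Output record value format": "JSON",
        "Template": "Users",
        "Tasks": 1,
    },
}


def differing_settings(a: Dict[str, object], b: Dict[str, object]) -> List[str]:
    """Return the labels whose values differ between two connector summaries."""
    return sorted(label for label in a.keys() | b.keys() if a.get(label) != b.get(label))


if __name__ == "__main__":
    print(differing_settings(
        DATAGEN_CONNECTORS["Datagen_pageviews"],
        DATAGEN_CONNECTORS["Datagen_users"],
    ))
```

    Note that the value formats differ (JSON_SR for pageviews, JSON for users), which is why the stream and table in the following steps are configured with different Value Format settings.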

Step 5: Register a table on the users topic

Stream Designer enables registering a table on an underlying topic. In this step, you create a table named “users_table” that corresponds with “users_topic”.

  1. Right-click the Stream component and click Remove.

    Stream Designer initial view in Confluent Cloud Console
  2. Hover over the Topic component and click the + icon that appears near the center.

    A context menu opens.

    Stream Designer and topic context menu in Confluent Cloud Console
  3. In the context menu, click Table.

    A Table component appears within the Topic component, and the Table Configuration dialog opens.

    Stream Designer topic and table in Confluent Cloud Console
  4. In the Table Configuration dialog, enter “users_table” in the Name textbox.

  5. In the Value Format dropdown, select JSON.

  6. In the Columns for the table textbox, enter the following SQL:

    id VARCHAR PRIMARY KEY
    
  7. Click Add Columns for the table and in the textbox, enter the following SQL:

    REGISTERTIME BIGINT
    
  8. Repeat the previous step for the following column definitions:

    USERID STRING
    REGIONID STRING
    GENDER STRING
    

    Your table configuration should resemble:

    Stream Designer users table configuration in Confluent Cloud Console
  9. Click Save.

    The Topic component updates with the table name.

    The start of the pipeline is defined, with a pageviews stream and a users table fed by corresponding Datagen source connectors.

    Stream Designer with table and stream components in Confluent Cloud Console
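
Declaring id as the PRIMARY KEY is what distinguishes the table from a stream: a table is generally understood to hold the latest row for each key, while a stream keeps every record. The following is a minimal plain-Python sketch of that idea, not of how Stream Designer or ksqlDB implements tables. The column names come from the table configuration above; the class name KeyedTable and all sample values are hypothetical.

```python
from typing import Dict, Optional

Row = Dict[str, object]


class KeyedTable:
    """Keeps only the most recent row for each primary-key value."""

    def __init__(self, key_column: str) -> None:
        self.key_column = key_column
        self.rows: Dict[str, Row] = {}

    def upsert(self, row: Row) -> None:
        # A new row with an existing key replaces the previous row.
        self.rows[str(row[self.key_column])] = row

    def get(self, key: str) -> Optional[Row]:
        return self.rows.get(key)


# Sample values are made up for illustration.
users_table = KeyedTable(key_column="id")
users_table.upsert({"id": "User_1", "REGISTERTIME": 1500000000000,
                    "USERID": "User_1", "REGIONID": "Region_3", "GENDER": "OTHER"})
users_table.upsert({"id": "User_1", "REGISTERTIME": 1500000000000,
                    "USERID": "User_1", "REGIONID": "Region_7", "GENDER": "OTHER"})

if __name__ == "__main__":
    # Only one row remains for User_1, carrying the later REGIONID.
    print(users_table.get("User_1"))
```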

Step 6: Join the stream and table

With pageviews_stream and users_table defined, you can join them to produce an enriched stream in which each pageview message carries data about the corresponding user.

  1. In the Components menu, click Join.

    A Join query appears on the canvas, and the Configuration dialog opens.

  2. From users_table, drag an arrow to the Join component.

  3. From pageviews_stream, drag an arrow to the Join component.

    Stream Designer with a table/stream join in Confluent Cloud Console
  4. In the Join component, click Configure.

    The Join Configuration dialog opens.

  5. In the Reference to the left input source dropdown, select pageviews_stream.

  6. In the Alias of the left input source field, enter “p”.

  7. In the Reference to the input source dropdown, select users_table.

  8. In the Alias of the input source field, enter “u”.

  9. In the Join Type dropdown, select LEFT OUTER.

  10. In the Join on clause textbox, enter the following SQL:

    p.userid = u.id
    

    Your join configuration should resemble:

    Stream Designer with a table/stream join config in Confluent Cloud Console
  11. Click Save.

    The query component displays a red error triangle because it requires a stream, table, or another query component for its output. In the next step, you add a sink topic with a corresponding stream for the join output.
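
The join configured above can be pictured as a lookup: each pageview is matched against the current users row whose id equals the pageview's userid, and because the join type is LEFT OUTER, a pageview with no matching user still produces output, with empty user columns. The sketch below simulates this in plain Python under stated assumptions. It is not ksqlDB and does not reproduce its output column naming; the pageview field names (viewtime, userid, pageid), the p_ and u_ prefixes, and all sample values are assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Row = Dict[str, object]
# ("users", row) updates the table; ("pageviews", row) is a stream record.
Event = Tuple[str, Row]

# Subset of users columns carried into the output (hypothetical selection).
USER_COLUMNS = ("id", "regionid", "gender")


def left_join_stream_table(events: Iterable[Event]) -> Iterator[Row]:
    """Emit one joined row per pageview, using the users rows seen so far."""
    users: Dict[str, Row] = {}
    for source, row in events:
        if source == "users":
            # A table update changes future lookups but emits nothing itself.
            users[str(row["id"])] = row
            continue
        match: Optional[Row] = users.get(str(row["userid"]))
        joined: Row = {f"p_{name}": value for name, value in row.items()}
        for name in USER_COLUMNS:
            # LEFT OUTER: keep the pageview even when no user row matches.
            joined[f"u_{name}"] = match.get(name) if match is not None else None
        yield joined


EVENTS: List[Event] = [
    ("pageviews", {"viewtime": 1, "userid": "User_1", "pageid": "Page_10"}),
    ("users", {"id": "User_1", "regionid": "Region_3", "gender": "OTHER"}),
    ("pageviews", {"viewtime": 2, "userid": "User_1", "pageid": "Page_11"}),
]

if __name__ == "__main__":
    for joined_row in left_join_stream_table(EVENTS):
        print(joined_row)
```

In this sketch the first pageview arrives before any row exists for User_1, so its user columns are empty; the second pageview picks up the region and gender. With an inner join, the first pageview would be dropped instead.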

Step 7: Define the join topic

The join query requires a sink topic for the joined messages. In this step, you define a user_pageviews topic for the query results.

  1. Hover over the Join component and click +.

    A context menu appears showing the components that accept join results as an input.

    Stream Designer join query context menu in Confluent Cloud Console
  2. In the context menu, click Stream.

    A Topic component appears, and the stream configuration dialog opens.

  3. Name the stream “user_pageviews” and click Save.

  4. Click the topic component to open the configuration dialog.

  5. Name the topic “user_pageviews_topic”. Click Save.

    The join pipeline is ready to activate.

    Stream Designer join pipeline in Confluent Cloud Console
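
At this point the canvas forms a complete graph from the two connectors to the user_pageviews stream. The rule stated in Step 6, that a query component needs a stream, table, or another query for its output, can be sketched as a check over that graph. This is only an illustration of the rule, not Stream Designer's actual validation; the component label "join" and the function name queries_missing_output are hypothetical.

```python
from typing import Dict, List, Set

# Each component maps to the components it feeds, following Steps 2 through 7.
PIPELINE: Dict[str, List[str]] = {
    "Datagen_pageviews": ["pageviews_stream"],
    "Datagen_users": ["users_table"],
    "pageviews_stream": ["join"],
    "users_table": ["join"],
    "join": ["user_pageviews"],
    "user_pageviews": [],
}

# Components that are queries and therefore need an output component.
QUERIES: Set[str] = {"join"}


def queries_missing_output(graph: Dict[str, List[str]], queries: Set[str]) -> List[str]:
    """Return the query components that have no downstream component."""
    return sorted(q for q in queries if not graph.get(q))


if __name__ == "__main__":
    # The finished pipeline has no dangling queries.
    print(queries_missing_output(PIPELINE, QUERIES))
    # The state at the end of Step 6, before the sink stream was added.
    print(queries_missing_output({**PIPELINE, "join": []}, QUERIES))
```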

Step 8: Activate the pipeline

  1. Activate the pipeline. When all components show the Activated state, click the user_pageviews topic, and in the details view, click Messages.

    The joined messages appear, showing the pageviews stream enriched with data from the users table.

    Stream Designer and joined user-pageview messages in Confluent Cloud Console

Step 9: Deactivate the pipeline

To avoid incurring costs, deactivate the pipeline when you are finished with it. Deactivating deletes the resources that the pipeline created.

When you deactivate a pipeline, you have the option of retaining or deleting topics in the pipeline.

  1. Click the settings icon (settings-icon).

    The Pipeline Settings dialog opens.

  2. Click Deactivate pipeline to delete all resources created by the pipeline.

    The Revert pipeline to draft? dialog appears. Click the dropdowns to delete or retain the listed topics. For this example, keep the Delete settings.

    Stream Designer showing the Revert dialog in Confluent Cloud Console
  3. Click Confirm and revert to draft to deactivate the pipeline and delete topics.

Step 10: Delete the pipeline

When all components have completed deactivation, you can delete the pipeline safely.

  1. Click the settings icon.

    The Pipeline Settings dialog opens.

    Stream Designer showing filtered messages flowing in Confluent Cloud Console
  2. Click Delete pipeline. In the Delete pipeline dialog, enter “confirm” and click Confirm.

    The pipeline and associated resources are deleted. You are returned to the Pipelines list.