Stream Catalog User Guide

The key to unlocking the value within data in motion and increasing productivity across an organization is a self-service tool for data discovery.

Available through both the Cloud Console and APIs (Stream Catalog REST API and Stream Catalog GraphQL API), stream catalog lets users across teams collaborate within a centralized, organized library designed for sharing, finding, and understanding the data they need to move their projects forward quickly and efficiently. It’s like a digital library for data in motion: any user, experienced with Kafka or not, can search for what they need, find what’s already been built, and put it to use right away.

With classifications, teams can continually increase the value of the business’s stream catalog by adding contextual details to data; for example, by labeling schema fields as “PII” (personally identifiable information) or “Sensitive”.

Searching Data and Schemas

Stream catalog in Confluent Cloud centralizes all schema-related metadata and makes it available for discovery through the global search bar. In particular, you can search for schema subjects, records, and fields, as well as the tags applied to them.

Search with Filters

Click See all.. or press Return on your initial search results to view a more detailed list, with an option to apply more filters.

This view shows “All Results” for a search on stocks.

../_images/dg-global-search-list-all.png

You can filter this further by selecting an entity on the left menu; for example, “Schemas”.

../_images/dg-global-search-schemas-filtered-list.png

Also, you can use the filters across the top to filter by “Environment”, “Tag”, or “Entities”.

This example narrows the search result based on entity type to show only schema subject names that include stock.

../_images/dg-catalog-filter-by-schema-subject.png

This example filters on schema subjects, records, and fields in the demos environment that are tagged with my_stocks.

../_images/dg-catalog-filter-multiple.png
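
The way the search text and the filters narrow a result list can be sketched in a few lines of Python. This is an illustrative in-memory model only, not how the Cloud Console is implemented: the CatalogEntity record, its field names, the entity type strings, and the sample rows are all hypothetical, and the sketch assumes that the search text is a case-insensitive substring match and that every filter that is set must match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntity:
    """Hypothetical stand-in for one row in the search results."""
    name: str
    entity_type: str              # assumed values: "schema_subject", "record", "field"
    environment: str
    tags: frozenset = frozenset()


def filter_results(entities, text="", entity_types=None, environment=None, tag=None):
    """Keep entities that match the search text and every filter that is set."""
    matches = []
    for entity in entities:
        if text.lower() not in entity.name.lower():
            continue
        if entity_types is not None and entity.entity_type not in entity_types:
            continue
        if environment is not None and entity.environment != environment:
            continue
        if tag is not None and tag not in entity.tags:
            continue
        matches.append(entity)
    return matches


# Hypothetical sample data, loosely modeled on the screenshots above.
SAMPLE = [
    CatalogEntity("stocks-value", "schema_subject", "demos", frozenset({"my_stocks"})),
    CatalogEntity("StockTrade", "record", "demos", frozenset({"my_stocks"})),
    CatalogEntity("price", "field", "demos", frozenset({"my_stocks"})),
    CatalogEntity("stocks-value", "schema_subject", "staging"),
    CatalogEntity("employees-value", "schema_subject", "demos"),
]

# Only schema subject names that include "stock".
subjects = filter_results(SAMPLE, text="stock", entity_types={"schema_subject"})

# Schema subjects, records, and fields in the demos environment tagged my_stocks.
tagged = filter_results(
    SAMPLE,
    entity_types={"schema_subject", "record", "field"},
    environment="demos",
    tag="my_stocks",
)
```

Leaving a filter unset (None) means it does not restrict the results, which mirrors clearing a filter in the list view.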

You can change the search on this detailed list view to search for other names of a selected entity type. For example, if you worked through the steps in the Quick Start to create a schema for the employees topic, search for employees in schemas. (Make sure you clear any other filters like tags that may not match the employees schema.)

../_images/dg-global-search-schemas-filtered-list-new.png

Tagging Data and Schemas

Stream catalog enables entity tagging. Tags are searchable like any other entity.

A fundamental aspect of governance is the ability to organize data based on a shared vocabulary, including multiple concepts and categories. Confluent Cloud now provides the option to create and apply tags to schemas and fine-grained entities like data records and fields. On this version of Confluent Cloud, you can:

  • Create instances of provided tags (Public, Private, Sensitive, PII) and custom (“free form”) tags
  • Associate tags with schema versions, records, and fields
  • Apply multiple tags to a single field, record, or schema version

Important

Tag definitions created with the Confluent Cloud CATALOG API (V1) are currently not accessible through the Confluent Cloud Console for search, update, and so on. Tags created through the API must be managed through the API, as described in Stream Catalog REST API.

How tags work with schema versioning

When you apply tags, you always apply them to a particular version of a schema. As you modify schemas, they evolve to newer versions. Tags that you applied to previous versions of a schema are automatically propagated to new versions.

For example, if you applied the tag my_stocks starting with version 2 of a schema, that tag would propagate to versions 3, 4, and so on. Version 1, which never had the my_stocks tag applied, would remain untagged unless you went back and explicitly added the tag to version 1.
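
The propagation behavior described above can be modeled with a small Python sketch. The SchemaSubject class and its methods are hypothetical and exist only for illustration; the sketch also assumes that a new version inherits the tags of the most recent version at the time it is registered.

```python
class SchemaSubject:
    """Hypothetical in-memory model of a schema and its per-version tags."""

    def __init__(self):
        self.version_tags = {}  # version number -> set of tag names

    def register_version(self):
        """Add the next version, copying the tags of the latest existing version."""
        latest = max(self.version_tags, default=0)
        inherited = set(self.version_tags.get(latest, set()))
        self.version_tags[latest + 1] = inherited
        return latest + 1

    def apply_tag(self, version, tag):
        """Tag one specific version; earlier versions are left untouched."""
        self.version_tags[version].add(tag)


subject = SchemaSubject()
subject.register_version()           # version 1
subject.register_version()           # version 2
subject.apply_tag(2, "my_stocks")    # tag applied starting with version 2
subject.register_version()           # version 3 inherits my_stocks
subject.register_version()           # version 4 inherits my_stocks

for version, tags in sorted(subject.version_tags.items()):
    print(version, sorted(tags))
```

In this model, version 1 stays untagged until apply_tag(1, "my_stocks") is called explicitly.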

Create tags

To create a tag:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.

    ../_images/dg-tags-view-manage.png
  2. Click View & manage tags.

  3. Click Create tag, then select either free-form or commonly used tags.

    • If you chose commonly used tags, select one or more of the tags that you want to create: PII (personally identifiable information), Sensitive, Private, or Public.
    • If you chose free-form, provide a tag name and an optional description. Tag names must start with an alphabetic character, and can then include letters, numbers, and underscores.
    ../_images/dg-tags-name-rules.png

    When the name and description are properly filled in, the Create button becomes active. Click Create to create the tag with the current name and description.

    ../_images/dg-tags-create.png
  4. View available tags.

    After a tag is created, it shows in the list under Tag management.

    ../_images/dg-tags-management-list.png
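
If you generate free-form tag names in scripts, it can help to check them against the naming rule from step 3 before you create the tags. The following Python sketch is one reading of that rule, not the validation Confluent Cloud performs: the function name is hypothetical, and the sketch assumes that "alphabetic character" means an ASCII letter.

```python
import re

# Assumption: letters are limited to ASCII A-Z and a-z.
# Rule: a leading letter, followed by any number of letters, digits, or underscores.
TAG_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_tag_name(name: str) -> bool:
    """Return True if the whole name satisfies the naming rule."""
    return TAG_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my_stocks", "PII", "2fast", "_private", "my-stocks"]:
    print(candidate, is_valid_tag_name(candidate))
```

Using fullmatch rather than match ensures that a name with an invalid character in the middle or at the end is rejected.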

Apply tags

To apply a tag:

  1. Navigate to a schema, data record, or data field to which you want to apply a tag.

    There are a few different ways to do this:

    • From the Search bar, start typing the name of a schema, record, field, or topic where you want to apply a tag or business metadata.
    • From the same environment level view on the Schema Registry tab, click View & manage schemas and select a schema from the list.
    • From within a cluster, navigate to a topic, then click the Schema tab for that topic.
  2. On the Schemas tab for a topic, you can add tags as follows.

    • To add tags to the selected schema version as a whole, select Add tags to this version on the right panel, and select a tag from the drop-down list of available tags.
    ../_images/dg-tags-schema-version.png
    • To add tags to a selected record or field, expand the tree view of the schema (the default view), click the plus icon next to the entity, and select a tag from the drop-down list of available tags.
    ../_images/dg-tags-data-field.png
  3. View applied tags.

    Applied tags show next to the schema version, records, and fields with which they are associated.

Remove a tag from a schema version or data field

To remove a previously applied tag from an entity:

  1. Navigate to the schema that includes the tag.
  2. Click the x on the applied tag.

Edit a tag

To edit a tag description:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage tags.
  3. Select the tag you want to edit from the Tag management list.
  4. Click the edit icon next to the description, edit, and click Save.

Tip

You cannot edit the name of an existing tag, only its description. To rename a tag, remove it from any entities to which it is applied, delete the tag, and create a new one.

Delete a tag

If you want to delete a tag, first make sure that the tag is not currently applied to any entities. If the tag is in use, the delete operation will fail.

To delete a tag from an environment:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage tags.
  3. Select the tag you want to delete from the Tag management list.
  4. Click the trashcan icon in the upper right.
    • If the tag is in use (applied to one or more entities), you will get a warning and the tag will not be deleted.
    • If the tag is not in use, it is deleted.
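
The "remove first, then delete" rule can be illustrated with a small in-memory model in Python. The TagRegistry class, its methods, and the TagInUseError exception are hypothetical names invented for this sketch; they are not part of any Confluent API.

```python
class TagInUseError(Exception):
    """Raised when a tag that is still applied to an entity is deleted."""


class TagRegistry:
    """Hypothetical model of tag definitions and where they are applied."""

    def __init__(self):
        self.definitions = {}  # tag name -> description
        self.applied = {}      # tag name -> set of entity names

    def create(self, name, description=""):
        self.definitions[name] = description
        self.applied[name] = set()

    def apply(self, name, entity):
        self.applied[name].add(entity)

    def remove(self, name, entity):
        self.applied[name].discard(entity)

    def delete(self, name):
        """Delete a tag definition only if no entity still carries the tag."""
        if self.applied[name]:
            in_use = sorted(self.applied[name])
            raise TagInUseError(f"{name} is still applied to {in_use}")
        del self.definitions[name]
        del self.applied[name]


registry = TagRegistry()
registry.create("my_stocks", "Stock trade data")
registry.apply("my_stocks", "StockTrade")

try:
    registry.delete("my_stocks")          # fails: the tag is in use
except TagInUseError as error:
    print("Delete refused:", error)

registry.remove("my_stocks", "StockTrade")
registry.delete("my_stocks")              # succeeds: the tag is no longer applied
```

The same sequence (remove from all entities, delete, create again) is what renaming a tag requires, because an existing tag name cannot be edited.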

Search for entities with a given tag

As shown in Searching Data and Schemas, tags are now discoverable through the global search.

This means that you can search for a tag name (or part of the tag name), and the search will return all entities that have that tag applied. From there, you can drill down into the resource as with any other search.

For example, searching on stock returns the my_stocks and buy_stocks tags in the results.

../_images/dg-global-search-tags-my_stocks.png

Click one of these to get a list of all entities tagged accordingly.

For example, click my_stocks under Tag in the results to get a list of all entities tagged with my_stocks.

../_images/dg-global-search-tags-list-my_stocks.png
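
The two steps above (match tag names on partial text, then list the entities that carry a chosen tag) can be sketched in Python as follows. The TAG_INDEX dictionary and both function names are hypothetical, the entity names other than StockTrade and balance are made up, and case-insensitive substring matching is an assumption about how the search behaves.

```python
# Hypothetical index: tag name -> names of the entities that carry the tag.
TAG_INDEX = {
    "my_stocks": {"StockTrade", "stocks-value"},
    "buy_stocks": {"orders-value"},
    "PII": {"balance"},
}


def find_tags(tag_index, text):
    """Return the tag names that contain the search text, ignoring case."""
    return sorted(tag for tag in tag_index if text.lower() in tag.lower())


def entities_with_tag(tag_index, tag):
    """Return the names of all entities that have the given tag applied."""
    return sorted(tag_index.get(tag, ()))


print(find_tags(TAG_INDEX, "stock"))              # ['buy_stocks', 'my_stocks']
print(entities_with_tag(TAG_INDEX, "my_stocks"))  # ['StockTrade', 'stocks-value']
```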

Click an entity to drill down. For example, click StockTrade to drill down into the schema that has a field tagged with my_stocks.

Tip

You must switch to the tree view of the schema to see record and field level tags. The raw schema view, which is the default, does not show them. The tree and raw schema view buttons are on the top left next to the schema search field, as highlighted in the illustration below.

../_images/dg-global-search-tags-drilldown-my_stocks.png

Here is another example showing a more specific search for entities tagged PII (personally identifiable information).

../_images/dg-global-search-tags-pii.png

A search for the tag PII provides these search results.

../_images/dg-global-search-tags-list-pii.png

Scroll down to find balance.

../_images/dg-global-search-tags-balance-pii.png

Drill down on balance, which is a tagged field in the account-value schema. (Remember to switch to the tree view to see the record and field level tags.)

../_images/dg-global-search-tags-drilldown-pii.png

Business Metadata for Schemas

Important

This feature is available as a preview feature. A preview feature is a component of Confluent Cloud that is being introduced to gain early feedback. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. Your comments, questions, and suggestions are encouraged and can be submitted to stream-governance-preview@confluent.io.

Business metadata is a collection of attributes in the form of key-value pairs that provide more contextual information to entities across the platform. Suppose you want to document or find out:

  • Which team is responsible for a particular schema?
  • Which product domain does a schema belong to?
  • What is the GitHub location for a schema?

These are all examples of business metadata that owners can add to provide context around data, and that users can discover to augment their understanding of entities. You can assign business metadata to a schema.

For example, you could create a business metadata collection named “Domain” that includes the attributes Name, Team-owner, and Slack-contact. The business metadata collection is like a label for a group of attributes that you associate with stream catalog entities, either through the Confluent Cloud Console or the Stream Catalog REST API.
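
As a rough mental model, a business metadata collection is a named set of attribute keys, and applying it to an entity supplies values for some of those keys. The Python sketch below illustrates that idea only; the BusinessMetadataDefinition class and its apply_to method are hypothetical, the sample values are made up, and the attribute names are written with underscores so that they consist only of letters, digits, and underscores.

```python
from dataclasses import dataclass


@dataclass
class BusinessMetadataDefinition:
    """Hypothetical model of a business metadata label and its attribute keys."""
    name: str
    attributes: list
    description: str = ""

    def apply_to(self, values):
        """Build the key-value pairs to attach to an entity.

        Only attributes that belong to this definition are accepted.
        """
        unknown = sorted(set(values) - set(self.attributes))
        if unknown:
            raise ValueError(f"unknown attributes for {self.name}: {unknown}")
        return {"label": self.name, "values": dict(values)}


domain = BusinessMetadataDefinition(
    name="Domain",
    attributes=["Name", "Team_owner", "Slack_contact"],
    description="Which product domain and team own this schema",
)

# Made-up values for illustration.
applied = domain.apply_to({"Name": "payments", "Team_owner": "risk-engineering"})
print(applied)
```

Keeping the definition (the allowed keys) separate from the applied values mirrors the split between creating business metadata and applying it to a schema.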

How business metadata works with schema versioning

When you apply business metadata, you always apply it to a particular version of a schema. As you modify schemas, they evolve to newer versions. Business metadata that you applied to previous versions of a schema is automatically propagated to new versions.

For example, if you applied a location label and attributes starting with version 2 of a schema, that location metadata would propagate to versions 3, 4, and so on, but version 1, which never had that label applied, would not have any metadata unless you went back and explicitly added it to version 1.

Examples

To learn more about using business metadata in the context of a real-world use case, check out the Demo in the Stream Governance overview. You can tune in at about 6 minutes into the video for a cursory overview of the application being presented, followed by a discussion of how to add business metadata to the schemas.

Create business metadata and add attributes

To create business metadata:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.

    ../_images/dg-biz-metadata-view-manage.png
  2. Click View & manage business metadata.

  3. If this is the first time you’ve created business metadata on this cluster, click Get started.

    Otherwise, click Create business metadata.

    ../_images/dg-biz-metadata-create-new.png

    On already created metadata, there is also an option to add new attributes to the currently selected label. To do so, click Create attribute.

  4. Fill in values for the metadata label name, description, and attributes, then click Create.

    As with tags, the names of business metadata labels and attributes must start with a letter, followed by alphanumeric or underscore (_) characters.

    ../_images/dg-biz-metadata-create.png

    The metadata you created is listed, with its label name on the left menu.

    ../_images/dg-biz-metadata-displayed.png

View available business metadata

To view all available business metadata:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.

    ../_images/dg-biz-metadata-view-manage.png
  2. Click View & manage business metadata.

    ../_images/dg-biz-metadata-all-available.png

Apply business metadata to a schema

To apply business metadata to a schema:

  1. Navigate to a schema to which you want to apply business metadata.

    There are a few different ways to do this:

    • From the Search bar, start typing the name of a schema where you want to apply a tag or business metadata.
    • From the same environment level view on the Schema Registry tab, click View & manage schemas and select a schema from the list.
    • From within a cluster, navigate to a topic, then click the Schema tab for that topic.
  2. If needed, select the specific schema version to which you want to apply the business metadata.

    Tip

    The schema version is shown on the top left. By default, the latest (current) version is selected.

  3. Click Add business metadata on the schema Overview panel.

    ../_images/dg-biz-metadata-add-from-schema-overview.png
  4. On the Add business metadata dialog, select the business metadata and attributes to associate with the currently displayed schema version. Note that:

    • On this dialog, you have the option to apply multiple labels (business metadata) to this same schema version by clicking + Add business metadata at the bottom of the dialog.
    • You cannot create new business metadata from this dialog; only add already existing labels and attributes. If you want to create new labels and attributes, you must do so from the View & manage business metadata screen.

    When you have added all of the business metadata labels and attributes you want, click Continue to apply them to the selected schema version.

    ../_images/dg-biz-metadata-add-to-schema.png

    The business metadata you applied to this schema version is displayed on the lower right.

    ../_images/dg-biz-metadata-applied.png

Edit business metadata

To edit existing business metadata:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage business metadata.
  3. Select the label you want to edit from the list.
  4. Edit the description and/or add attributes.
  5. Click Save for each option.

Tip

You cannot delete attributes from an existing metadata definition (label); you can only add them. To reconstruct a business metadata definition, remove it from any entities to which it is applied, delete the definition, and create a new one that includes only the attributes you want.

Delete business metadata

If you want to delete a metadata group, first make sure that the label is not currently applied to any entities. If the label is in use, the delete operation will fail.

To delete a metadata group from an environment:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage business metadata.
  3. Select the group label you want to delete from the list.
  4. Click the trashcan icon in the upper right.
    • If the label is in use (applied to one or more entities), you will get a warning and the label will not be deleted.
    • If the label is not in use, it is deleted.

Search for labels

If you have a long list of business metadata labels, you might want to search for label names in the Search bar above the list. The predictive search shows matching labels as you type.