Important

Data Discovery is currently available on Confluent Cloud in an Early Access Program for early adopters. An early access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions. If you would like to participate in the Early Access Program, email data-governance-preview@confluent.io.

Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion.

Data Discovery (Early Access)

Data discovery in Confluent allows you to organize, discover, and understand the data available on the streaming platform. Think of it as a central data knowledge area where you can locate and access data in motion in a self-service way. Then imagine this knowledge is available to you from anywhere on the platform through simple or fine-grained, predictive searches.

Data discovery tools can be leveraged by multiple data personas within an organization:

  • A developer uses data discovery to search and discover data to build new streaming applications.
  • A data steward uses data discovery to define data classifications and apply them to entities such as fields, or to entire schemas and topics (not available during Early Access).

Besides data organization and discoverability, another key aspect of data discovery is the ability to provide a layer of governance that improves security and compliance with data regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).

Confluent Cloud offers global search across environments and clusters for various entity types including Kafka topics, connectors, and now schemas and related metadata.

Searching Data and Schemas

In Early Access, data discovery in Confluent Cloud centralizes all schema-related metadata and makes it available for discovery through the global search bar. In particular, with regard to schemas, you can search for:

  • Schema subject name
  • Data record name
  • Data field name
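
To make these three kinds of names concrete, the following minimal sketch walks an Avro schema and collects the names that fall into each category. It is an illustration only, not how Confluent Cloud indexes schemas: the `searchable_names` helper, the `StockTrade` and `Account` records, and all field names are hypothetical, and the sketch assumes an Avro schema and that nested records are treated the same way as top-level ones.

```python
import json


def searchable_names(subject, schema_json):
    """Collect the subject name plus every record and field name in an Avro schema."""
    names = {"subject": [subject], "records": [], "fields": []}

    def walk(node):
        if isinstance(node, list):
            # A union: visit each member type.
            for item in node:
                walk(item)
        elif isinstance(node, dict):
            node_type = node.get("type")
            if node_type == "record":
                names["records"].append(node["name"])
                for field in node.get("fields", []):
                    names["fields"].append(field["name"])
                    walk(field["type"])
            elif node_type == "array":
                walk(node.get("items"))
            elif node_type == "map":
                walk(node.get("values"))
            elif isinstance(node_type, (dict, list)):
                walk(node_type)
        # Plain strings are primitive or named type references; nothing to collect.

    walk(json.loads(schema_json))
    return names


# Hypothetical schema, for illustration only.
STOCKS_SCHEMA = """
{
  "type": "record",
  "name": "StockTrade",
  "fields": [
    {"name": "symbol", "type": "string"},
    {"name": "price", "type": "double"},
    {"name": "account", "type": {
      "type": "record",
      "name": "Account",
      "fields": [{"name": "account_id", "type": "string"}]
    }}
  ]
}
"""

names = searchable_names("stocks-value", STOCKS_SCHEMA)
for kind, values in names.items():
    print(f"{kind}: {values}")
```

Under these assumptions, typing `stocks-value` would match a schema subject name, `StockTrade` or `Account` a data record name, and `symbol` or `account_id` a data field name.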

To try it out, log in to Confluent Cloud in an environment where you have Schema Registry enabled, and start typing a schema subject, data record, or data field name into the search bar at the top. Results appear as you type. Press Enter to select an entity.

Note that you will get hits from other entities, such as topics, on the same search.

../_images/dg-global-search-schemas.png

Click a search result to navigate to the entity. For example:

  • Click stocks-value (listed as one of the subjects under Schema in the search hits) to navigate to the value schema for the stocks topic.
  • Click stocks (listed under Topic in the search hits) to navigate to the stocks Kafka topic.

If an entity of the same name exists on multiple clusters, all instances are shown with their associated clusters.

Click Show more.. for any entity type on the search results to view a more detailed list, filtered per your search.

../_images/dg-global-search-schemas-filtered-list.png

You can change the search on this detailed list view to search for other names of this entity type. For example, if you worked through the steps in the Quick Start to create a schema for the employees topic, search for employees in schemas.

../_images/dg-global-search-schemas-filtered-list-new.png

Tagging Data and Schemas

Also available in Early Access, the first layer of data catalog features enables tagging.

A fundamental aspect of governance is the ability to organize data based on a shared vocabulary, including multiple concepts and categories. Confluent Cloud now provides the option to create and apply tags to schemas and fine-grained entities like data records and fields. In this version of Confluent Cloud, you can:

  • Create instances of provided tags (Public, Private, Sensitive, PII) and custom (“free form”) tags
  • Associate tags with schema versions, records, and fields
  • Apply multiple tags to a single field, record, or schema version

Create tags

To create a tag:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.

    ../_images/dg-tags-view-manage.png
  2. Click View & manage tags.

  3. Click Create tag, then select either free-form or commonly used tags.

    • If you chose commonly used tags, select one or more of the tags that you want to create: PII (personally identifiable information), Sensitive, Private, or Public.
    • If you chose free-form, provide a tag name and an optional description. Tag names must start with an alphabetic character, and can then include alphabetic characters, numbers, and underscores.
    ../_images/dg-tags-name-rules.png

    When the name and description are properly filled in, the Create button becomes active. Click Create to create the tag with the current name and description.

    ../_images/dg-tags-create.png
  4. View available tags.

    After a tag is created, it shows in the list under Tag management.

    ../_images/dg-tags-management-list.png
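
The naming rule for free-form tags in step 3 can be checked before you type a name into the UI. The following is a minimal sketch of that rule as a regular expression. It assumes that "alphabetic" means the ASCII letters A-Z and a-z, and it does not model any length limit or other restriction the UI might enforce; `is_valid_tag_name` is a hypothetical helper, not part of any Confluent tool.

```python
import re

# Assumption: "alphabetic" means ASCII letters only.
# First character: a letter. Remaining characters: letters, digits, or underscores.
TAG_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_tag_name(name: str) -> bool:
    """Return True if the whole name satisfies the documented naming rule."""
    return TAG_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["PII", "customer_data_2", "2fast", "_internal", "has-dash", ""]:
    print(f"{candidate!r}: {is_valid_tag_name(candidate)}")
```

In this sketch, `PII` and `customer_data_2` pass, while names that start with a digit or an underscore, contain a hyphen, or are empty are rejected.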

Apply tags

To apply a tag:

  1. Navigate to the schema, data record, or data field to which you want to apply a tag.

    There are a few different ways to do this:

    • From the Search bar, start typing the name of a schema, record, field, or topic where you want to apply a tag.
    • From the same environment level view on the Schema Registry tab, click View and manage schemas and select a schema from the list.
    • From within a cluster, navigate to a topic that has a schema, then click the Schema tab for that topic.
  2. On the Schemas tab for a topic, you can add tags as follows.

    • To add tags to the selected schema version as a whole, select Add tags to this version on the right panel, and select a tag from the drop-down list of available tags.
    ../_images/dg-tags-schema-version.png
    • To add tags to a selected record or field, expand the tree view of the schema (the default view), click the plus icon next to the entity, and select a tag from the drop-down list of available tags.
    ../_images/dg-tags-data-field.png
  3. View applied tags.

    Applied tags show next to the schema version, records, and fields with which they are associated.

Remove a tag from a schema version or data field

To remove a previously applied tag from an entity:

  1. Navigate to the schema that includes the tag.
  2. Click the x on the applied tag.

Edit a tag

To edit a tag description:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage tags.
  3. Select the tag you want to edit from the Tag management list.
  4. Click the edit icon next to the description, edit, and click Save.

Tip

You cannot edit the name of an existing tag, only its description. To rename a tag, remove it from any entities to which it is applied, delete the tag, and create a new one.

Delete a tag

If you want to delete a tag, first make sure that the tag is not currently applied to any entities. If the tag is in use, the delete operation will fail.

To delete a tag from an environment:

  1. Navigate to the Schema Registry tab on the settings for a selected Environment.
  2. Click View & manage tags.
  3. Select the tag you want to delete from the Tag management list.
  4. Click the trash can icon in the upper right.
    • If the tag is in use (applied to one or more entities), you will get a warning and the tag will not be deleted.
    • If the tag is not in use, it is deleted.
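
The in-use rule above can be pictured with a small in-memory model. This is a sketch of the behavior described in this section, not Confluent's implementation or API: the `TagCatalog` class, the `TagInUseError` exception, and the entity identifier string are all hypothetical names chosen for illustration.

```python
class TagInUseError(Exception):
    """Raised when deleting a tag that is still applied to an entity."""


class TagCatalog:
    """Toy model: tags are created per environment and applied to entities."""

    def __init__(self):
        self.descriptions = {}  # tag name -> description
        self.applied = {}       # entity identifier -> set of tag names

    def create(self, name, description=""):
        self.descriptions[name] = description

    def apply(self, entity, name):
        if name not in self.descriptions:
            raise KeyError(f"unknown tag: {name}")
        # A set lets one entity carry several tags at once.
        self.applied.setdefault(entity, set()).add(name)

    def remove(self, entity, name):
        self.applied.get(entity, set()).discard(name)

    def entities_with(self, name):
        return sorted(e for e, tags in self.applied.items() if name in tags)

    def delete(self, name):
        in_use = self.entities_with(name)
        if in_use:
            # Mirrors the warning case: the tag is left untouched.
            raise TagInUseError(f"{name} is applied to {in_use}")
        del self.descriptions[name]


catalog = TagCatalog()
catalog.create("PII", "personally identifiable information")
field = "stocks-value/v1/account_id"  # hypothetical entity identifier
catalog.apply(field, "PII")

try:
    catalog.delete("PII")
    deleted_while_in_use = True
except TagInUseError:
    deleted_while_in_use = False

catalog.remove(field, "PII")  # detach the tag first
catalog.delete("PII")         # now the delete succeeds
print(deleted_while_in_use, "PII" in catalog.descriptions)
```

The model checks usage at delete time and refuses rather than silently detaching the tag, which matches the order of operations in this procedure: remove the tag from every entity first, then delete it.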

Current limitations

This Early Access release of tagging has limited functionality in these areas:

  • Currently, tags do not show up in searches.
  • There is no way to jump directly from the list of created or available tags (in the Environment level view under View & manage tags) to the entities where the tags are applied. The section “Entities with this tag” on tag cards is a placeholder for this.
  • There is no way to get a summary of the entities associated with particular tags as applied throughout the environment (this would be one function of having tags show up in the global search).