Important

Data Discovery is currently available on Confluent Cloud in an Early Access Program for early adopters. An early access feature is a component of Confluent Cloud introduced to gain feedback. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions. If you would like to participate in the Early Access Program, email data-governance-preview@confluent.io.

Data Discovery (Early Access)

Data discovery in Confluent allows you to organize, discover, and understand the data available on the streaming platform. Think of it as a central data knowledge area where you can locate and access data in motion in a self-service way. Then imagine this knowledge is available to you from anywhere on the platform through simple or fine-grained, predictive searches.

Data discovery tools can be leveraged by multiple data personas within an organization:

  • A developer uses data discovery to search and discover data to build new streaming applications.
  • A data steward uses data discovery to define data classifications and apply them to entities such as fields, or entire schemas and topics (not available during Early Access).

Besides data organization and discoverability, another key aspect of data discovery is the governance layer it provides, which helps improve security and compliance with data regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).

Confluent Cloud offers global search across environments and clusters for various entity types including Kafka topics, connectors, and now schemas and related metadata.

First Look

In Early Access, data discovery in Confluent Cloud centralizes all schema-related metadata and makes it available for discovery through the global search bar. For schemas in particular, you can search by:

  • Schema subject name
  • Schema record name
  • Schema field name
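To make the three kinds of names concrete, the following is a minimal local sketch, not the Confluent Cloud search implementation. It assumes an Avro record schema; the schema contents, the `searchable_names` and `search` helpers, and the case-insensitive prefix matching are all hypothetical choices made for illustration.

```python
import json

# Hypothetical Avro schema, assumed to be registered under the subject "stocks-value".
STOCKS_VALUE_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "StockTrade",
  "namespace": "example.stocks",
  "fields": [
    {"name": "symbol", "type": "string"},
    {"name": "price", "type": "double"},
    {"name": "quantity", "type": "int"}
  ]
}
""")


def searchable_names(subject, schema):
    """Collect the subject name, record name, and field names for one schema."""
    return {
        "subject": [subject],
        "record": [schema["name"]],
        "field": [field["name"] for field in schema.get("fields", [])],
    }


def search(term, subject, schema):
    """Case-insensitive prefix match of `term` against each kind of name.

    Prefix matching is an assumption made for this sketch; the actual
    matching rules of the global search bar are not described here.
    """
    term = term.lower()
    names = searchable_names(subject, schema)
    return {
        kind: [name for name in values if name.lower().startswith(term)]
        for kind, values in names.items()
    }


if __name__ == "__main__":
    print(search("sto", "stocks-value", STOCKS_VALUE_SCHEMA))
    print(search("pr", "stocks-value", STOCKS_VALUE_SCHEMA))
```

In this sketch, a single term can match more than one kind of name at once, which mirrors how one search can return hits under several headings.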

To try it out, log in to Confluent Cloud in an environment where Schema Registry is enabled, and start typing a schema subject, record, or field name into the search bar at the top. Results appear as you type. Press Enter to submit the term.

Note that you will get hits from other entities, such as topics, on the same search.

../_images/dg-global-search-schemas.png

Click a search result to navigate to the entity. For example:

  • Click stocks-value (listed as one of the subjects under Schema in the search hits) to navigate to the value schema for the stocks topic.
  • Click stocks (listed under Topic in the search hits) to navigate to the stocks Kafka topic.
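The relationship between the stocks-value subject and the stocks topic in the example above follows the `<topic>-key` / `<topic>-value` subject naming used by Schema Registry's default TopicNameStrategy. The sketch below assumes that default strategy is in effect; `subject_to_topic` is a hypothetical helper, and subjects created under other naming strategies will not follow this pattern.

```python
def subject_to_topic(subject):
    """Split '<topic>-key' or '<topic>-value' into (topic, part).

    Assumes the default TopicNameStrategy. Returns None when the subject
    does not end in '-key' or '-value'.
    """
    for part in ("key", "value"):
        suffix = "-" + part
        # Require a non-empty topic name in front of the suffix.
        if subject.endswith(suffix) and len(subject) > len(suffix):
            return subject[: -len(suffix)], part
    return None


if __name__ == "__main__":
    print(subject_to_topic("stocks-value"))
    print(subject_to_topic("stocks"))
```

A topic name may itself contain hyphens, so the helper only strips the final suffix rather than splitting on every hyphen.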

If an entity of the same name exists on multiple clusters, all are shown with associated clusters.

Click Show more.. for any entity type in the search results to view a more detailed list, filtered by your search term.

../_images/dg-global-search-schemas-filtered-list.png

You can change the search on this detailed list view to search for other names of this entity type. For example, if you worked through the steps in the Quick Start to create a schema for the employees topic, search for employees in schemas.

../_images/dg-global-search-schemas-filtered-list-new.png