Frequently Asked Questions for Schema Registry

Q&As

What are the minimum required RBAC roles for Schema Registry on Confluent Cloud?

  • The OrganizationAdmin, EnvironmentAdmin, and DataSteward roles have full access to Schema Registry operations.
  • Schema Registry also supports resource-level role-based access control (RBAC). You can grant a resource-level role, such as ResourceOwner, access to schema subjects within Schema Registry.

To learn more, see Access control (RBAC) for Confluent Cloud Schema Registry.

What RBAC roles are needed to change the Confluent Cloud Schema Registry mode (IMPORT, READONLY, and so on)?

  • Mode can be set at the Schema Registry level or at the subject level.
  • The OrganizationAdmin, EnvironmentAdmin, and DataSteward roles can set the mode at the Schema Registry level.
  • For individual subjects, setting the mode follows the same RBAC rules as setting the compatibility mode.

To learn more, see Access control (RBAC) for Confluent Cloud Schema Registry.

What RBAC roles are available for Stream Catalog?

To learn about RBAC and Stream Catalog on Confluent Cloud, see Access control (RBAC) for Stream Catalog.

How does RBAC work with Schema Registry on Confluent Platform?

To learn about RBAC and Confluent Platform, see Configuring Role-Based Access Control for Schema Registry.

How do you find and delete unused schemas?

To learn about managing storage, deleting schemas, and schema limits on Confluent Cloud, see the following sections:

How do you find schema IDs?

There are several ways to get schema IDs, including:

  • View schema IDs on the Confluent Cloud Console, or on Confluent Control Center (Legacy) in Confluent Platform.
  • Use the Confluent CLI: run confluent schema-registry schema describe, the output of which includes the ID of the specified schema.
  • Use the local Kafka scripts to print schema IDs with the consumer.
  • Use API calls to show schema IDs (see the sketch below).
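
To illustrate the API option, the following is a minimal sketch that looks up a schema ID through the Schema Registry REST API. The registry URL and the subject name orders-value are placeholders; on Confluent Cloud, authenticate with a Schema Registry API key and secret.

  import requests

  # Fetch the latest version registered under a subject; the response
  # includes the globally unique schema ID in the "id" field.
  # "http://localhost:8081" and "orders-value" are placeholder values.
  resp = requests.get(
      "http://localhost:8081/subjects/orders-value/versions/latest",
      # For Confluent Cloud, authenticate with a Schema Registry API key:
      # auth=("<SR_API_KEY>", "<SR_API_SECRET>"),
  )
  resp.raise_for_status()
  print(resp.json()["id"])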

Are there limits on the number of schemas you can maintain?

Confluent Cloud Schema Registry imposes limits on the number of schema versions supported in the registry, depending on the cluster type. When these limits are reached, you can identify unused schemas and free up storage space by deleting them. To learn more, see Delete Schemas and Manage Storage Space on Confluent Cloud.

There are no limits on schemas on self-managed Confluent Platform. To learn more about managing schemas on Confluent Platform, including soft and hard deletes, and schema versioning, see Schema Deletion Guidelines in the Confluent Platform documentation.

How do you delete schemas?

To learn about deleting schemas on Confluent Cloud, see Delete Schemas and Manage Storage Space on Confluent Cloud.

To learn how to delete schemas on Confluent Platform, see Schema Deletion Guidelines in the Confluent Platform documentation.

Can you recover deleted schemas?

You can recover soft-deleted schemas on both Confluent Cloud and Confluent Platform, as described in:

If you still have the schema definition for a hard-deleted schema that you want to recover, you can recover the schema using subject-level schema migration as a workaround. To learn how to do this, see Migrate an Individual Schema to an Already Populated Schema Registry (subject-level migration).
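
In outline, the workaround re-registers the saved schema definition under its original ID using IMPORT mode, roughly as in the following sketch. The registry URL, subject, schema file, ID, and version are placeholder values; follow the linked migration guide for the authoritative steps.

  import requests

  BASE = "http://localhost:8081"   # placeholder registry URL
  SUBJECT = "orders-value"         # placeholder subject name

  # 1. Put the subject into IMPORT mode so an explicit schema ID and version
  #    are accepted on registration (the subject must have no active versions,
  #    which is the case after a hard delete).
  requests.put(f"{BASE}/mode/{SUBJECT}", json={"mode": "IMPORT"}).raise_for_status()

  # 2. Re-register the saved schema definition under its original ID and
  #    version (42 and 1 are placeholder values here).
  with open("orders-value.avsc") as f:
      schema_str = f.read()
  requests.post(
      f"{BASE}/subjects/{SUBJECT}/versions",
      json={"schema": schema_str, "schemaType": "AVRO", "id": 42, "version": 1},
  ).raise_for_status()

  # 3. Return the subject to normal operation.
  requests.put(f"{BASE}/mode/{SUBJECT}", json={"mode": "READWRITE"}).raise_for_status()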

What are schema contexts and when should you use them?

A schema context is an ad hoc grouping of subject names and schema IDs. You can use a context name strategy to help organize your schemas, grouping logically related schemas together by name into what can be thought of as a sub-registry.

Schema IDs and subject names without explicit contexts are maintained in the default context. Subject names and IDs are unique per context, so you could have an unqualified subject :.:my-football-teams in the default context (the . represents the default context) and a qualified subject :.my-cool-teams:my-football-teams in the context :.my-cool-teams:, and they function as independent, unique subjects. The qualified and unqualified subjects could even have the same schema IDs and still be unique, by virtue of being in different contexts.
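
As a small illustration, both subjects can be addressed through the REST API using their qualified names; the registry URL and the example subjects below carry over from the paragraph above and are placeholders:

  import requests

  BASE = "http://localhost:8081"  # placeholder registry URL

  # The unqualified subject lives in the default context...
  default_ctx = requests.get(f"{BASE}/subjects/my-football-teams/versions/latest")

  # ...while the qualified subject lives in the :.my-cool-teams: context and is
  # looked up with its context-qualified name.
  custom_ctx = requests.get(
      f"{BASE}/subjects/:.my-cool-teams:my-football-teams/versions/latest"
  )

  # The two lookups can return different schemas, or even the same schema ID,
  # because subject names and IDs are scoped per context.
  print(default_ctx.json(), custom_ctx.json())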

There are a few use cases for contexts beyond simple organization, and more concepts and strategies for using them. You can leverage multi-context APIs and set up a context name strategy for Schema Registry clients to use.

Schema contexts are useful for Schema Linking, where they are used in concert with exporters, but you can also use them outside of Schema Linking if so desired. To learn more about schema contexts and how they work, see:

What is the advantage of using qualified schemas over schemas under the default context?

Schema Linking preserves schema IDs; therefore, if you export schemas to another cluster, you can copy them into non-default contexts to avoid ID collisions with schemas under existing contexts. Contexts also provide a way to separate different environments for schemas. For example, you could develop with schemas in a “developer” context and promote them to a “production” context when development is done.

To learn more, see:

Which clients can consume against the schema context?

All clients (Java, .NET, Spring Boot, and so on) can specify an explicit context as part of the Schema Registry URL; for example, http://mysr:8081/contexts/mycontext. Currently, only the Java client also passes the subject name when it looks up an ID. With the subject name, Schema Registry can find the correct context for the ID if it is not in the default context. This may be supported by the .NET and Python clients in future releases.
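
For example, a minimal sketch with the confluent-kafka Python client; the host, context name, and subject are placeholder values:

  from confluent_kafka.schema_registry import SchemaRegistryClient

  # Point the client at an explicit context by appending /contexts/<name>
  # to the Schema Registry URL (placeholder host and context shown here).
  client = SchemaRegistryClient({"url": "http://mysr:8081/contexts/mycontext"})

  # Lookups now resolve against subjects in "mycontext".
  latest = client.get_latest_version("my-football-teams-value")
  print(latest.schema_id)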

Does Schema Linking support mTLS?

Source and destination schema registries provide support for mTLS. Does Schema Linking also provide this support? If so, how do you provide certificates to connect with the source Schema Registry?

On Confluent Platform 7.1 and later, Schema Registry clients can accept certificates for mTLS authentication in PEM format.

Can the schema exporter use any set of valid certificates to authenticate with source and destination schema registries, or only default certificates?

Yes, any certificates can be passed.

How will Schema Linking be maintained across Confluent Platform version updates?

Any future changes to Schema Linking will be done in a backward-compatible manner.

How do you implement bi-directional Schema Linking?

Schema Linking is implemented in “push” mode; therefore, to achieve bi-directional Schema Linking, each side must run a schema exporter that pushes to the other side.
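
As a rough sketch of one direction, a schema exporter is created on the source registry and pointed at the destination; a mirror-image exporter created on the destination completes the bi-directional setup. The exporter name, context, destination URL, and credentials below are placeholders, and the request fields should be confirmed against the Schema Linking documentation.

  import requests

  SOURCE = "http://source-sr:8081"  # placeholder source registry

  # Create an exporter on the source registry that pushes matching subjects
  # to the destination registry.
  exporter = {
      "name": "to-destination",          # placeholder exporter name
      "subjects": ["*"],                 # export all subjects
      "contextType": "CUSTOM",
      "context": "from-source",          # placeholder destination context
      "config": {
          "schema.registry.url": "http://destination-sr:8081",
          # Destination credentials, if required (placeholders):
          # "basic.auth.credentials.source": "USER_INFO",
          # "basic.auth.user.info": "<key>:<secret>",
      },
  }
  requests.post(f"{SOURCE}/exporters", json=exporter).raise_for_status()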

What is the best way to set up a custom-managed application on Confluent Cloud Schema Registry with a proxy?

When custom-managed applications are configured to use Confluent Cloud Schema Registry through a proxy, configure the Schema Registry endpoint with a fully qualified domain name (FQDN), and resolve the FQDN to the proxy if necessary.

What are the requirements for using Data Contracts with Schema Registry?

Data Contracts require specific versions and packages:

General requirements:

  • Schema rules are only available on Confluent Enterprise and on Confluent Cloud with the Stream Governance “Advanced” package.
  • Schema rules are only supported in version 7.4 or above.

Confluent Cloud requirements:

  • Enable Schema Registry with the Advanced Stream Governance package.
  • See Choose a Stream Governance package and enable Schema Registry for Confluent Cloud.

Confluent Platform requirements:

  • Schema rules are only available on Confluent Enterprise (not on the Community edition).
  • Enable schema rules by adding the appropriate property to the Schema Registry configuration before starting (see the sketch below).
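
As a sketch of that last step only: enabling schema rules on Confluent Platform amounts to registering a rule-set resource extension in the Schema Registry properties file before startup. The exact property value below is an assumption to confirm against the linked Data Contracts documentation for your release.

  # schema-registry.properties (assumed class name; verify in the Data Contracts docs)
  resource.extension.class=io.confluent.kafka.schemaregistry.rulehandler.RuleSetResourceExtension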

To learn more, see Data Contracts for Schema Registry on Confluent Platform.

How do schema rules work with schema evolution and compatibility?

Schema rules are part of Data Contracts and work alongside schema evolution and compatibility checking. They allow you to:

  • Define domain validation rules for your schemas
  • Ensure data quality through field-level validation
  • Maintain data consistency across different systems
  • Support schema migration with validation rules

Rules are enforced during schema registration and can be configured to validate data at serialization or deserialization time.
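
For a concrete flavor, the following sketch registers a schema together with a single CEL condition rule through the REST API. The registry URL, subject, field, and rule expression are placeholders, and the exact rule fields should be confirmed against the Data Contracts documentation.

  import json
  import requests

  BASE = "http://localhost:8081"  # placeholder registry URL

  schema = {
      "type": "record",
      "name": "Order",
      "fields": [{"name": "amount", "type": "double"}],
  }

  # Register the schema with a domain rule that validates data on write.
  payload = {
      "schemaType": "AVRO",
      "schema": json.dumps(schema),
      "ruleSet": {
          "domainRules": [
              {
                  "name": "amountIsPositive",    # placeholder rule name
                  "kind": "CONDITION",
                  "type": "CEL",
                  "mode": "WRITE",
                  "expr": "message.amount > 0",  # placeholder CEL expression
              }
          ]
      },
  }
  requests.post(f"{BASE}/subjects/orders-value/versions", json=payload).raise_for_status()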

To learn more about schema evolution with rules, see Schema Evolution and Compatibility for Schema Registry on Confluent Platform.

What are the hardware requirements for Schema Registry in production?

Memory:

  • Schema Registry maintains in-memory indices for faster schema lookups.
  • A conservative upper bound for large companies is around 10,000 unique schemas.
  • With roughly 1000 bytes of heap overhead per schema, a 1 GB heap size is more than sufficient.

CPUs:

  • CPU usage in Schema Registry is light.
  • The most intensive task is schema compatibility checking, which is an infrequent operation.
  • Choose more cores over faster CPUs for better concurrency.

Disks:

  • Schema Registry has no disk-resident data; it uses Kafka as a commit log for durable storage.
  • The only disk usage is for log4j logs.

Network:

  • A fast and reliable network is important (1 GbE or 10 GbE is sufficient).
  • Avoid clusters that span multiple data centers or large geographic distances.
  • Low latency helps with node communication.

To learn more, see Deploy Schema Registry in Production.

How much memory does Schema Registry typically use?

Schema Registry uses Kafka as a commit log to store schemas durably and maintains in-memory indices for fast lookups. Memory usage is typically very light:

  • Large organizations might have around 10,000 unique schemas
  • Each schema requires roughly 1000 bytes of heap overhead
  • Therefore, 1GB of heap memory is more than sufficient for most deployments
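
As a rough worked estimate: 10,000 schemas × 1,000 bytes ≈ 10 MB of schema index data, which fits comfortably in a 1 GB heap even after accounting for JVM and per-request overhead.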

The in-memory indices are used to make schema lookups faster, while the actual durable storage happens through Kafka.

How do you configure OAuth authentication for Schema Registry?

Schema Registry supports OAuth authentication for secure access. OAuth configuration involves:

  • Setting up OAuth providers and client credentials
  • Configuring Schema Registry to validate OAuth tokens
  • Setting up client applications to authenticate using OAuth

For detailed configuration steps, see Configure OAuth for Schema Registry on Confluent Cloud and Configure OAuth for Schema Registry.
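
While the linked guides cover the Confluent-specific properties, the overall flow on the client side can be sketched as follows: obtain a token from the identity provider, then present it as a bearer token on Schema Registry requests. The token endpoint, client credentials, scope, and registry URL below are placeholder values.

  import requests

  # 1. Obtain an access token from the OAuth identity provider using the
  #    client-credentials grant (all values below are placeholders).
  token_resp = requests.post(
      "https://idp.example.com/oauth2/token",
      data={
          "grant_type": "client_credentials",
          "client_id": "<CLIENT_ID>",
          "client_secret": "<CLIENT_SECRET>",
          "scope": "schema-registry",
      },
  )
  token_resp.raise_for_status()
  token = token_resp.json()["access_token"]

  # 2. Call Schema Registry with the bearer token.
  subjects = requests.get(
      "https://my-schema-registry.example.com/subjects",
      headers={"Authorization": f"Bearer {token}"},
  )
  print(subjects.json())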

What is passwordless authentication for Schema Registry?

Passwordless authentication allows clients to authenticate to Schema Registry without traditional username/password credentials. This typically involves:

  • Certificate-based authentication
  • Token-based authentication
  • Integration with identity providers

To learn more about passwordless authentication options, see Passwordless authentication for Schema Registry.

When should I use GraphQL API vs REST API for Stream Catalog?

Both APIs are available for Stream Catalog, but they serve different purposes:

GraphQL API:

  • Preferred for search operations.
  • Can search across entity relationships.
  • Provides declarative data fetching.
  • Offers a more natural way to explore the catalog graph.
  • Currently does not support searching by business metadata attributes.

REST API:

  • Supports full CRUD operations (create, read, update, delete).
  • Required for managing business metadata attributes.
  • Good for simple operations and for integration with existing REST-based systems.

Recommendation: Use GraphQL for search and exploration, REST for entity management and business metadata operations.

To learn more, see Stream Catalog GraphQL API Usage and Stream Catalog REST API Usage.

What are the different Stream Governance packages available?

Stream Governance packages determine which features are available in your Confluent Cloud environment:

  • Essentials package: Basic Stream Governance capabilities, such as Stream Catalog, Stream Lineage, and Data Portal.
  • Advanced package: Full Stream Governance features, including Data Contracts, schema rules, and advanced Stream Catalog features.

Different features are available depending on your package level. Data Contracts and schema rules require the Advanced package.

To learn more about packages and features, see Stream Governance Packages.

How does the Data Portal access request workflow work?

The Data Portal provides a self-service interface for discovering and accessing Kafka topics:

For data users:

  • Search for and discover existing topics using metadata.
  • Request access to topics through an approval workflow.
  • View and use data once access is granted.

For data owners and admins:

  • Receive and manage access requests.
  • Approve or deny requests based on business rules.
  • Maintain topic metadata to improve discoverability.

Note: The topic access request workflow is not available for topics on Basic clusters.

To learn more, see Data Portal on Confluent Cloud.

How does Stream Lineage help track data?

Stream Lineage provides visibility into data flow and transformations across your streaming applications:

  • Track data origins: See where your data comes from
  • Understand transformations: View how data changes as it moves through your pipeline
  • Impact analysis: Understand downstream effects of schema or data changes
  • Compliance and governance: Maintain audit trails for data usage

Stream Lineage integrates with Stream Catalog and Data Portal to provide a comprehensive view of your data ecosystem.

To learn more, see Track Data with Stream Lineage.