Manage Schemas in Confluent Cloud¶
Looking for Confluent Platform Schema Management docs? You are currently viewing Confluent Cloud documentation. If you are looking for Confluent Platform docs, check out Schema Management on Confluent Platform.
Tip
Try out the embedded Confluent Cloud interactive tutorials! Want to jump right in? Follow this link to sign up or sign in to Confluent Cloud, and try out the guided workflows directly in Confluent Cloud.
Schema Management is fully supported on Confluent Cloud with the per-environment, hosted Schema Registry, and is a key element of Stream Governance on Confluent Cloud.
Working with schemas¶
You can manage schemas for topics in Confluent Cloud.
The following sections assume you have Schema Registry enabled. If you are just getting started with schemas, first see Quick Start for Schema Management on Confluent Cloud to learn how to enable Schema Registry and try out example workflows for basic tasks.
- View a schema
- Create a topic schema
- Derive a schema from messages
- Working with schema references
- Editing schemas
- Comparing schema versions
- Changing subject level (per topic) compatibility mode of a schema
- Searching for schemas and fields
- Tagging schemas and fields
- Downloading a schema
- Deleting a schema
- Identifying and deleting unused schemas
- Recovering a soft-deleted schema
- Managing schemas for an environment
- Access Control (RBAC) for Confluent Cloud Schema Registry
- Supported features and limits for Confluent Cloud Schema Registry
View a schema¶
View the schema details for a specific topic.
Search for schemas¶
You can also find and view schemas by searching for them. Searches are global; that is, they span across environments and clusters.
Start typing the name of a schema subject, data record, or data field name into the search bar at the top. You will get results as you type, including for other entities like topics.
Hit enter to select an entity like a schema.
To learn more, see Searching Data, Schemas, and Topics.
Find schemas from the Environment view¶
Navigate to an environment you want to work with, and click to select it as the current environment.
Click Schemas on the right panel to get a list of all schemas in the environment.
Browse the list to find the schema you want to view.
Click a schema in the list to view it.
By default, the schema is shown in tree view. (To learn more, see Tree view and code view.)
Tree view and code view¶
Two different types of views are available for schemas:
- tree view
- editable code view
To switch between the views, click the buttons to the left of the schema-level search box.
By default, schemas are displayed in a tree view which allows you to understand the structure of the schema and navigate the hierarchy of elements and sub-elements.
In the tree view you can:
- Use the arrows to the left of an element to expand it and view sub-elements.
- Apply and manage available tags as described in Tagging Data and Schemas.
In edit mode (the “code view”), you can create and edit schemas as described in the sections below.
Create a topic schema¶
Create key and value schemas. Value schemas are typically created more frequently than key schemas.
Best practices:
- Provide default values for fields to facilitate backward-compatibility if pertinent to your schema.
- Document at least the more obscure fields for human-readability of a schema.
Tip
You can also create schemas from the unified Confluent CLI, as described in the Create a Schema section in the Quick Start. A handy commands reference is here.
Create a topic value schema¶
From the navigation menu, click Topics, then click a topic to select it (or create a new one).
Click the Schema tab.
Click Set a schema. The Schema editor appears.
Select a schema type: JSON, Avro, or Protobuf. (The default is Avro.)
The basic structure of a schema appears prepopulated in the editor as a starting point. Enter the schema in the editor:
- `name`: Enter a name for the schema if you do not want to accept the default, which is determined by the subject name strategy. The default is `schema_type_topic_name`. Required.
- `type`: Either `record`, `enum`, `union`, `array`, `map`, or `fixed`. (The type `record` is specified at the schema’s top level and can include multiple fields of different data types.) Required.
- `namespace`: Fully qualified name to prevent schema naming conflicts. String that qualifies the schema `name`. Optional but recommended.
- `fields`: JSON array listing one or more fields for a record. Required.

Each field can have the following attributes:

- `name`: Name of the field. Required.
- `type`: Data type for the field. Required.
- `doc`: Field metadata. Optional but recommended.
- `default`: Default value for a field. Optional but recommended.
- `order`: Sorting order for a field. Valid values are `ascending`, `descending`, or `ignore`. Default: `ascending`. Optional.
- `aliases`: Alternative names for a field. Optional.
For example, you could add the following simple schema.

{
  "type": "record",
  "name": "value_my_new_widget",
  "fields": [
    { "name": "name", "type": "string" }
  ]
}

Once entered, the schema displays in the Confluent Cloud editor.
In edit mode, you have options to:
- Validate the schema for syntax and structure before you create it.
- Add schema references with a guided wizard.
Click Create.
If the entered schema is valid, it is saved, a Schema updated message is briefly displayed in the banner area, and the schema is shown in tree view.
If the entered schema is invalid, parse errors are highlighted in the editor (as in this example where a curly bracket was left off). If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.
If applicable, repeat the procedure as appropriate for the topic key schema.
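If you prefer to script this step instead of using the Console, the same value schema can be registered through the Schema Registry REST API. The following is a minimal sketch using Python and the `requests` library; the endpoint URL, API key, and secret are placeholders you would replace with your own values, and the subject name assumes the default TopicNameStrategy (`<topic>-value`).

```python
import json
import requests

# Hypothetical Confluent Cloud Schema Registry endpoint and API credentials;
# substitute your own values.
SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")

# The same simple value schema shown above, serialized as a string.
value_schema = {
    "type": "record",
    "name": "value_my_new_widget",
    "fields": [{"name": "name", "type": "string"}],
}

# Register the schema under the subject derived from TopicNameStrategy
# (<topic>-value), assuming a topic named "my_new_widget".
resp = requests.post(
    f"{SR_URL}/subjects/my_new_widget-value/versions",
    auth=SR_AUTH,
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    json={"schemaType": "AVRO", "schema": json.dumps(value_schema)},
)
resp.raise_for_status()
print("Registered schema id:", resp.json()["id"])
```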
Working with schema references¶
You can add a reference to another schema, using the wizard to help locate available schemas and versions.
The Reference name you provide must match the target schema, based on guidelines for the schema format you are using:
- In JSON Schema, the name is the value in the `$ref` field of the referenced schema.
- In Avro, the name is the full name of the referenced schema; this is the value in the `name` field of the referenced schema.
- In Protobuf, the name is the value in the `import` statement of the referenced schema.
First, locate the schema you want to reference, and get the reference name for it.
Add a schema reference to the current schema in the editor
- Click Add reference.
- Provide a Reference name per the rules described above.
- Select the schema from the Subject list.
- Select the Version of the schema you want to use.
- Click Validate to check if the reference will pass.
- Click Save to save the reference.
For example, to create a reference to the schema for the `employees` topic (`Employee`) from the `widget` schema, you can configure a reference to `Employee` as shown.
To learn more, see Schema References in the Confluent Platform documentation.
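As a rough illustration of the same idea over the REST API, the sketch below registers a hypothetical `widget-value` schema that references an `Employee` record already registered under the `employees-value` subject (assumed here to have no namespace, so its full name is simply `Employee`). The endpoint and credentials are placeholders.

```python
import json
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials
HEADERS = {"Content-Type": "application/vnd.schemaregistry.v1+json"}

# A widget value schema that reuses the Employee record registered under the
# "employees-value" subject. For Avro, the reference "name" must be the full
# name of the referenced schema.
widget_schema = {
    "type": "record",
    "name": "value_widgets",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "assembled_by", "type": "Employee"},
    ],
}

payload = {
    "schemaType": "AVRO",
    "schema": json.dumps(widget_schema),
    "references": [
        {"name": "Employee", "subject": "employees-value", "version": 1}
    ],
}

resp = requests.post(
    f"{SR_URL}/subjects/widget-value/versions",
    auth=SR_AUTH, headers=HEADERS, json=payload,
)
resp.raise_for_status()
print("Registered widget schema id:", resp.json()["id"])
```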
View, edit, or delete schema references for a topic
Existing schema references show up on editable versions of the schema where they are configured.
Navigate to a topic; for example, the `widget-value` schema associated with the `widget` topic in the previous example.

Click into the editor as if to edit the schema.
If there are references to other Schemas configured in this schema, they will display in the Schema references list below the editor.
From this view, you can also add more references to this schema, modify existing references, or delete them.
Create a topic key schema¶
Click the Key option. You are prompted to set a message key schema.
Click Set a schema.
Choose the Avro format, delete the sample content, and enter a simple key schema, such as a string type for a UUID.
Enter the schema into the editor and click Save.
Here is an example of a schema appropriate for a key value.

{
  "namespace": "io.confluent.examples.clients.basicavro",
  "name": "key_widget",
  "type": "string"
}
Best Practices and Pitfalls for Key Values¶
Kafka messages are key-value pairs. Message keys and message values can be serialized independently. For example, the value may be an Avro `record`, while the key may be a primitive (`string`, `integer`, and so forth).
Typically message keys, if used, are primitives. How you set the key is up to you
and the requirements of your implementation.
As a best practice, keep key value schema complexity to a minimum. Use either a simple, non-serialized data type such as a string UUID or long ID, or an Avro record that does not use maps or arrays as fields, as shown in the example below. Do not use Protobuf messages and JSON objects for key values. Avro does not guarantee deterministic serialization for maps or arrays, and Protobuf and JSON schema formats do not guarantee deterministic serialization for any object. Using these formats for key values will break topic partitioning. To learn more, see Partitioning gotchas in the Confluent Community Forum.
For detailed examples of key and value schemas, see the discussion under Schema Formats, Serializers, and Deserializers in the Schema Registry documentation.
Derive a schema from messages¶
As an alternative to manually creating a schema, you can generate a new schema from a given set of messages. To learn more, see the setup and examples provided in schema-registry:derive-schema in the Schema Registry Maven Plugin documentation.
Editing schemas¶
Edit an existing schema for a topic.
From the navigation menu, click Topics, then click a topic to select it.
Click the Schema tab.
Select the Key or Value option for the schema.
The tree view is shown by default.
Click Evolve Schema.
Make the changes in the schema editor.
For example, you could edit the previous schema by adding a new field called `region`.

{
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "region", "type": "string", "default": "" }
  ],
  "name": "value_widgets",
  "type": "record"
}
In edit mode, you have options to:
- Validate the schema for syntax and structure before you save it.
- View, add, update, or delete schema references.
Tip
When the compatibility mode is set to Backward Compatibility, you must provide a default for the new field. This ensures that consumer applications can read both older messages written to the Version 1 schema (with only a `name` field) and new messages constructed per the Version 2 schema (with `name` and `region` fields). For messages that match the Version 1 schema and only have values for `name`, `region` is left empty. To learn more, see Passing Compatibility Checks in the Confluent Cloud Schema Registry Tutorial.

Click Save.
If the schema update is valid and compatible with its prior versions (assuming a backward-compatible mode), the schema is updated and the version count is incremented. You can compare the different versions of a schema.
If the schema update is invalid or incompatible with an earlier schema version, parse errors are highlighted in the editor. If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.
For example, if you add a new field but do not include a default value as described in the previous step, you will get an incompatibility error. You can fix this by adding a default value for “region”.
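Before saving an evolved schema, you can also ask Schema Registry directly whether the change passes the compatibility check, using the `/compatibility` REST endpoint. Below is a minimal Python sketch for the Version 2 widget schema; the endpoint, credentials, and subject name are placeholder assumptions.

```python
import json
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials
HEADERS = {"Content-Type": "application/vnd.schemaregistry.v1+json"}

# Version 2 of the widget value schema, adding a "region" field with a default
# so that it stays backward compatible.
evolved = {
    "type": "record",
    "name": "value_widgets",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "region", "type": "string", "default": ""},
    ],
}

# Ask Schema Registry whether the new schema is compatible with the latest
# registered version before attempting to register it.
resp = requests.post(
    f"{SR_URL}/compatibility/subjects/widget-value/versions/latest",
    auth=SR_AUTH, headers=HEADERS,
    json={"schemaType": "AVRO", "schema": json.dumps(evolved)},
)
resp.raise_for_status()
print("Compatible:", resp.json()["is_compatible"])
```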
Comparing schema versions¶
Compare versions of a schema to view its evolutionary differences.
From the navigation menu, click Topics, then click a topic to select it.
Click the Schema tab.
Select the Key or Value option for the schema. (The schema Value is displayed by default.)
Click Compare version.
The current version number of the schema is indicated on the version menu.
Select the Turn on version diff check box.
Select the versions to compare from each version menu. The differences are highlighted for comparison.
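Outside the Console, you can produce a similar comparison by fetching two registered versions over the REST API and diffing them locally. The following sketch uses Python's `difflib`; the subject name, endpoint, and credentials are placeholder assumptions.

```python
import difflib
import json
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials

def get_version(subject: str, version) -> str:
    """Fetch one registered schema version and pretty-print it for diffing."""
    resp = requests.get(f"{SR_URL}/subjects/{subject}/versions/{version}", auth=SR_AUTH)
    resp.raise_for_status()
    return json.dumps(json.loads(resp.json()["schema"]), indent=2)

v1 = get_version("widget-value", 1)
v2 = get_version("widget-value", "latest")

# Print a unified diff of the two schema versions.
print("\n".join(difflib.unified_diff(
    v1.splitlines(), v2.splitlines(),
    fromfile="version 1", tofile="latest", lineterm="")))
```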
Changing subject level (per topic) compatibility mode of a schema¶
The default compatibility mode is Backward. The mode can be changed for the schema of any topic if necessary.
Caution
If you change the compatibility mode of an existing schema already in production use, be aware of any possible breaking changes to your applications.
This section describes how to change the compatibility mode at the subject level. You can also set compatibility globally for all schemas in an environment. However, the subject-level compatibility settings described below override those global settings.
Select an environment.
Select a cluster.
From the navigation menu, click Topics, then click a topic to select it.
Click the Schema tab for the topic.
Select the Key or Value option for the schema.
Click the ellipses (3 dots) on the upper right to get the menu, then select Compatibility settings.
The Compatibility settings are displayed.
Select a mode option:
- Backward (Confluent Schema Registry default)
- Transitive backward
- Forward
- Transitive forward
- Full
- Transitive full
- None (not recommended)
Descriptions indicate the compatibility behavior for each option. For more information, including the changes allowed for each option, see Schema Evolution and Compatibility.
Click Save.
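The same subject-level override can be applied with the Schema Registry `/config/<subject>` REST endpoint. A minimal sketch follows, assuming a placeholder endpoint and credentials and the hypothetical `widget-value` subject.

```python
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials
HEADERS = {"Content-Type": "application/vnd.schemaregistry.v1+json"}

# Override the subject-level compatibility mode for "widget-value".
# Valid values include BACKWARD, BACKWARD_TRANSITIVE, FORWARD,
# FORWARD_TRANSITIVE, FULL, FULL_TRANSITIVE, and NONE.
resp = requests.put(
    f"{SR_URL}/config/widget-value",
    auth=SR_AUTH, headers=HEADERS,
    json={"compatibility": "FULL"},
)
resp.raise_for_status()
print(resp.json())

# Read the setting back to confirm the subject-level override.
print(requests.get(f"{SR_URL}/config/widget-value", auth=SR_AUTH).json())
```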
Searching for schemas and fields¶
Confluent Cloud offers global search across environments and clusters for various entity types now including schemas and related metadata. To learn more, see Searching Data, Schemas, and Topics in Stream Catalog.
Tagging schemas and fields¶
Confluent Cloud provides the ability to tag schema versions and fields within schemas as a means of organizing and cataloging data based on both custom and commonly used tag names. To learn about tagging, see Tagging Data and Schemas in Data Discovery.
Downloading a schema from Confluent Cloud¶
From the navigation menu, click Topics, then click a topic to select it.
Click the Schema tab.
Select the Key or Value option for the schema.
Click the ellipses (3 dots) on the upper right to get the menu, then select Download.
A schema JSON file for the topic is downloaded into your Downloads directory.
For example, if you download the version 1 schema for the employees topic from the Quick Start, you get a file called `schema-employees-value-v1.avsc` with the following contents.

{
  "fields": [
    { "name": "Name", "type": "string" },
    { "name": "Age", "type": "int" }
  ],
  "name": "Employee",
  "namespace": "Example",
  "type": "record"
}
Tip
The file extension indicates the schema format. For an Avro schema, the file extension is `.avsc`; for a Protobuf schema, `.proto`; and for JSON Schema, `.json`.
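You can achieve the same result programmatically by requesting the raw schema text from the REST API and writing it to a file with the appropriate extension. A minimal sketch, with a placeholder endpoint and credentials:

```python
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials

subject, version = "employees-value", 1

# GET .../schema returns only the schema text, without registration metadata.
resp = requests.get(
    f"{SR_URL}/subjects/{subject}/versions/{version}/schema", auth=SR_AUTH
)
resp.raise_for_status()

# Use the .avsc extension for an Avro schema (.proto or .json for the other formats).
with open(f"schema-{subject}-v{version}.avsc", "w") as f:
    f.write(resp.text)
```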
Deleting a schema from Confluent Cloud¶
From the navigation menu, click Topics, then click a topic to select it.
Click the Schema tab.
Select the Key or Value option for the schema.
Click the ellipses (3 dots) on the upper right to get the menu, then select Delete.
On the dialog, select whether to delete only a particular version of the schema or the entire subject (all versions).
Select Delete to carry out the action.
To learn more about deleting schemas, see Schema Deletion Guidelines.
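For scripted cleanup, the equivalent REST calls are sketched below. Soft deletes remove the version or subject from normal listings; a hard (permanent) delete requires a prior soft delete. The endpoint, credentials, and `widget-value` subject are placeholders.

```python
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials

# Soft-delete a single version of a subject.
requests.delete(
    f"{SR_URL}/subjects/widget-value/versions/1", auth=SR_AUTH
).raise_for_status()

# Soft-delete the entire subject (all versions).
requests.delete(f"{SR_URL}/subjects/widget-value", auth=SR_AUTH).raise_for_status()

# Hard (permanent) delete is only allowed after a soft delete; use with caution.
requests.delete(
    f"{SR_URL}/subjects/widget-value", params={"permanent": "true"}, auth=SR_AUTH
).raise_for_status()
```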
Identifying and deleting unused schemas on Confluent Cloud¶
You can use the schema deletion tool to find and delete schemas in the Confluent Cloud Schema Registry.
The README provides full instructions on how to build and use the schema deletion tool independently, or as a Confluent CLI plugin.
The basic workflow is:
- You provide subject names, which are based on `TopicNameStrategy`, to analyze and detect unused schemas.
- The tool then identifies topics using the `TopicNameStrategy` and analyzes messages to identify unused schemas in those topics.

For example, if you select a subject named `stocks-value`, the tool analyzes messages from the topic `stocks` and identifies unused schemas (if any) in the subject `stocks-value`.
Any schema IDs not present in the topic’s messages are identified as unused schemas, and listed as candidates for deletion.
You can then specify the schemas you want to delete.
To learn more about subject naming strategies, see Subject Name Strategy.
Caution
This tool provides you with the ability to hard delete schemas in the Schema Registry, so use this option with caution.
Recovering a soft-deleted schema¶
You can recover a soft-deleted schema using the Schema Registry `GET /subjects` and `POST /subjects` APIs as follows:

- Retrieve the schema with `GET /subjects?deleted=true`, as described in the GET /subjects API description. (A usage example is shown in List all subjects.)
- Use `POST /subjects/(string: subject)/versions` to re-register the schema.
To learn more, see Schema Deletion Guidelines, Schema Registry API Reference, and Schema Registry API Usage Examples.
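Put together, a recovery script might look like the following Python sketch. It assumes a placeholder endpoint, credentials, and subject, and uses the `deleted=true` query parameter to read the soft-deleted schema before re-registering it.

```python
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials
HEADERS = {"Content-Type": "application/vnd.schemaregistry.v1+json"}
subject = "widget-value"                                      # hypothetical subject

# 1. List subjects, including soft-deleted ones, to confirm the subject still exists.
deleted_subjects = requests.get(
    f"{SR_URL}/subjects", params={"deleted": "true"}, auth=SR_AUTH
).json()
assert subject in deleted_subjects

# 2. Fetch the soft-deleted schema text for the version you want to restore.
old = requests.get(
    f"{SR_URL}/subjects/{subject}/versions/latest",
    params={"deleted": "true"}, auth=SR_AUTH,
).json()

# 3. Re-register the same schema under the subject to recover it.
resp = requests.post(
    f"{SR_URL}/subjects/{subject}/versions",
    auth=SR_AUTH, headers=HEADERS,
    json={"schemaType": old.get("schemaType", "AVRO"), "schema": old["schema"]},
)
resp.raise_for_status()
print("Recovered schema id:", resp.json()["id"])
```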
Managing schemas for a Confluent Cloud environment¶
Schema Registry itself sits at the environment level and serves all clusters in an environment; therefore, several schema-related tasks are managed through the registry at this level.
To view and manage Schema Registry for a Confluent Cloud environment:
Select an environment from the Home page. (An environment list is available from the top right menu.)
Click the Schema Registry tab.
See Choose a Stream Governance package and enable Schema Registry for Confluent Cloud and Stream Governance Packages, Features, and Limits to learn about Stream Governance package options.
See Configure and Manage Schemas for an Environment to learn how to perform environment-level schema tasks.
Access Control (RBAC) for Confluent Cloud Schema Registry¶
Note
A staged rollout of RBAC for Schema Registry is in progress in early December 2022. This feature may not be immediately available to all customers.
Role-Based Access Control (RBAC) enables administrators to set up and manage user access to Schema Registry subjects and topics. This allows multiple users to collaborate, with different access levels on various resources.
ResourceOwner privileges on Schema Registry are automatically granted to all user and service accounts that have existing API keys for Schema Registry clusters or existing CloudClusterAdmin privileges on any cluster in the same environment as Schema Registry.
The following table describes how RBAC roles map to Schema Registry resources. For details on how to manage RBAC for these resources, see Manage RBAC using Confluent Cloud Console and Manage RBAC using the Confluent CLI. For more schema related RBAC information, see also Access Control (RBAC) for Stream Lineage and Access Control (RBAC) for Schema Linking.
Role | Scope | Read subject | Write subject | Delete subject | Read subject compatibility | Write subject compatibility | Grant permissions |
---|---|---|---|---|---|---|---|
OrganizationAdmin | Organization | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
EnvironmentAdmin | Environment | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
CloudClusterAdmin | Cluster | ||||||
Operator | Organization, Environment, Cluster | ||||||
MetricsViewer | Organization, Environment, Cluster | ||||||
ResourceOwner | Schema Subject | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
DeveloperManage | Schema Subject | ✔ | ✔ | ||||
DeveloperRead | Schema Subject | ✔ | ✔ | ||||
DeveloperWrite | Schema Subject | ✔ | ✔ | ✔ | |||
DataDiscovery | Environment | ✔ | ✔ | ||||
DataSteward | Environment | ✔ | ✔ | ✔ | ✔ | ✔ |
Table Legend:
- ✔ = Yes
- Blank space = No
Tip
“Global compatibility” does not apply to roles. To grant a user permission to manage global compatibility, grant the `DeveloperManage` role on a subject resource named `__GLOBAL`.
Supported features and limits for Confluent Cloud Schema Registry¶
A single Schema Registry is available per Environment.
Access Control to Schema Registry is based on API key and secret.
Your VPC must be able to communicate with the Confluent Cloud Schema Registry public internet endpoint. For more information, see Use Confluent Cloud Schema Registry in a VPC-peered environment.
Available on Amazon Web Services (AWS), Azure (Microsoft Azure), and GCP (Google Cloud Platform) for cloud provider geographies located in the US, Europe, and APAC. For each cloud provider, geographies are mapped under the hood to specific regions, as described in Choose a Stream Governance package and enable Schema Registry for Confluent Cloud.
High availability (HA) is achieved by having multiple nodes within a cluster always in running state, with each node running in a different availability zone (AZ).
The rate limit on API requests is 25 requests per second for each API key.
Note
Requests are identified using an API key that points to a tenant (LSRC). Requests from different API keys belonging to the same tenant count against the same limit for that tenant. So, with many API keys on the same LSRC (Schema Registry logical cluster ID), you still have a single limit of 25 requests per second shared across all keys.
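If your clients share one Schema Registry tenant, it is worth handling HTTP 429 responses with client-side backoff rather than failing outright. The following is an illustrative Python sketch with a placeholder endpoint and credentials; the retry count and delays are arbitrary.

```python
import time
import requests

SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"   # hypothetical endpoint
SR_AUTH = ("SR_API_KEY", "SR_API_SECRET")                     # hypothetical credentials

def get_with_backoff(path: str, retries: int = 5) -> requests.Response:
    """Retry a Schema Registry GET request with exponential backoff when the
    per-tenant rate limit (HTTP 429) is hit."""
    delay = 1.0
    for _ in range(retries):
        resp = requests.get(f"{SR_URL}{path}", auth=SR_AUTH)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        time.sleep(delay)
        delay *= 2   # back off: 1s, 2s, 4s, ...
    raise RuntimeError(f"Rate-limited after {retries} attempts: {path}")

print(get_with_backoff("/subjects").json())
```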
Suggested Reading¶
- Using Broker-Side Schema Validation
- Schema Linking on Confluent Cloud
- Stream Governance on Confluent Cloud
- Confluent Cloud Schemas Quick Start
- Confluent Cloud Schema Registry Tutorial
- Quick Start for Confluent Cloud
- Manage Topics in Confluent Cloud
- Schema Management Overview
- Formats, Serializers, and Deserializers
- Schema References