Manage Schemas in Confluent Cloud

Looking for Confluent Platform Schema Management docs? You are currently viewing Confluent Cloud documentation. If you are looking for Confluent Platform docs, check out Schema Management on Confluent Platform.

Tip

Try out the embedded Confluent Cloud interactive tutorials! Want to jump right in? Sign up or sign in to Confluent Cloud and try the guided workflows directly in the Cloud Console.

Schema Management is fully supported on Confluent Cloud with the per-environment, hosted Schema Registry, and is a key element of Stream Governance on Confluent Cloud.

View a schema

View the schema details for a specific topic.

Search for schemas

You can also find and view schemas by searching for them. Searches are global; that is, they span across environments and clusters.

  1. Start typing the name of a schema subject, data record, or data field name into the search bar at the top. You will get results as you type, including for other entities like topics.

    ../_images/cloud-02a-search-schema.png
  2. Hit enter to select an entity like a schema.

    ../_images/cloud-02b-search-schema.png

To learn more, see Searching Data, Schemas, and Topics.

Find schemas from the Environment view

  1. Navigate to an environment you want to work with, and click to select it as the current environment.

    ../_images/cloud-02c-view-manage-schemas.png
  2. Click Schemas on the right panel to get a list of all schemas in the environment.

    ../_images/cloud-02cc-view-manage-schemas.png
  3. Browse the list to find the schema you want to view.

    ../_images/cloud-02d-view-schemas-list.png
  4. Click a schema in the list to view it.

    By default, the schema is shown in tree view. (To learn more, see Tree view and code view.)

    ../_images/cloud-02e-view-schemas-list.png

Tree view and code view

Two different types of views are available for schemas:

  • tree view
  • editable code view

To switch between the views, click the buttons to the left of the schema level search box:

../_images/cloud-sr-schema-tree-toggle-icon-captions.png

By default, schemas are displayed in a tree view, which allows you to understand the structure of the schema and navigate the hierarchy of elements and sub-elements.

../_images/cloud-sr-schema-tree-view.png

In the tree view you can:

  • Use the arrows to the left of an element to expand it and view sub-elements.
  • Apply and manage available tags as described in Tagging Data and Schemas.

In edit mode (the “code view”), you can create and edit schemas as described in the sections below.

../_images/cloud-sr-schema-code-view.png

Create a topic schema

Create key and value schemas. Value schemas are typically created more frequently than key schemas.

Best practices:

  • Provide default values for fields, where pertinent, to facilitate backward compatibility.
  • Document at least the more obscure fields so that the schema is readable by humans.

Tip

You can also create schemas from the unified Confluent CLI, as described in the Create a Schema section in the Quick Start. A handy commands reference is here.

Create a topic value schema

  1. From the navigation menu, click Topics, then click a topic to select it (or create a new one).

  2. Click the Schema tab.

    ../_images/cloud-03-set-msg-value-schema.png
  3. Click Set a schema. The Schema editor appears.

    ../_images/cloud-04-schema-value-editor.png
  4. Select a schema type: JSON, Avro, or Protobuf. (The default is Avro.)

  5. The basic structure of a schema appears prepopulated in the editor as a starting point. Enter the schema in the editor:

    • name: Enter a name for the schema if you do not want to accept the default, which is determined by the subject name strategy. The default is schema_type_topic_name. Required.

    • type: Either record, enum, union, array, map, or fixed. (The type record is specified at the schema’s top level and can include multiple fields of different data types.) Required.

    • namespace: Fully-qualified name to prevent schema naming conflicts. String that qualifies the schema name. Optional but recommended.

    • fields: JSON array listing one or more fields for a record. Required.

      Each field can have the following attributes:

      • name: Name of the field. Required.
      • type: Data type for the field. Required.
      • doc: Field metadata. Optional but recommended.
      • default: Default value for a field. Optional but recommended.
      • order: Sorting order for a field. Valid values are ascending, descending, or ignore. Default: ascending. Optional.
      • aliases: Alternative names for a field. Optional.

    For example, you could add the following simple schema.

    {
      "type": "record",
      "name": "value_my_new_widget",
      "fields": [
        {
          "name": "name",
          "type": "string"
        }
      ]
    }
    

    This will display in Confluent Cloud as shown below.

    ../_images/cloud-05-entered-schema.png

    In edit mode, you have options to:

    • Validate the schema for syntax and structure before you create it.
    • Add schema references with a guided wizard.
  6. Click Create.

    • If the entered schema is valid, it is saved and a Schema updated message is briefly displayed in the banner area. The saved schema is shown in tree view form.

      ../_images/cloud-06-schema-updated.png
    • If the entered schema is invalid, parse errors are highlighted in the editor (as in this example where a curly bracket was left off). If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.

      ../_images/cloud-schema-invalid-avro-warning-banner.png
      ../_images/cloud-07-schema-invalid-avro.png

If needed, repeat the procedure for the topic key schema.
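
As an alternative to the Console editor, you can register the same value schema programmatically. The following is a minimal sketch using the Python confluent-kafka client; the Schema Registry endpoint, the API key and secret, and the subject name my_new_widget-value (assuming the default TopicNameStrategy for a topic named my_new_widget) are placeholders for illustration.

    # A minimal sketch (not the only way): register a value schema with the
    # Python confluent-kafka client. Endpoint, credentials, and subject are placeholders.
    from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

    client = SchemaRegistryClient({
        "url": "https://psrc-xxxxx.us-east-2.aws.confluent.cloud",  # Schema Registry endpoint
        "basic.auth.user.info": "<SR_API_KEY>:<SR_API_SECRET>",
    })

    value_schema_str = """
    {
      "type": "record",
      "name": "value_my_new_widget",
      "fields": [
        {"name": "name", "type": "string"}
      ]
    }
    """

    # With the default TopicNameStrategy, the value subject for topic
    # "my_new_widget" is "my_new_widget-value".
    schema_id = client.register_schema("my_new_widget-value", Schema(value_schema_str, "AVRO"))
    print(f"Registered schema with ID {schema_id}")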

Working with schema references

You can add a reference to another schema, using the wizard to help locate available schemas and versions.

../_images/cloud-05a-schema-references.png

The Reference name you provide must match the target schema, based on guidelines for the schema format you are using:

  • In JSON Schema, the name is the value of the $ref field for the referenced schema.
  • In Avro, the name is the full name of the referenced schema; this is the value in the name field of the referenced schema.
  • In Protobuf, the name is the value used in the import statement for the referenced schema.

First, locate the schema you want to reference, and get the reference name for it.

Add a schema reference to the current schema in the editor

  1. Click Add reference.
  2. Provide a Reference name per the rules described above.
  3. Select the schema from the Subject list.
  4. Select the Version of the schema you want to use.
  5. Click Validate to check if the reference will pass.
  6. Click Save to save the reference.

For example, to create a reference to the schema for the employees topic (Employee) from the widget schema, you can configure a reference to Employee as shown.

../_images/cloud-05b-schema-references.png
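
In API terms, a schema reference couples the reference name, the subject the referenced schema is registered under, and a version. The following minimal sketch registers a widget value schema with such a reference using the Python confluent-kafka client; the endpoint, credentials, subject names (widget-value and employees-value), and the Example.Employee full name are assumptions based on the examples on this page.

    # A minimal sketch: register a widget value schema that references the
    # Employee schema. Endpoint, credentials, subjects, and names are assumptions.
    from confluent_kafka.schema_registry import (
        Schema,
        SchemaReference,
        SchemaRegistryClient,
    )

    client = SchemaRegistryClient({
        "url": "https://psrc-xxxxx.us-east-2.aws.confluent.cloud",
        "basic.auth.user.info": "<SR_API_KEY>:<SR_API_SECRET>",
    })

    # The referencing schema uses the referenced schema's full name as a field type.
    widget_schema_str = """
    {
      "type": "record",
      "name": "value_widget",
      "fields": [
        {"name": "name", "type": "string"},
        {"name": "owner", "type": "Example.Employee"}
      ]
    }
    """

    # For Avro, the reference name is the full name of the referenced schema
    # ("Example.Employee"), registered here under subject "employees-value", version 1.
    employee_ref = SchemaReference("Example.Employee", "employees-value", 1)

    schema = Schema(widget_schema_str, "AVRO", [employee_ref])
    client.register_schema("widget-value", schema)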

To learn more, see Schema References in the Confluent Platform documentation.

View, edit, or delete schema references for a topic

Existing schema references show up in the editable view of the schema where they are configured.

  1. Navigate to a topic and open its schema; for example, the widget-value schema associated with the widget topic in the previous example.

  2. Click into the editor as if to edit the schema.

    If references to other schemas are configured in this schema, they are displayed in the Schema references list below the editor.

    From this view, you can also add more references, modify existing ones, or delete references.

Create a topic key schema

  1. Click the Key option. You are prompted to set a message key schema.

    ../_images/cloud-08-set-msg-key-schema.png
  2. Click Set a schema.

  3. Choose the Avro format and replace the prepopulated sample with a simple schema, such as a string type for a UUID key.

  4. Enter the schema into the editor and click Save.

    Here is an example of a schema appropriate for a key value.

    {
      "namespace": "io.confluent.examples.clients.basicavro",
      "name": "key_widget",
      "type": "string"
    }
    

Best Practices and Pitfalls for Key Values

Kafka messages are key-value pairs. Message keys and message values can be serialized independently. For example, the value may be using an Avro record, while the key may be a primitive (string, integer, and so forth). Typically message keys, if used, are primitives. How you set the key is up to you and the requirements of your implementation.

As a best practice, keep the complexity of key schemas to a minimum. Use either a simple, non-serialized data type such as a string UUID or long ID, or an Avro record that does not use maps or arrays as fields, as in the sketch below. Do not use Protobuf messages or JSON objects for key values. Avro does not guarantee deterministic serialization for maps or arrays, and the Protobuf and JSON Schema formats do not guarantee deterministic serialization for any object. Using these formats for key values will break topic partitioning. To learn more, see Partitioning gotchas in the Confluent Community Forum.
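
For illustration, here is a minimal sketch of such a record key, registered with the Python confluent-kafka client; the endpoint, credentials, subject name widgets-key, and field names are placeholders.

    # A minimal sketch: a key schema that is an Avro record with only primitive
    # fields (no maps or arrays). Endpoint, credentials, subject, and fields are placeholders.
    from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

    client = SchemaRegistryClient({
        "url": "https://psrc-xxxxx.us-east-2.aws.confluent.cloud",
        "basic.auth.user.info": "<SR_API_KEY>:<SR_API_SECRET>",
    })

    key_schema_str = """
    {
      "namespace": "io.confluent.examples.clients.basicavro",
      "type": "record",
      "name": "key_widget",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "store", "type": "long"}
      ]
    }
    """

    # With the default TopicNameStrategy, the key subject for topic "widgets" is "widgets-key".
    client.register_schema("widgets-key", Schema(key_schema_str, "AVRO"))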

For detailed examples of key and value schemas, see the discussion under Schema Formats, Serializers, and Deserializers in the Schema Registry documentation.

Derive a schema from messages

As an alternative to manually creating a schema, you can generate a new schema from a given set of messages. To learn more, see the setup and examples provided in schema-registry:derive-schema in the Schema Registry Maven Plugin documentation.

Editing schemas

Edit an existing schema for a topic.

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. The tree view is shown by default.

    ../_images/cloud-schema-tree-view.png
  5. Click Evolve Schema.

    ../_images/cloud-schema-code-view.png
  6. Make the changes in the schema editor.

    For example, you could edit the previous schema by adding a new field called region.

    {
      "fields": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "region",
          "type": "string",
          "default": ""
        }
      ],
      "name": "value_widgets",
      "type": "record"
    }
    

    In edit mode, you have the same options to validate the schema and add schema references as when creating a schema.

    Tip

    When the compatibility mode is set to Backward Compatibility, you must provide a default for the new field. This ensures that consumer applications can read both older messages written to the Version 1 schema (with only a name field) and new messages constructed per the Version 2 schema (with name and region fields). For messages that match the Version 1 schema and only have values for name, region is left empty. To learn more, see Passing Compatibility Checks in the Confluent Cloud Schema Registry Tutorial.

  7. Click Save.

    • If the schema update is valid and compatible with its prior versions (assuming a backward-compatible mode), the schema is updated and the version count is incremented. You can compare the different versions of a schema.

      ../_images/cloud-09-schema-version-updated.png
    • If the schema update is invalid or incompatible with an earlier schema version, parse errors are highlighted in the editor. If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.

      For example, if you add a new field but do not include a default value as described in the previous step, you will get an incompatibility error. You can fix this by adding a default value for “region”.

      ../_images/cloud-schema-invalid-avro-warning-banner.png
      ../_images/cloud-10-schema-incompatible.png
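
If you want to verify a proposed change before saving it, you can also check it against the latest registered version with the Schema Registry compatibility endpoint. Here is a minimal sketch using Python requests; the endpoint, credentials, and subject name are placeholders.

    # A minimal sketch: check an edited schema against the latest registered
    # version before saving it. Endpoint, credentials, and subject are placeholders.
    import json
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "my_new_widget-value"

    edited_schema = {
        "type": "record",
        "name": "value_my_new_widget",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "region", "type": "string", "default": ""},
        ],
    }

    resp = requests.post(
        f"{SR_URL}/compatibility/subjects/{SUBJECT}/versions/latest",
        auth=AUTH,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        json={"schema": json.dumps(edited_schema)},
    )
    resp.raise_for_status()
    print(resp.json())  # for example: {"is_compatible": True}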

Comparing schema versions

Compare versions of a schema to view its evolutionary differences.

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema. (The Value schema is displayed by default.)

    ../_images/cloud-11a-schema-version-newest.png
  4. Click Compare version.

    The current version number of the schema is indicated on the version menu.

    ../_images/cloud-11b-schema-version-history-choose.png
  5. Select the Turn on version diff check box.

  6. Select the versions to compare from each version menu. The differences are highlighted for comparison.

    ../_images/cloud-12-schema-compare.png
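
If you prefer to compare versions outside the Console, you can fetch individual versions from the Schema Registry REST API and diff them with a tool of your choice. A minimal sketch, with placeholder endpoint, credentials, and subject:

    # A minimal sketch: fetch two versions of a subject for an offline comparison.
    # Endpoint, credentials, and subject are placeholders.
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "my_new_widget-value"

    def get_schema(version):
        """Return the schema string for a given version (a number or "latest")."""
        resp = requests.get(f"{SR_URL}/subjects/{SUBJECT}/versions/{version}", auth=AUTH)
        resp.raise_for_status()
        return resp.json()["schema"]

    print("Version 1:", get_schema(1))
    print("Latest:   ", get_schema("latest"))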

Changing the subject-level (per-topic) compatibility mode of a schema

The default compatibility mode is Backward. The mode can be changed for the schema of any topic if necessary.

Caution

If you change the compatibility mode of an existing schema already in production use, be aware of any possible breaking changes to your applications.

This section describes how to change the compatibility mode at the subject level. You can also set compatibility globally for all schemas in an environment. However, the subject-level compatibility settings described below override those global settings.

  1. Select an environment.

  2. Select a cluster.

  3. From the navigation menu, click Topics, then click a topic to select it.

  4. Click the Schema tab for the topic.

  5. Select the Key or Value option for the schema.

  6. Click the ellipsis (three dots) on the upper right to open the menu, then select Compatibility settings.

    ../_images/cloud-13a-schema-compat-mode-menu.png

    The Compatibility settings are displayed.

    ../_images/cloud-13a-schema-compat-update.png
  7. Select a mode option:

    Descriptions indicate the compatibility behavior for each option. For more information, including the changes allowed for each option, see Schema Evolution and Compatibility.

  8. Click Save.
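
The same subject-level setting can also be changed through the Schema Registry REST API. Here is a minimal sketch using Python requests; the endpoint, credentials, subject name, and the FORWARD mode shown are placeholders, and the Caution above applies equally to changes made through the API.

    # A minimal sketch: set the compatibility mode for a single subject.
    # Endpoint, credentials, subject, and the FORWARD level are placeholders.
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "my_new_widget-value"

    resp = requests.put(
        f"{SR_URL}/config/{SUBJECT}",
        auth=AUTH,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        json={"compatibility": "FORWARD"},
    )
    resp.raise_for_status()
    print(resp.json())  # for example: {"compatibility": "FORWARD"}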

Searching for schemas and fields

Confluent Cloud offers global search across environments and clusters for various entity types now including schemas and related metadata. To learn more, see Searching Data, Schemas, and Topics in Stream Catalog.

Tagging schemas and fields

Confluent Cloud provides the ability to tag schema versions and fields within schemas as a means of organizing and cataloging data based on both custom and commonly used tag names. To learn about tagging, see Tagging Data and Schemas in Data Discovery.

Downloading a schema from Confluent Cloud

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. Click the ellipsis (three dots) on the upper right to open the menu, then select Download.

    ../_images/cloud-15-schema-download-menu.png

    A schema JSON file for the topic is downloaded into your Downloads directory.

    For example, if you download the version 1 schema for the employees topic from the Quick Start, you get a file called schema-employees-value-v1.avsc with the following contents.

    {
      "fields": [
        {
          "name": "Name",
          "type": "string"
        },
        {
          "name": "Age",
          "type": "int"
        }
      ],
      "name": "Employee",
      "namespace": "Example",
      "type": "record"
    }
    

Tip

The file extension indicates the schema format. For Avro schema the file extension is .avsc; for Protobuf schema, .proto; and for JSON Schema, .json.
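
You can also download a schema programmatically, for example as part of a build step, by reading it from the Schema Registry REST API. A minimal sketch, with placeholder endpoint, credentials, and subject:

    # A minimal sketch: download the latest version of a subject and save it to a
    # file with an extension that matches the schema format. Endpoint, credentials,
    # and subject are placeholders.
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "employees-value"

    resp = requests.get(f"{SR_URL}/subjects/{SUBJECT}/versions/latest", auth=AUTH)
    resp.raise_for_status()
    body = resp.json()

    # "schemaType" is omitted for Avro, so fall back to AVRO when it is missing.
    extensions = {"AVRO": ".avsc", "PROTOBUF": ".proto", "JSON": ".json"}
    extension = extensions[body.get("schemaType", "AVRO")]

    filename = f"schema-{SUBJECT}-v{body['version']}{extension}"
    with open(filename, "w") as f:
        f.write(body["schema"])
    print(f"Wrote {filename}")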

Deleting a schema from Confluent Cloud

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. Click the ellipsis (three dots) on the upper right to open the menu, then select Delete.

    ../_images/cloud-14-schema-delete-menu.png
  5. On the dialog, select whether to delete only a particular version of the schema or the entire subject (all versions).

    ../_images/cloud-14-schema-delete-dialog.png
  6. Select Delete to carry out the action.
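
Schemas can also be deleted through the Schema Registry REST API: a soft delete of the subject (or a single version), optionally followed by a permanent (hard) delete. Here is a minimal sketch using Python requests, with placeholder endpoint, credentials, and subject; treat hard deletes with the same caution as in the Console.

    # A minimal sketch: soft delete a subject, then optionally hard delete it.
    # Endpoint, credentials, and subject are placeholders; hard deletes cannot be undone.
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "my_new_widget-value"

    # Soft delete: the versions remain recoverable and can be listed with ?deleted=true.
    resp = requests.delete(f"{SR_URL}/subjects/{SUBJECT}", auth=AUTH)
    resp.raise_for_status()
    print("Soft-deleted versions:", resp.json())

    # Hard delete: only valid after a soft delete, and permanent. Uncomment with care.
    # requests.delete(f"{SR_URL}/subjects/{SUBJECT}", params={"permanent": "true"}, auth=AUTH)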

To learn more about deleting schemas, see Schema Deletion Guidelines.

Identifying and deleting unused schemas on Confluent Cloud

You can use the schema deletion tool to find and delete unused schemas in the Confluent Cloud Schema Registry.

The README provides full instructions on how to build and use the schema deletion tool independently, or as a Confluent CLI plugin.

The basic workflow is:

  1. You provide subject names, which are based on TopicNameStrategy, to analyze and detect unused schemas.
  2. The tool then identifies topics using the TopicNameStrategy and analyzes messages to identify unused schemas in those topics.

For example, if you select a subject named stocks-value, the tool analyzes messages from the topic stocks and identifies unused schemas (if any) in the subject stocks-value.

Any schema IDs not present in the topic’s messages are identified as unused schemas, and listed as candidates for deletion.

You can then enter the schemas you want to delete.

To learn more about subject naming strategies, see Subject Name Strategy.

Caution

This tool provides you with the ability to hard delete schemas in the Schema Registry, so use this option with caution.

Recovering a soft-deleted schema

You can recover a soft-deleted schema using the Schema Registry GET /subjects and POST /subjects APIs as follows:

  1. Retrieve the schema with GET /subjects?deleted=true, as described in the GET /subjects API description. (A usage example is shown in List all subjects.)
  2. Use POST /subjects/(string: subject)/versions to re-register the schema.
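
Here is a minimal sketch of these two calls using Python requests; the endpoint, credentials, subject name, and version number are placeholders.

    # A minimal sketch: recover a soft-deleted schema by re-registering it.
    # Endpoint, credentials, subject, and version are placeholders.
    import requests

    SR_URL = "https://psrc-xxxxx.us-east-2.aws.confluent.cloud"
    AUTH = ("<SR_API_KEY>", "<SR_API_SECRET>")
    SUBJECT = "my_new_widget-value"

    # 1. Soft-deleted subjects and versions only appear when deleted=true is passed.
    subjects = requests.get(f"{SR_URL}/subjects", params={"deleted": "true"}, auth=AUTH)
    subjects.raise_for_status()
    print("Subjects (including soft-deleted):", subjects.json())

    old = requests.get(
        f"{SR_URL}/subjects/{SUBJECT}/versions/1", params={"deleted": "true"}, auth=AUTH
    )
    old.raise_for_status()

    # 2. Re-register the schema under the same subject to recover it.
    # (For Protobuf or JSON Schema, also include "schemaType" and any "references".)
    resp = requests.post(
        f"{SR_URL}/subjects/{SUBJECT}/versions",
        auth=AUTH,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        json={"schema": old.json()["schema"]},
    )
    resp.raise_for_status()
    print("Re-registered with ID", resp.json()["id"])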

To learn more, see Schema Deletion Guidelines, Schema Registry API Reference, and Schema Registry API Usage Examples.

Managing schemas for a Confluent Cloud environment

Schema Registry itself sits at the environment level and serves all clusters in an environment; therefore, several schema-related tasks are managed through the registry at this level.

To view and manage Schema Registry for a Confluent Cloud environment:

  1. Select an environment from the Home page. (An environment list is available from the top right menu.)

  2. Click the Schema Registry tab.

    Screenshot of Schema Registry settings

See Choose a Stream Governance package and enable Schema Registry for Confluent Cloud and Stream Governance Packages, Features, and Limits to learn about Stream Governance package options.

See Configure and Manage Schemas for an Environment to learn how to perform environment-level schema tasks.

Access Control (RBAC) for Confluent Cloud Schema Registry

Note

A staged rollout of RBAC for Schema Registry is in progress in early December 2022. This feature may not be immediately available to all customers.

Role-Based Access Control (RBAC) enables administrators to set up and manage user access to Schema Registry subjects and topics. This allows multiple users to collaborate, each with different access levels to various resources.

ResourceOwner privileges on Schema Registry are automatically granted to all user and service accounts that have existing API keys for Schema Registry clusters or existing CloudClusterAdmin privileges on any cluster in the same environment as Schema Registry.

The following table describes how RBAC roles map to Schema Registry resources. For details on how to manage RBAC for these resources, see Manage RBAC using Confluent Cloud Console and Manage RBAC using the Confluent CLI. For more schema related RBAC information, see also Access Control (RBAC) for Stream Lineage and Access Control (RBAC) for Schema Linking.

Role Scope Read subject Write subject Delete subject Read subject compatibility Write subject compatibility Grant permissions
OrganizationAdmin Organization
EnvironmentAdmin Environment
CloudClusterAdmin Cluster            
Operator Organization, Environment, Cluster            
MetricsViewer Organization, Environment, Cluster            
ResourceOwner Schema Subject
DeveloperManage Schema Subject        
DeveloperRead Schema Subject        
DeveloperWrite Schema Subject      
DataDiscovery Environment        
DataSteward Environment  

Table Legend:

  • ✔ = Yes
  • Blank space = No

Tip

“Global compatibility” does not apply to roles. To grant permission to a user to manage global compatibility, grant the DeveloperManage role on a subject resource named __GLOBAL.

Supported features and limits for Confluent Cloud Schema Registry

  • A single Schema Registry is available per Environment.

  • Access Control to Schema Registry is based on API key and secret.

  • Your VPC must be able to communicate with the Confluent Cloud Schema Registry public internet endpoint. For more information, see Use Confluent Cloud Schema Registry in a VPC-peered environment.

  • Available on Amazon Web Services (AWS), Azure (Microsoft Azure), and GCP (Google Cloud Platform) for cloud provider geographies located in the US, Europe, and APAC. For each cloud provider, geographies are mapped under the hood to specific regions, as described in Choose a Stream Governance package and enable Schema Registry for Confluent Cloud.

  • High availability (HA) is achieved by having multiple nodes within a cluster always in running state, with each node running in a different availability zone (AZ).

  • The rate limit on API requests is 25 requests per second for each API key.

    Note

    Requests are identified using an API key that points to a tenant (LSRC). Requests from different API keys for the same tenant count against the same limit for that tenant. So, with many API keys on the same LSRC (Schema Registry logical cluster ID), you still have a single limit of 25 requests per second shared across all keys.