Manage Schemas in Confluent Cloud

Schema Management is fully supported on Confluent Cloud through the hosted, per-environment Schema Registry, and is a key element of data governance (see Data Governance Overview).

Working with schemas

You can manage schemas for topics in Confluent Cloud.

The following sections assume you have Schema Registry enabled. If you are just getting started with schemas, see Quick Start for Schema Management on Confluent Cloud to enable Schema Registry, and follow the examples there to learn more about working with schemas in Confluent Cloud.

View a schema

View the schema details for a specific topic.

  1. If you have more than one environment, select an environment.

    ../_images/cloud-0-environment-select.png

    Tip

    • If you have not explicitly created any new environments, the default environment is automatically selected and the starting page is the cluster list (next step).
    • To view environments, click the hamburger menu at the top left, and select Environments.
  2. From the Clusters page, click the cluster you want to work with.

    ../_images/cloud-0a-cluster-select.png
  3. From the navigation menu, click Topics, then click a topic to select it.

    ../_images/cloud-01-topic-select.png
  4. Click the Schema tab.

    The Value schema is displayed by default.

    ../_images/cloud-02-view-value-schema.png
  5. Click Key to view the key schema if present.

Create a topic schema

Create key and value schemas. Value schemas are typically created more frequently than key schemas.

Best practices:

  • Provide default values for fields, where pertinent to your schema, to facilitate backward compatibility.
  • Document at least the more obscure fields to keep the schema human-readable.

Tip

You can also create schemas from the Confluent Cloud CLI, as described in the Create a Schema section in the Quick Start.

Create a topic value schema

  1. From the navigation menu, click Topics, then click a topic to select it (or create a new one).

  2. Click the Schema tab.

    ../_images/cloud-03-set-msg-value-schema.png
  3. Click Set a schema. The Schema editor appears.

    ../_images/cloud-04-schema-value-editor.png
  4. The basic structure of a schema appears prepopulated in the editor as a starting point. Enter the schema in the editor:

    • name: Enter a name for the schema if you do not want to accept the default, which is determined by the subject name strategy. The default is schema_type_topic_name. Required.

    • type: Either record, enum, union, array, map, or fixed. (The type record is specified at the schema’s top level and can include multiple fields of different data types.) Required.

    • namespace: String that qualifies the schema name. The resulting fully-qualified name prevents schema naming conflicts. Optional but recommended.

    • fields: JSON array listing one or more fields for a record. Required.

      Each field can have the following attributes:

      • name: Name of the field. Required.
      • type: Data type for the field. Required.
      • doc: Field metadata. Optional but recommended.
      • default: Default value for a field. Optional but recommended.
      • order: Sorting order for a field. Valid values are ascending, descending, or ignore. The default is ascending. Optional.
      • aliases: Alternative names for a field. Optional.

    For example, you could add the following simple schema.

    {
      "type": "record",
      "name": "value_my_new_widget",
      "fields": [
        {
          "name": "name",
          "type": "string"
        }
      ]
    }
    

    This will display in Confluent Cloud as shown below.

    ../_images/cloud-05-entered-schema.png

    In edit mode, you have options to:

    • Validate the schema for syntax and structure before you create it.
    • Add schema references with a guided wizard.
  5. Click Create.

    • If the entered schema is valid, it is saved and a Schema updated message is briefly displayed in the banner area.

      ../_images/cloud-06-schema-updated.png
    • If the entered schema is invalid, parse errors are highlighted in the editor (as in this example where a curly bracket was left off). If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.

      ../_images/cloud-schema-invalid-avro-warning-banner.png
      ../_images/cloud-07-schema-invalid-avro.png

If applicable, repeat the procedure for the topic key schema.
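
The attribute rules listed in the procedure above can also be checked before pasting a schema into the editor. The following is a minimal Python sketch, not the validation that Confluent Cloud performs: it only checks that the text parses as JSON and that the attributes marked Required above are present. The function name check_record_schema is hypothetical.

```python
import json

# Attributes the procedure above marks as Required.
REQUIRED_RECORD_KEYS = ("type", "name", "fields")
REQUIRED_FIELD_KEYS = ("name", "type")


def check_record_schema(schema_text):
    """Return a list of problems found in an Avro record schema string."""
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as err:
        # Comparable to a parse error such as a missing curly bracket.
        return [f"parse error: {err.msg} (line {err.lineno})"]

    if not isinstance(schema, dict):
        return ["top level is not a JSON object"]

    problems = [
        f"missing required attribute: {key}"
        for key in REQUIRED_RECORD_KEYS
        if key not in schema
    ]

    fields = schema.get("fields", [])
    if not isinstance(fields, list):
        return problems + ["fields is not a JSON array"]

    for index, field in enumerate(fields):
        if not isinstance(field, dict):
            problems.append(f"field {index} is not a JSON object")
            continue
        for key in REQUIRED_FIELD_KEYS:
            if key not in field:
                problems.append(f"field {index} is missing: {key}")
    return problems


valid = """
{
  "type": "record",
  "name": "value_my_new_widget",
  "fields": [{"name": "name", "type": "string"}]
}
"""
print(check_record_schema(valid))
# Dropping the final curly bracket produces a parse error instead.
print(check_record_schema(valid.strip()[:-1]))
```

A check like this catches only structural slips. Compatibility with earlier versions is still decided by Schema Registry when you save the schema.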

Working with schema references

You can add a reference to another schema, using the wizard to help locate available schemas and versions.

../_images/cloud-05a-schema-references.png

The Reference name you provide must match the target schema, based on guidelines for the schema format you are using:

  • In JSON Schema, the name is the value of the $ref field that points to the referenced schema.
  • In Avro, the name is the value of the type field that uses the referenced schema.
  • In Protobuf, the name is the value of the import statement for the referenced schema.

First, locate the schema you want to reference, and get the reference name for it.

Add a schema reference to the current schema in the editor

  1. Click Add reference.
  2. Provide a Reference name per the rules described above.
  3. Select the schema from the Subject list.
  4. Select the Version of the schema you want to use.
  5. Click Validate to check if the reference will pass.
  6. Click Save to save the reference.

For example, to reference the schema for the employees topic (employees-value) from the widget schema, you can configure a reference to type record, as shown.

../_images/cloud-05b-schema-references.png

To learn more, see Schema References in the Confluent Platform documentation.
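
As an illustration of how the reference name, subject, and version fit together for Avro, the sketch below builds a registration payload in the shape used by the Schema Registry REST API (schema, schemaType, and references). It is a sketch under assumptions: the employee field and the helper names are hypothetical, and the type name Example.Employee assumes the employees-value schema declares namespace Example and name Employee, as in the Quick Start example. Nothing is sent over the network.

```python
import json

# Hypothetical widget schema whose "employee" field uses a type that is
# defined in another subject (employees-value).
widget_schema = {
    "type": "record",
    "name": "value_my_new_widget",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "employee", "type": "Example.Employee"},
    ],
}

# One entry per referenced schema: the reference name must equal the
# value used on the "type" field above.
references = [
    {"name": "Example.Employee", "subject": "employees-value", "version": 1}
]


def field_type_names(schema):
    """Collect the string type names used by the record's fields."""
    return {f["type"] for f in schema["fields"] if isinstance(f["type"], str)}


def build_registration_payload(schema, refs):
    """Build the request body for registering a schema with references."""
    unused = [r["name"] for r in refs if r["name"] not in field_type_names(schema)]
    if unused:
        raise ValueError(f"reference names not used as a type: {unused}")
    return {
        "schemaType": "AVRO",
        "schema": json.dumps(schema),  # the schema travels as a JSON string
        "references": refs,
    }


payload = build_registration_payload(widget_schema, references)
print(json.dumps(payload, indent=2))
```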

View, edit, or delete schema references for a topic

Existing schema references show up on editable versions of the schema where they are configured.

  1. Navigate to a topic; for example, the widget-value schema associated with the widget topic in the previous example.

  2. Click into the editor as if to edit the schema.

    If this schema has references to other schemas, they are displayed in the Schema references list below the editor.

    From this view, you can also add more references to this schema, and modify or delete existing references.

Create a topic key schema

  1. Click the Key option. You are prompted to set a message key schema.

    ../_images/cloud-08-set-msg-key-schema.png
  2. Click Set a schema.

  3. Enter the schema into the editor and click Save.

Tip

Kafka messages are key/value pairs. Message keys and message values can be serialized independently. For example, the value may be using an Avro record, while the key may be a primitive (string, integer, and so forth). Typically message keys, if used, are primitives, but they can be complex data types as well (for example, record or array). How you set the key is up to you and the requirements of your implementation. For detailed examples of key and value schemas, see the discussion under Schema Formats, Serializers, and Deserializers in the Schema Registry documentation.

Editing schemas

Edit an existing schema for a topic.

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. Click Edit schema.

  5. Make the changes in the schema editor.

    For example, you could edit the previous schema by adding a new field called region.

    {
      "fields": [
        {
          "name": "name",
          "type": "string"
        },
        {
          "name": "region",
          "type": "string",
          "default": ""
        }
      ],
      "name": "value_my_new_widget",
      "type": "record"
    }
    

    In edit mode, you have the same options as when creating a schema: you can validate the schema and add schema references.

    Tip

    When the compatibility mode is set to Backward Compatibility, you must provide a default for the new field. This ensures that consumer applications can read both older messages written to the Version 1 schema (with only a name field) and new messages constructed per the Version 2 schema (with name and region fields). For messages that match the Version 1 schema and only have values for name, region is left empty. To learn more, see Passing Compatibility Checks in the Confluent Cloud Schema Registry Tutorial.

  6. Click Save.

    • If the schema update is valid and compatible with its prior versions (assuming a backward-compatible mode), the schema is updated and the version count is incremented. You can compare the different versions of a schema.

      ../_images/cloud-09-schema-version-updated.png
    • If the schema update is invalid or incompatible with an earlier schema version, parse errors are highlighted in the editor. If parse errors aren’t auto-highlighted, click the See error messages link on the warning banner to enable them.

      For example, if you add a new field but do not include a default value as described in the previous step, you will get an incompatibility error. You can fix this by adding a default value for “region”.

      ../_images/cloud-schema-invalid-avro-warning-banner.png
      ../_images/cloud-10-schema-incompatible.png
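
To see why the default matters, the following Python sketch imitates, in a very simplified way, what a consumer using the Version 2 schema does with a message written under Version 1. It is not the Avro library's schema resolution, only an illustration of the default-filling rule. The function name read_with_schema and the sample record are hypothetical.

```python
# Field lists from the two schema versions discussed above.
V1_FIELDS = [{"name": "name", "type": "string"}]
V2_FIELDS = [
    {"name": "name", "type": "string"},
    {"name": "region", "type": "string", "default": ""},
]


def read_with_schema(record, reader_fields):
    """Project a decoded record onto the reader's fields, filling defaults."""
    result = {}
    for field in reader_fields:
        name = field["name"]
        if name in record:
            result[name] = record[name]
        elif "default" in field:
            # The writer did not know this field; use the reader's default.
            result[name] = field["default"]
        else:
            # Without a default, the old message cannot be read.
            raise ValueError(f"no value and no default for field {name!r}")
    return result


# A message written with Version 1 has only a name.
old_message = {"name": "widget-a"}
print(read_with_schema(old_message, V2_FIELDS))
```

If the default is removed from the region field in V2_FIELDS, the same call raises an error, which mirrors the incompatibility that the editor reports.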

Comparing schema versions

Compare versions of a schema to view its evolutionary differences.

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema. (The schema Value is displayed by default.)

    ../_images/cloud-11a-schema-version-newest.png
  4. Click Compare version.

    The current version number of the schema is indicated on the version menu.

    ../_images/cloud-11b-schema-version-history-choose.png
  5. Select the Turn on version diff check box.

  6. Select the versions to compare from each version menu. The differences are highlighted for comparison.

    ../_images/cloud-12-schema-compare.png
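
If you have downloaded two versions of a schema, a comparable diff can be produced locally. The sketch below uses Python's standard difflib module on two versions modeled on the widget examples. It is an illustration only and is not how the Confluent Cloud version diff is implemented. The function name diff_schema_versions is hypothetical.

```python
import difflib
import json

schema_v1 = {
    "type": "record",
    "name": "value_my_new_widget",
    "fields": [{"name": "name", "type": "string"}],
}
schema_v2 = {
    "type": "record",
    "name": "value_my_new_widget",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "region", "type": "string", "default": ""},
    ],
}


def diff_schema_versions(old, new, old_label="v1", new_label="v2"):
    """Return unified-diff lines between two schemas given as dicts."""
    # Sorting keys and fixing the indentation keeps the diff focused on
    # real changes instead of formatting differences.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_label, tofile=new_label, lineterm=""
        )
    )


for line in diff_schema_versions(schema_v1, schema_v2):
    print(line)
```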

Changing subject level (per topic) compatibility mode of a schema

The default compatibility mode is Backward. The mode can be changed for the schema of any topic if necessary.

Caution

If you change the compatibility mode of an existing schema already in production use, be aware of any possible breaking changes to your applications.

This section describes how to change the compatibility mode at the subject level. You can also set compatibility globally for all schemas in an environment. However, the subject-level compatibility settings described below override those global settings.

  1. Select an environment.

  2. Select a cluster.

  3. From the navigation menu, click Topics, then click a topic to select it.

  4. Click the Schema tab for the topic.

  5. Select the Key or Value option for the schema.

  6. Click the ellipsis (three dots) on the upper right to open the menu, then select Compatibility settings.

    ../_images/cloud-13a-schema-compat-mode-menu.png

    The Compatibility settings are displayed.

    ../_images/cloud-13a-schema-compat-update.png
  7. Select a mode option.

    Descriptions indicate the compatibility behavior for each option. For more information, including the changes allowed for each option, see Schema Evolution and Compatibility.

  8. Click Save.
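
The same subject-level setting can be changed through the Schema Registry REST API, which accepts a PUT to /config/<subject> with a compatibility value. The Python sketch below only builds the request and does not send it. The endpoint URL, API key, and secret are placeholders, the subject name assumes the default TopicNameStrategy, and the function name is hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Compatibility modes accepted by Schema Registry.
VALID_MODES = {
    "BACKWARD", "BACKWARD_TRANSITIVE",
    "FORWARD", "FORWARD_TRANSITIVE",
    "FULL", "FULL_TRANSITIVE",
    "NONE",
}


def build_compatibility_request(base_url, subject, mode, api_key, api_secret):
    """Build (but do not send) a request that sets subject-level compatibility."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown compatibility mode: {mode}")
    # Schema Registry API key and secret are passed as HTTP basic auth.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    url = f"{base_url.rstrip('/')}/config/{urllib.parse.quote(subject, safe='')}"
    return urllib.request.Request(
        url=url,
        data=json.dumps({"compatibility": mode}).encode(),
        method="PUT",
        headers={
            "Content-Type": "application/vnd.schemaregistry.v1+json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder endpoint and credentials; replace with your own values.
request = build_compatibility_request(
    "https://psrc-example.invalid", "players-maple-value", "FULL", "KEY", "SECRET"
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```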

Searching for schemas and fields

Confluent Cloud offers global search across environments and clusters for various entity types, now including schemas and related metadata. To learn more, see Data Discovery (Early Access).

Deleting a schema from Confluent Cloud

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. Click the ellipsis (three dots) on the upper right to open the menu, then select Delete.

    ../_images/cloud-14-schema-delete-menu.png
  5. On the dialog, select whether to delete only a particular version of the schema or the entire subject (all versions).

    ../_images/cloud-14-schema-delete-dialog.png
  6. Select Delete to carry out the action.

To learn more about deleting schemas, see Schema Deletion Guidelines.

Using Broker-Side Schema Validation on Confluent Cloud

Schema Validation enables the broker to verify that data produced to a Kafka topic is using a valid schema ID in Schema Registry that is registered according to the subject naming strategy. (See also, Schemas, Subjects, and Topics.)

Prerequisites

  • Schema Validation on Confluent Cloud is available only on dedicated clusters, through the hosted Schema Registry. Confluent Cloud brokers cannot use self-managed instances of Schema Registry, only the Confluent Cloud hosted Schema Registry. (Schema Validation is available for on-premises deployments through Confluent Enterprise.)
  • You must have a Schema Registry enabled for the environment in which you are using Schema Validation.
  • Schema Validation is bounded at the level of an environment. All dedicated clusters in the same environment share a Schema Registry. Clusters do not have visibility into schemas across different environments.

Schema Validation Configuration options on a topic

Schema Validation is set at the topic level with the following parameters.

Property                               Description
confluent.key.schema.validation        When set to true, enables schema ID validation on the message key. The default is false.
confluent.value.schema.validation      When set to true, enables schema ID validation on the message value. The default is false.
confluent.key.subject.name.strategy    Set the subject name strategy for the message key. The default is io.confluent.kafka.serializers.subject.TopicNameStrategy.
confluent.value.subject.name.strategy  Set the subject name strategy for the message value. The default is io.confluent.kafka.serializers.subject.TopicNameStrategy.

Tip

  • Value schema and key schema validation are independent of each other; you can enable either or both.
  • The subject naming strategy settings are tied to Schema Validation. They have no effect when Schema Validation is not enabled.

Enable Schema Validation from the Confluent Cloud CLI

You can enable Schema Validation on a topic when you create a topic or modify an existing topic.

The command syntax to enable Schema Validation is as follows:

ccloud kafka topic <create|update> <topic-name> --config confluent.<key|value>.schema.validation=true

For example, this command creates a topic called flights with schema validation enabled on the value schema:

ccloud kafka topic create flights --config confluent.value.schema.validation=true

With this configuration, if a message is produced to the topic flights that does not have a valid schema for the value of the message, an error is returned to the producer, and the message is discarded.

If a batch of messages is sent and at least one is invalid, then the entire batch is discarded.

If you do not specify a different subject naming strategy, io.confluent.kafka.serializers.subject.TopicNameStrategy is used by default. You can modify the naming strategies used for either or both the message key and message value schemas. For example, the following command sets the subject naming strategy on the topic flights to use io.confluent.kafka.serializers.subject.RecordNameStrategy.

ccloud kafka topic update flights --config confluent.value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy

The following naming strategies are accepted values for confluent.key.subject.name.strategy and confluent.value.subject.name.strategy.

Strategy                 Description
TopicNameStrategy        Derives subject name from topic name. (This is the default.)
RecordNameStrategy       Derives subject name from record name, and provides a way to group logically related events that may have different data structures under a subject.
TopicRecordNameStrategy  Derives the subject name from topic and record name, as a way to group logically related events that may have different data structures under a subject.

Note

The full class names for the above strategies consist of the strategy name prefixed by io.confluent.kafka.serializers.subject., as shown in the examples in this section.
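
The three strategies can be summarized as a small function. This Python sketch mirrors the subject names that the strategies produce (<topic>-key or <topic>-value, the fully-qualified record name, and <topic>-<record name>). It is an illustration, not the serializer implementation, and the record name Example.Flight is hypothetical.

```python
def subject_name(strategy, topic, record_name, is_key=False):
    """Return the subject that a schema is looked up under for a strategy."""
    if strategy == "TopicNameStrategy":
        # One subject per topic and per key/value role.
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy == "RecordNameStrategy":
        # Same subject regardless of the topic the record is produced to.
        return record_name
    if strategy == "TopicRecordNameStrategy":
        # Several record types can share a topic, each with its own subject.
        return f"{topic}-{record_name}"
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("TopicNameStrategy", "RecordNameStrategy", "TopicRecordNameStrategy"):
    print(strategy, "->", subject_name(strategy, "flights", "Example.Flight"))
```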

Enable Schema Validation on a topic from the Confluent Cloud web UI

To set Schema Validation on a topic from the Confluent Cloud web UI:

  1. Navigate to a topic.

  2. Click the Configuration tab.

  3. Click Edit Settings.

  4. Click Switch to expert mode.

  5. In Expert mode, change the settings for confluent.value.schema.validation and/or confluent.key.schema.validation from false to true.

    ../_images/cloud-sv-topic-enable.png

    Tip

    If you do not specify a different naming strategy, TopicNameStrategy is used by default.

    You can modify the naming strategies used for either or both the message key and message value schemas. These settings are also available in Expert mode on the selected topic. Set these now, if desired.

  6. Click Save changes.

To disable Schema Validation, set these same configuration options to false.

Schema Validation Demo

You can test Schema Validation by following along with this short demo.

  1. Create a test topic called players-maple either from the web UI or the Confluent Cloud CLI. Do not specify the Schema Validation setting, so that it defaults to false.

    Here is the command to use from the Confluent Cloud CLI:

    ccloud kafka topic create players-maple
    

    This creates a topic with no broker validation on records produced to the test topic, which is what you want for the first part of the demo.

  2. In a new command window for the producer (logged into Confluent Cloud and on the same environment and cluster), run this command to produce a serialized record (using the default string serializer) to the topic players-maple.

    ccloud kafka topic produce players-maple --parse-key=true --delimiter=,
    

    The command is successful because Schema Validation is currently disabled for this topic. If broker Schema Validation had been enabled for this topic, records produced by this command would be rejected.

    Type your first message at the producer prompt as follows:

    1,Pierre
    

    Keep this session of the producer running.

  3. Open a new command window for the consumer (logged into Confluent Cloud and on the same environment and cluster), and enter this command to read the messages:

    ccloud kafka topic consume players-maple --from-beginning --print-key=true
    

    The output of this command is 1    Pierre.

    Keep this session of the consumer running.

  4. Now, set Schema Validation for the topic players-maple to true.

    ccloud kafka topic update players-maple --config confluent.value.schema.validation=true
    

    Tip

    You can also update this setting on the Confluent Cloud web UI in expert mode for the configuration on the players-maple topic.

  5. Return to the producer session, and type a second message at the prompt.

    2,Frederik
    

    You will get an error because Schema Validation is enabled and the message you are sending does not contain a schema ID:

    Error: producer has detected an INVALID_RECORD error for topic players-maple

    If you subsequently disable Schema Validation (use the same command to set it to false), then type and resend the same or another similarly formatted message, the message will go through. (For example, produce 3,Ben.)

    The messages that were successfully produced also show in your web browser in Topics > players-maple > Messages. You may have to select a partition or jump to a timestamp to see messages sent earlier.

What Schema Validation checks and how it works

When Schema Validation is enabled on a topic, it checks for the following on each message:

  • The message produced to the topic has an associated schema. (The message must have an associated schema ID, which indicates it has a schema.)
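  • The schema must match the topic. (The schema ID must be registered in Schema Registry under the subject that the configured subject naming strategy derives for the topic.)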

The demo above is a straightforward way to show that Schema Validation is working, using the Confluent Cloud CLI.

In practice, you would typically send an Avro object, Protobuf object, or Jackson-serializable POJO from a client application. In this case, the client serializer derives the schema from the object and sends it to Schema Registry, which checks whether the schema exists in the subject. If it does, the schema ID of that version is used. If it does not, Schema Registry returns an error if the client has auto schema registration set to false, or registers the schema if the client has auto schema registration set to true.

Auto schema registration is set in the client application. By default, client applications automatically register new schemas. You can disable auto schema registration on your clients, which is typically recommended in production environments. To learn more, see Disabling Auto Schema Registration in the Confluent Platform documentation.
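
The schema ID mentioned above travels inside each message. Confluent serializers prepend a magic byte (0) and a 4-byte big-endian schema ID to the serialized payload. The Python sketch below shows that framing and why a plain string such as Frederik carries no schema ID. It illustrates the wire format only, not the broker's validation logic, and the schema ID 100001 and the payload bytes are made up.

```python
import struct

MAGIC_BYTE = 0  # first byte of a message framed by a Confluent serializer


def frame(schema_id, payload):
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def extract_schema_id(message):
    """Return the schema ID from a framed message, or None if not framed."""
    if len(message) < 5 or message[0] != MAGIC_BYTE:
        return None
    return struct.unpack(">I", message[1:5])[0]


framed = frame(100001, b"\x0cPierre")   # hypothetical schema ID and payload
plain = "Frederik".encode("utf-8")      # what the CLI string producer sends

print(extract_schema_id(framed))  # the ID the broker would look up
print(extract_schema_id(plain))   # no schema ID in the message
```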

Monitoring metrics on Schema Validation

Confluent Cloud exposes the following metrics for monitoring Schema Validation from the Kafka brokers through the metrics API:

io.confluent.kafka.server/schema_validator/record_rate
    Number of messages undergoing validation per second
io.confluent.kafka.server/schema_validator/rejected_rate
    Number of messages that failed validation per second, for any reason
io.confluent.kafka.server/schema_validator/connection_error_rate
    Number of messages failed per second due to connection issues
io.confluent.kafka.server/schema_validator/client_error_rate
    Number of messages failed per second due to 4xx client issues
io.confluent.kafka.server/schema_validator/server_error_rate
    Number of messages failed per second due to 5xx server issues

Downloading a schema from Confluent Cloud

  1. From the navigation menu, click Topics, then click a topic to select it.

  2. Click the Schema tab.

  3. Select the Key or Value option for the schema.

  4. Click the ellipsis (three dots) on the upper right to open the menu, then select Download.

    ../_images/cloud-15-schema-download-menu.png

    A schema file for the topic is downloaded into your Downloads directory.

    For example, if you download the version 1 schema for the employees topic from the Quick Start, you get a file called schema-employees-value-v1.avsc with the following contents.

    {
      "fields": [
        {
          "name": "Name",
          "type": "string"
        },
        {
          "name": "Age",
          "type": "int"
        }
      ],
      "name": "Employee",
      "namespace": "Example",
      "type": "record"
    }
    

Tip

The file extension indicates the schema format. For Avro schemas, the file extension is .avsc; for Protobuf schemas, .proto; and for JSON Schema, .json.
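
For scripts that process downloaded schemas, the mapping between format and extension can be captured as follows. This is a minimal Python sketch: the extensions come from the tip above, while the naming pattern schema-<subject>-v<version> is an assumption inferred from the single example file name and may not hold in all cases.

```python
# Extensions per schema format, as described in the tip above.
EXTENSIONS = {"AVRO": "avsc", "PROTOBUF": "proto", "JSON": "json"}


def download_filename(subject, version, schema_format):
    """Guess the name of a downloaded schema file (pattern is an assumption)."""
    return f"schema-{subject}-v{version}.{EXTENSIONS[schema_format]}"


def format_from_filename(filename):
    """Infer the schema format from a file extension."""
    extension = filename.rsplit(".", 1)[-1]
    for schema_format, known in EXTENSIONS.items():
        if known == extension:
            return schema_format
    raise ValueError(f"unrecognized schema file extension: {extension}")


print(download_filename("employees-value", 1, "AVRO"))
print(format_from_filename("schema-employees-value-v1.avsc"))
```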

Managing schemas for a Confluent Cloud environment

Schema Registry itself sits at the environment level and serves all clusters in an environment. Therefore, several tasks related to schemas are managed through the registry at this level.

To view and manage Schema Registry for a Confluent Cloud environment:

  1. Select an environment from the Home page. (An environment list is available from the top right menu.)

  2. Click the Schema Registry tab.

    Screenshot of Schema Registry settings

To learn more, see Configure and Manage Schemas for an Environment in the Confluent Cloud Quick Start.

Supported features and limits for Confluent Cloud Schema Registry

  • A single Schema Registry is available per environment.
  • Access control to Schema Registry is based on API key and secret.
  • Each environment must have at least one Apache Kafka® cluster to enable Schema Registry.
  • Your VPC must be able to communicate with the Confluent Cloud Schema Registry public internet endpoint. For more information, see Use Confluent Cloud Schema Registry in a VPC peered environment.
  • Available on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) for cloud provider geographies located in the US, Europe, and APAC. For each cloud provider, geographies are mapped to specific regions, as described in Enable Schema Registry for Confluent Cloud.
  • The maximum number of schemas allowed is 1,000.
  • The rate limit on API requests is 25 requests per second for each API key.
  • High availability (HA) is achieved by having multiple nodes within a cluster always in running state, with each node running in a different availability zone (AZ).
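
Client code that calls the Schema Registry API in a loop may want to stay under the per-key rate limit listed above. The following Python sketch spaces calls evenly on the client side. It is one possible approach under the assumption that all requests for a key go through a single throttle, and the class name RequestThrottle is hypothetical.

```python
import time


class RequestThrottle:
    """Space out calls so that at most max_per_second happen each second."""

    def __init__(self, max_per_second=25, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable for testing
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request may be sent; return the delay used."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


throttle = RequestThrottle(max_per_second=25)
for subject in ("employees-value", "players-maple-value"):
    throttle.wait()
    # Send the Schema Registry request for this subject here.
    print("request slot granted for", subject)
```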