Looking for Confluent Platform Schema Management docs? This page describes Schema Management on Confluent Cloud. If you are looking for Confluent Platform documentation, check out Schema Management on Confluent Platform.

Schema Linking on Confluent Cloud (Preview)

Important

This feature is available as a preview feature. A preview feature is a component of Confluent Cloud that is being introduced to gain early feedback from developers. This feature can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. Your comments, questions, and suggestions are encouraged and can be submitted to data-governance-preview@confluent.io.

Confluent Cloud Schema Registry now supports Schema Linking. The following sections explain the concepts, configurations, and tools you need to get started using Schema Linking in your Confluent Cloud deployments.

What is Schema Linking?

Schema Linking keeps schemas in sync across two Schema Registry clusters. Schema Linking can be used in conjunction with Cluster Linking to keep both schemas and topic data in sync across two Schema Registry and Kafka clusters.

Schema Registry introduces two new concepts to support Schema Linking:

  • Schema contexts - A schema context represents an independent scope in Schema Registry, and can be used to create any number of separate “sub-registries” within one Schema Registry cluster. Each schema context is an independent grouping of schema IDs and subject names, allowing the same schema ID in different contexts to represent completely different schemas. Any schema ID or subject name without an explicit context lives in the default context, denoted by a single dot (.). An explicit context starts with a dot and can contain any parts separated by additional dots, such as .mycontext.subcontext. Context names work like absolute Unix paths, but with dots instead of forward slashes (the default context is like the root Unix path). However, there is no relationship between two contexts that share a prefix.
  • Schema exporters - A schema exporter is a component that resides in Schema Registry for exporting schemas from one Schema Registry cluster to another. The lifecycle of a schema exporter is managed through APIs, which are used to create, pause, resume, and destroy a schema exporter. A schema exporter is like a “mini-connector” that can perform change data capture for schemas.

The Quick Start below shows you how to get started using schema exporters and contexts for Schema Linking.

For in-depth descriptions of these concepts, see Schema Contexts and Schema Exporters.

Quick Start

If you’d like to jump in and try out Schema Linking now, follow the steps below. At the end of the Quick Start, you’ll find deep dives on contexts, exporters, command options, and APIs, which may make more sense after you’ve experimented with some hands-on examples.

Get the latest version of the Confluent Cloud CLI

Update your Confluent Cloud CLI with ccloud update to be sure you have the latest version with the Schema Linking commands:

   New Features
   -------------
     - Support for passing Schema Registry API key and secret in produce/consume commands
     - New Schema linking commands under ccloud schema-registry exporter

  • A list of Confluent Cloud CLI commands is available in the CLI command reference (command-reference/index.html).
  • To get CLI help on the exporter, type ccloud schema-registry exporter

Set up two Schema Registry clusters in source and destination environments

  1. Log on to Confluent Cloud, and create two new environments: one named “SOURCE” and the other named “DESTINATION”.

    Navigate to Environments (top right menu), click Add cloud environment, and follow the steps.

  2. Add a Kafka cluster to each environment so that you can enable Schema Registry in each environment (for example, “my-cluster-on-source” in the SOURCE environment and “my-cluster-on-destination” in the DESTINATION environment). These can be any cluster type.

    You can do this through the Confluent Cloud CLI or the web UI.

    • On the web UI, click Add cluster in an environment and follow the steps to name and create the cluster.

    • On the CLI, type ccloud environment list, then ccloud environment use <environment-id> to navigate to each environment, and then create the clusters in the appropriate environments.

      For example, in the SOURCE environment:

      ccloud kafka cluster create my-cluster-on-source --type basic --cloud gcp --region us-east1
      

      And in the DESTINATION environment:

      ccloud kafka cluster create my-cluster-on-destination --type basic --cloud gcp --region us-west1
      
  3. Set up a Schema Registry in each environment and take note of its API endpoint.

    • In each environment, navigate to the Schema Registry tab on the web UI, click Set up my own, and follow the prompts. The Schema Registry tab should display when the registry is created.
    • On the Schema Registry tabs for both source and destination, copy and save the API endpoints for each.
  4. Create an API key and secret for interacting with the Schema Registry clusters on both source and destination clusters.

    Navigate to the Schema Registry tab, click the edit icon under API credentials, and follow the prompts. Be sure to save your key and secret pairs for both source and destination. You will not be able to retrieve the secrets later.

  5. Review source and destination access information.

    By now you should have a triplet of Schema Registry access details for both source and destination clusters, which you will use in the following steps:

    • Schema Registry URL (the API endpoint)
    • API key
    • API secret

    Tip

    URLs (endpoints) for source and destination specify the physical Schema Registry clusters. The URLs can be the same if the schema registries are on the same physical cluster. They are different logical clusters, and will be identified by the different API key/secret pairs. The URLs will be different if the registries are on different physical clusters.

Create schemas on the source

Create two or three schemas in the source environment, at least one of which has a qualified subject name.

To create each schema from the Confluent Cloud web UI, follow these steps:

  1. From the Schema Registry tab in the SOURCE environment, click View and manage schemas, then Add schema.

  2. Fill in the subject name.

    • To create a schema with an unqualified subject name, simply provide a name such as coffee or donuts.
    • To create a schema with a qualified subject name in a specified context, use the syntax: :.<context-name>:<subject-name>. For example: :.snowcones:sales or :.burgers:locations
  3. Take the default schema in each case.

    For this Quick Start, you are not working with the schemas themselves, but rather learning how to work with subject names and contexts. Therefore, you can just take the default schema to save time, unless you want to use a specific schema for your own purposes.

  4. Click Create.

Create credentials for the exporter

Your schema exporter will be reading schemas in the source environment and exporting linked copies to the destination, so it needs credentials to access the destination.

Create ~/config.txt, which you will use to create exporters, and fill in the URL and credentials the exporter needs to access the DESTINATION cluster:

schema.registry.url=<destination sr url>
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=<destination api key>:<destination api secret>

Tip

These are the credentials the exporter will use to access the destination. You will still need to provide source and destination credentials to run commands in these environments, as noted at particular points in the steps below.
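If you prefer to generate the file from a script, the following is a minimal sketch. The function name write_exporter_config and its argument order are choices made for this example and are not part of the Confluent Cloud CLI. The three property names are the ones shown above.

```shell
# Write the exporter's client settings for the DESTINATION Schema Registry.
# $1 = output file, $2 = destination URL, $3 = destination API key, $4 = destination API secret
# The body runs in a subshell so the umask change does not leak into your session.
write_exporter_config() (
  umask 077   # the file holds a secret, so newly created files are readable by you only
  cat > "$1" <<EOF
schema.registry.url=$2
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=$3:$4
EOF
)

# Example call (replace the placeholders with your own values):
# write_exporter_config "$HOME/config.txt" "<destination sr url>" "<destination api key>" "<destination api secret>"
```

Keeping the secret out of your shell history is up to you. One option is to read the values from environment variables rather than typing them on the command line.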

Create the exporter on the source

  1. For these Quick Start examples, you’ll want to create the exporters on the source, so make sure your current environment is SOURCE.

    You can use ccloud environment list and ccloud environment use <environment-id> to navigate.

  2. Create a new exporter using the ccloud schema-registry exporter create command.

    For this demo, you want the exporter to copy all schemas, including those in specific contexts (other than the default), so include the --subjects flag with the context wildcard to denote subjects under all contexts: --subjects ":*:":

    ccloud schema-registry exporter create <exporter-name> --subjects ":*:" --config-file ~/config.txt
    

    For example, this command creates an exporter called “my-first-exporter” that will export all schemas, including those in specific contexts as well as those in the default context:

    ccloud schema-registry exporter create my-first-exporter --subjects ":*:" --config-file ~/config.txt
    
  3. Enter your source API key and secret when prompted for them.

    You should get output verifying that the exporter was created: Created schema exporter "my-first-exporter".

More options for exporters

The exporter you just created is relatively basic, in that it just exports everything. As you’ll see in the next section, this is an efficient way to get an understanding of how you might organize, export, and navigate schemas with qualified and unqualified subject names.

Keep in mind that you can create exporters that export only specific subjects and contexts using this syntax:

ccloud schema-registry exporter create <exporterName> --subjects <subjectName1>,<subjectName2> \
--context-type CUSTOM --context-name <contextName> \
--config-file ~/config.txt
  • Replace anything within <> with a name you like.
  • subjects are listed as a comma-separated string list, such as “pizzas,sales,customers”.
  • subjects, context-type, and context-name are all optional. context-name is specified if context-type is CUSTOM.
  • subjects defaults to *, and context-type defaults to AUTO.

Alternatively, if you take all the defaults and do not specify --subjects when you create an exporter, you will get an exporter that only exports schemas in the default context:

ccloud schema-registry exporter create my-first-exporter --config-file ~/config.txt

With this type of exporter, schemas on the source that have qualified subject names will not be exported to the destination.

You can create and run multiple exporters at once, so feel free to circle back at the end of the Quick Start to create and test more exporters with different parameters.

See Configuration Options for full details on schema exporter parameters.

Verify the exporter is running and view information about it

Still in the SOURCE environment, run the following commands.

  1. List available exporters.

    ccloud schema-registry exporter list
    

    Your exporter will show in the list.

  2. Describe the exporter.

    ccloud schema-registry exporter describe <exporterName>
    

    Your output should resemble:

    ccloud schema-registry exporter describe my-first-exporter
    +--------------------------------+--------------------------------------------------------------------------+
    | Name                           | my-first-exporter                                                        |
    | Subjects                       | *                                                                        |
    | Context Type                   | AUTO                                                                     |
    | Context                        |                                                                        . |
    | Remote Schema Registry Configs | basic.auth.credentials.source="USER_INFO"                                |
    |                                | basic.auth.user.info="[hidden]"                                          |
    |                                | schema.registry.url="https://psrc-0xx5p.us-central1.gcp.confluent.cloud" |
    +--------------------------------+--------------------------------------------------------------------------+
    
  3. Get configurations for the exporter.

    ccloud schema-registry exporter get-config <exporterName>
    
  4. Get the status of the exporter.

    ccloud schema-registry exporter get-status <exporterName>
    
  5. Finally, as a check, get a list of schemas on the source.

    Use the prefix wildcard to list all schemas:

    ccloud schema-registry subject list --prefix ":*:"
    

    With the wildcard, this is effectively the same as the command ccloud schema-registry subject list. The command returns the list of subjects you’ve created on the source, for example:

            Subject
    +---------------------+
      :.burgers:locations
      :.snowcones:sales
      coffee
      customers
      donuts
    

Check that the schemas were exported

Now that you have verified that the exporter is running, and you know which schemas you created on the source, check to see that your schemas were exported to the destination.

  1. Switch to the DESTINATION.

    Use ccloud environment list and ccloud environment use <environment-id> to navigate.

  2. Run the following command to view all schemas, providing your API key and secret for the DESTINATION when prompted.

    ccloud schema-registry subject list --prefix ":*:"
    

    Notice how the context is passed as the subject prefix when querying. The example above uses the context wildcard :*: to match all contexts. It is, in effect, the same as the generic command ccloud schema-registry subject list, but a more explicit subject prefix could be used to pass a specific context.

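    Your output list of schemas on the DESTINATION should match those on the SOURCE, with each subject now qualified by a context derived from the logical name of the source (.lsrc-3qy52 in this example).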

              Subject
    +--------------------------------+
      :.lsrc-3qy52.burgers:locations
      :.lsrc-3qy52.snowcones:sales
      :.lsrc-3qy52:coffee
      :.lsrc-3qy52:customers
      :.lsrc-3qy52:donuts
    
  3. List only schemas in particular contexts.

    Once you have a list of all subjects on the destination with the prefixes (as in the above example), you can pass only the context name to see a narrowed list of subjects in a particular context.

    • For example, to list schemas in the burgers context:

      ccloud schema-registry subject list --prefix ":.lsrc-3qy52.burgers:"
      

      The output will be:

                   Subject
      +--------------------------------+
        :.lsrc-3qy52.burgers:locations
      
    • To list schemas in the snowcones context:

      ccloud schema-registry subject list --prefix ":.lsrc-3qy52.snowcones:"
      

      The output will be:

                   Subject
      +------------------------------+
        :.lsrc-3qy52.snowcones:sales
      

    Tip

    You can do the same using curl commands to call the APIs. Here is an example:

    curl -u <destination api key>:<destination api secret> '<destination sr url>/subjects?subjectPrefix=:.context1:foo'
    

Pause the exporter and make changes

  1. Pause the exporter.

    Switch back to the SOURCE, and run the following command to pause the exporter.

    ccloud schema-registry exporter pause <exporterName>
    

    You should get output verifying that the command was successful. For example: Paused schema exporter "my-first-exporter".

    Check the status, just to be sure.

    ccloud schema-registry exporter get-status <exporterName>
    

    Your output should resemble:

    ccloud schema-registry exporter get-status my-first-exporter
    +--------------------+-------------------+
    | Name               | my-first-exporter |
    | Exporter State     | PAUSED            |
    | Exporter Offset    |          10011386 |
    | Exporter Timestamp |     1631107710822 |
    | Error Trace        |                   |
    +--------------------+-------------------+
    
  2. Reset schema exporter offset, then get the status.

    ccloud schema-registry exporter reset <exporterName>
    
    ccloud schema-registry exporter get-status <exporterName>
    

    The status will show that the offset is reset. For example:

    ccloud schema-registry exporter get-status my-first-exporter
    +--------------------+-------------------+
    | Name               | my-first-exporter |
    | Exporter State     | PAUSED            |
    | Exporter Offset    |                -1 |
    | Exporter Timestamp |                 0 |
    | Error Trace        |                   |
    +--------------------+-------------------+
    
  3. Update exporter configurations or information.

    You can choose to update any of subjects, context-type, context-name, or config-file. For example:

    ccloud schema-registry exporter update <exporterName> --context-name <newContextName>
    
  4. Resume schema exporter.

    ccloud schema-registry exporter resume <exporterName>
    

Delete the exporter

When you are ready to wrap up your testing, pause and then delete the exporter(s) as follows.

  1. Pause the exporter.

    ccloud schema-registry exporter pause <exporterName>
    
  2. Delete the exporter.

    ccloud schema-registry exporter delete <exporterName>
    

This concludes the Quick Start. The next sections are a deep dive into Schema Linking concepts and tools you just tried out.

Schema Contexts

What is a schema context?

A schema context, or simply context, is essentially a grouping of subject names and schema IDs. A single Schema Registry cluster can host any number of contexts. Each context can be thought of as a separate “sub-registry”. A context can also be copied to another Schema Registry cluster, using a schema exporter.

How contexts work

Following are a few key aspects of contexts and how they help to organize schemas.

Schemas are scoped by context

Subject names and schema IDs are scoped by context so that two contexts in the same Schema Registry cluster can each have a schema with the same ID, such as 123, or a subject with the same name, such as mytopic-value, without any problem.

Default context

Any schema ID or subject name without an explicit context lives in the default context, which is represented as a single dot (.). An explicit context starts with a dot and can contain any parts separated by additional dots, such as .mycontext.subcontext. You can think of context names as similar to absolute Unix paths, but with dots instead of forward slashes (in this analogy, the default schema context is like the root Unix path). However, there is no relationship between two contexts that share a prefix.

Qualified subjects

A subject name can be qualified with a context, in which case it is called a qualified subject. When a context qualifies a subject, the context must be surrounded by colons. An example is :.mycontext:mysubject. A subject name that is unqualified is assumed to be in the default context, so that mysubject is the same as :.:mysubject (the dot representing the default context).
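The naming rule can be sketched as two small shell helpers. The function names qualify_subject and context_of are hypothetical and exist only for this illustration. They handle plain strings and do not call Schema Registry.

```shell
# Build a qualified subject from a context name and a subject name.
# The context must start with a dot; "." alone is the default context.
qualify_subject() {
  case "$1" in
    .*) printf ':%s:%s\n' "$1" "$2" ;;
    *)  echo "context must start with a dot: $1" >&2; return 1 ;;
  esac
}

# Print the context of a subject. Unqualified subjects are in the default context.
# The body runs in a subshell so the helper variable stays local.
context_of() (
  case "$1" in
    :.*:*) rest=${1#:}                 # drop the leading colon
           printf '%s\n' "${rest%%:*}" # keep everything before the next colon
           ;;
    *)     printf '.\n' ;;
  esac
)

# qualify_subject .mycontext mysubject   prints :.mycontext:mysubject
# qualify_subject . mysubject            prints :.:mysubject
# context_of mysubject                   prints .
```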

There are two ways to pass a context to the REST APIs.

Using a qualified subject

A qualified subject can be passed anywhere that a subject name is expected. Most REST APIs take a subject name, such as POST /subjects/{subject}/versions.

There are a few REST APIs that don’t take a subject name as part of the URL path:

  • /schemas/ids/{id}
  • /schemas/ids/{id}/subjects
  • /schemas/ids/{id}/versions

The three APIs above can now take a query parameter named “subject” (written as ?subject), so you can pass a qualified subject name, such as /schemas/ids/{id}?subject=:.mycontext:mysubject, and the given context is then used to look up the schema ID.

Using a base context path

As mentioned, all APIs that specify an unqualified subject operate in the default context. Besides passing a qualified subject wherever a subject name is expected, a second way to pass the context is by using a base context path. A base context path takes the form /contexts/{context} and can be prepended to any existing Schema Registry path. Therefore, to look up a schema ID in a specific context, you could also use the URL /contexts/.mycontext/schemas/ids/{id}.

A base context path can also be used to operate with the default context. In this case, the base context path takes the form “/contexts/:.:/”; for example, /contexts/:.:/schemas/ids/{id}. A single dot cannot be used because it is omitted by some URL parsers.
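As a sketch, both ways of passing a context can be expressed as URL builders for the schema ID lookup. The function names and the https://sr.example host used in the comments are placeholders invented for this example. The paths themselves follow the forms described above.

```shell
# Way 1: pass a qualified subject in the ?subject query parameter.
# $1 = registry base URL, $2 = schema ID, $3 = qualified subject
schema_id_url_by_subject() {
  printf '%s/schemas/ids/%s?subject=%s\n' "${1%/}" "$2" "$3"
}

# Way 2: prepend a base context path.
# $1 = registry base URL, $2 = context name (for example .mycontext, or . for the default), $3 = schema ID
# The body runs in a subshell so the helper variable stays local.
schema_id_url_by_context() (
  ctx=$2
  if [ "$ctx" = "." ]; then
    ctx=':.:'   # a lone dot is omitted by some URL parsers, so the default context is written as :.:
  fi
  printf '%s/contexts/%s/schemas/ids/%s\n' "${1%/}" "$ctx" "$3"
)

# schema_id_url_by_subject https://sr.example 123 :.mycontext:mysubject
#   prints https://sr.example/schemas/ids/123?subject=:.mycontext:mysubject
# schema_id_url_by_context https://sr.example .mycontext 123
#   prints https://sr.example/contexts/.mycontext/schemas/ids/123
```

Either URL can then be passed to curl with your API key and secret. Quote it so that the shell does not interpret the question mark.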

Multi-Context APIs

All the examples so far operate in a single context. There are three APIs that return results for multiple contexts.

  • /contexts
  • /schemas?subjectPrefix=:*:
  • /subjects?subjectPrefix=:*:

The first API above, /contexts, returns a list of all contexts. The other two APIs, /schemas and /subjects, normally only operate in the default context. They can be used to query all contexts by passing a subjectPrefix with the value :*:, called the context wildcard. The context wildcard matches all contexts.
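The following sketch builds the /subjects URL with or without a subject prefix. The function name subjects_url is hypothetical. The commented curl call follows the same -u key:secret pattern as the Quick Start tip.

```shell
# Print the URL for listing subjects.
# $1 = registry base URL, $2 = optional subjectPrefix (":*:" is the context wildcard)
subjects_url() {
  if [ -n "${2:-}" ]; then
    printf '%s/subjects?subjectPrefix=%s\n' "${1%/}" "$2"
  else
    # Without a prefix, the API operates in the default context only.
    printf '%s/subjects\n' "${1%/}"
  fi
}

# List subjects across all contexts (placeholders are yours to fill in):
# curl -u "<api key>:<api secret>" "$(subjects_url "<sr url>" ':*:')"
```

Quote the wildcard as shown. An unquoted :*: could otherwise be expanded by the shell as a file name pattern.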

Specifying a context name for clients

When using a client to talk to Schema Registry, you may want the client to use a particular context. An example of this scenario is when migrating a client from communicating with one Schema Registry to another. You can achieve this by using a base context path, as defined above. To do this, simply change the Schema Registry URL used by the client from https://<host1> to https://<host2>/contexts/.mycontext.

Note that by using a base context path in the Schema Registry URL, the client will use the same schema context for every Schema Registry request. However, an advanced scenario might involve a client using different contexts for different topics. To achieve this, you can specify a context name strategy to the serializer or deserializer:

  • context.name.strategy=com.acme.MyContextNameStrategy

The context name strategy is a class that must implement the following interface:

/**
 * A {@link ContextNameStrategy} is used by a serializer or deserializer to determine
 * the context name used with the schema registry.
 */
public interface ContextNameStrategy extends Configurable {

  /**
   * For a given topic, returns the context name to use.
   *
   * @param topic The Kafka topic name.
   * @return The context name to use
   */
  String contextName(String topic);
}

Again, the use of a context name strategy should not be common. Specifying the base context path in the Schema Registry URL should serve most needs.

Schema Exporters

What is a Schema Exporter?

Previously, Confluent Replicator was the primary means of migrating schemas from one Schema Registry cluster to another, as long as the source Schema Registry cluster was on-premises. To support schema migration using this method, the destination Schema Registry is placed in IMPORT mode, either globally or for a specific subject.

The new schema exporter functionality replaces and extends the schema migration functionality of Replicator. Schema exporters reside within a Schema Registry cluster, and can be used to replicate schemas between two Schema Registry clusters in Confluent Cloud.

Schema Linking

You use schema exporters to accomplish Schema Linking, using contexts and/or qualified subject names to sync schemas across registries. Schema contexts provide the conceptual basis and namespace framework, while the exporter does the heavy lifting of the linking.

Schemas export from the source default context to a new context on the destination

By default, a schema exporter exports schemas from the default context in the source Schema Registry to a new context in the destination Schema Registry. The destination context (or a subject within the destination context) is placed in IMPORT mode. This allows the destination Schema Registry to use its default context as usual, without affecting any clients of its default context.

The new context created by default in the destination Schema Registry will have the form .lsrc-xxxxxx, taken from the logical name of the source.
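The Quick Start output illustrates this naming: coffee on the source became :.lsrc-3qy52:coffee on the destination, and :.burgers:locations became :.lsrc-3qy52.burgers:locations. The sketch below reproduces that mapping as string handling only. The function name auto_destination_subject is hypothetical, and the rule is inferred from those examples rather than taken from a specification.

```shell
# Predict the destination subject name produced by an exporter with context type AUTO.
# $1 = logical name of the source (for example lsrc-3qy52), $2 = subject as named on the source
# The body runs in a subshell so the helper variable stays local.
auto_destination_subject() (
  case "$2" in
    :.:*)  # explicitly qualified with the default context
           printf ':.%s:%s\n' "$1" "${2#:.:}" ;;
    :.*:*) # qualified with a named context: prepend the source logical name to that context
           rest=${2#:.}
           printf ':.%s.%s:%s\n' "$1" "${rest%%:*}" "${rest#*:}" ;;
    *)     # unqualified subject in the default context
           printf ':.%s:%s\n' "$1" "$2" ;;
  esac
)

# auto_destination_subject lsrc-3qy52 coffee                prints :.lsrc-3qy52:coffee
# auto_destination_subject lsrc-3qy52 :.burgers:locations   prints :.lsrc-3qy52.burgers:locations
```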

Schema Registry clusters can export schemas to each other

Two Schema Registry clusters can each have a schema exporter that exports schemas from the default context to the other Schema Registry. In this setup, each side can read from or write to the default context, and each side can read from (but not write to) the exported context. This allows you to match the setup of Cluster Linking on Confluent Cloud, where you might have a source topic and a read-only mirror topic on each side.

An exporter can copy schemas across contexts in the same Schema Registry

In addition, a schema exporter can copy schemas from one context to another within the same Schema Registry cluster. For example, you might create a “.staging” context, and then later copy the schemas from the “.staging” context to the default context when production-ready. When copying schemas to and from the same Schema Registry cluster, use the special URL local:///.

Customizing schema exports

There are various ways to customize which contexts are exported from the source Schema Registry, and which contexts are used in the destination Schema Registry. The full list of configuration properties is shown below.

Configuration Options

A schema exporter has five main configuration properties:

name
A unique name for the exporter.
subjects

This can take several forms:

  • A list of subject names and/or contexts, for example: [ "subject1", "subject2", ".mycontext1", ".mycontext2" ]
  • A singleton list containing a subject name prefix that ends in a wildcard, such as ["mytopic*"]
  • A singleton list containing a lone wildcard, ["*"], that indicates all subjects in the default context. This is the default.
  • A singleton list containing the context wildcard, [":*:"], that indicates all contexts.
contextType

One of:

  • AUTO - Prepends the source context with an automatically generated context, which is .lsrc-xxxxxx for Confluent Cloud. This is the default.
  • CUSTOM - Prepends the source context with a custom context name, specified in context below.
  • NONE - Copies the source context as-is, without prepending anything. This is useful to make an exact copy of the source Schema Registry in the destination.
context
A context name to be used with the CUSTOM contextType above.
config

A set of configurations for creating a client to talk to the destination Schema Registry. Typically, this includes:

  • schema.registry.url - The URL of the destination Schema Registry. This can also be local:/// to allow for more efficient copying if the source and destination are the same.
  • basic.auth.credentials.source - Typically “USER_INFO”
  • basic.auth.user.info - Typically of the form <api-key>:<api-secret>
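To see how these properties fit together, here is a sketch that assembles them into a JSON document of the kind you might send to POST /exporters. It assumes the JSON field names mirror the property names listed above; check the Schema Registry API documentation before relying on it. The function name exporter_body is hypothetical, and the function does no JSON escaping, so it suits only simple values.

```shell
# Print a JSON body describing an exporter.
# $1 = exporter name, $2 = comma-separated subjects, $3 = contextType,
# $4 = destination Schema Registry URL, $5 = "<api-key>:<api-secret>" for the destination
# The body runs in a subshell so the helper variable stays local.
exporter_body() (
  # Turn a,b into a","b so it can sit inside ["..."] as a JSON list.
  subjects=$(printf '%s' "$2" | sed 's/,/","/g')
  printf '{"name":"%s","subjects":["%s"],"contextType":"%s","config":{"schema.registry.url":"%s","basic.auth.credentials.source":"USER_INFO","basic.auth.user.info":"%s"}}\n' \
    "$1" "$subjects" "$3" "$4" "$5"
)

# Possible use, assuming the request goes to the SOURCE registry with source credentials
# and that an application/json content type is accepted:
# curl -u "<source api key>:<source api secret>" -X POST \
#   -H "Content-Type: application/json" \
#   --data "$(exporter_body my-first-exporter ':*:' AUTO "<destination sr url>" "<destination api key>:<destination api secret>")" \
#   "<source sr url>/exporters"
```

The context property is left out because it applies only when contextType is CUSTOM.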

Lifecycle and States

Schema Registry stores schemas in a Kafka topic. A schema exporter uses the topic offset to determine its progress.

When a schema exporter is created, it begins in the STARTING state. While in this state, it finds and exports all applicable schemas already written to the topic. After exporting previously registered schemas, the exporter then enters the RUNNING state, during which it will be notified of any new schemas, which it can export if applicable. As schemas are exported, the exporter will save its progress by recording the latest topic offset.

If you want to make changes to the schema exporter, you must first “pause” it, which causes it to enter the PAUSED state. The exporter can then be resumed after the proper changes are made. Upon resumption, the exporter will find and export any applicable schemas since the last offset that it recorded.

While an exporter is paused, it can also be “reset”, which will cause it to clear its saved offset and re-export all applicable schemas when it resumes. To accomplish this, the exporter starts off again in STARTING state after a reset, and follows the same lifecycle.

The states of a schema exporter at various stages in its lifecycle are summarized below.

State      Description
STARTING   The exporter finds and exports all applicable previously registered schemas for the topic. This is the starting state, or the state after a reset.
RUNNING    The exporter is notified of new schemas, exports them if applicable, and tracks progress by recording last topic offset.
PAUSED     An exporter can be paused; for example, to make configuration changes. When it resumes, the exporter finds and exports schemas since the last recorded offset.
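As a small illustration of how the saved offset affects a resume, the sketch below interprets the Exporter State and Exporter Offset values printed by get-status. It assumes, based on the Quick Start output, that an offset of -1 means the offset has been reset. The function name describe_resume and its messages are invented for this example.

```shell
# Describe what resuming an exporter would do, given two fields from get-status.
# $1 = Exporter State, $2 = Exporter Offset
describe_resume() {
  if [ "$1" != "PAUSED" ]; then
    echo "exporter is $1, pause it before making changes"
  elif [ "$2" -lt 0 ]; then
    # A cleared offset means the exporter starts over and re-exports everything applicable.
    echo "resume will re-export all applicable schemas"
  else
    echo "resume will export schemas registered after offset $2"
  fi
}

# describe_resume PAUSED -1         prints: resume will re-export all applicable schemas
# describe_resume PAUSED 10011386   prints: resume will export schemas registered after offset 10011386
```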

REST APIs

Schema Registry supports the following REST APIs, as fully detailed in Exporters in the Schema Registry API documentation:

Task                                      API
Gets a list of exporters for a tenant     GET /exporters
Creates a new exporter                    POST /exporters
Gets info about an exporter               GET /exporters/{name}
Gets the config for an exporter           GET /exporters/{name}/config
Gets the status of an exporter            GET /exporters/{name}/status
Updates the information for an exporter   PUT /exporters/{name}/config
Pauses an exporter                        PUT /exporters/{name}/pause
Resumes an exporter                       PUT /exporters/{name}/resume
Resets an exporter, clears offsets        PUT /exporters/{name}/reset
Deletes an exporter                       DELETE /exporters/{name}
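If you script against these endpoints, a lookup helper such as the following sketch can map an action to its HTTP method and path. The action names mirror the ccloud schema-registry exporter subcommands used in the Quick Start. The function name exporter_request is hypothetical.

```shell
# Print "METHOD PATH" for an exporter action, following the table above.
# $1 = action, $2 = exporter name (not needed for list and create)
exporter_request() {
  case "$1" in
    list)       echo "GET /exporters" ;;
    create)     echo "POST /exporters" ;;
    describe)   echo "GET /exporters/$2" ;;
    get-config) echo "GET /exporters/$2/config" ;;
    get-status) echo "GET /exporters/$2/status" ;;
    update)     echo "PUT /exporters/$2/config" ;;
    pause)      echo "PUT /exporters/$2/pause" ;;
    resume)     echo "PUT /exporters/$2/resume" ;;
    reset)      echo "PUT /exporters/$2/reset" ;;
    delete)     echo "DELETE /exporters/$2" ;;
    *)          echo "unknown action: $1" >&2; return 1 ;;
  esac
}

# Possible use with curl (placeholders are yours to fill in):
# set -- $(exporter_request pause my-first-exporter)
# curl -u "<source api key>:<source api secret>" -X "$1" "<source sr url>$2"
```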