Manage Schema Linking for Confluent Platform Using Confluent for Kubernetes Blueprints

Schema Linking is a Confluent feature for keeping schemas in sync between two Schema Registry clusters.

You can use Schema Linking in conjunction with Cluster Linking to keep both schemas and topic data in sync across two Schema Registry and Kafka clusters.

Schema Linking can also be used independently of Cluster Linking for replicating schemas between clusters for purposes of aggregation, backup, staging, and migration of schemas.

Schema Linking is supported using schema exporters that reside in Schema Registry and continuously export schemas from one context to another, either within the same Schema Registry cluster or to a different Schema Registry cluster.

A schema exporter can sync schemas in groups, referred to as schema contexts. Each schema context is an independent grouping of schema IDs and subject names. If schemas are exported without any context (contextType: NONE), they are exported as is and go into the default context.

See Schema Linking for complete details of the Schema Linking feature.

The high-level workflow to run Schema Linking is:

  1. Deploy the source and the destination Schema Registry clusters.

  2. Set up the password encoder in the source and destination Schema Registry clusters.

  3. Define schemas in the source Schema Registry cluster.

    When you register schemas in the source cluster, you can specify a custom context by inserting the context in the schema name. If no context is given, the default context is used.

  4. Create a schema exporter in the source Schema Registry cluster.

    Exported schemas are placed in the IMPORT mode in the destination Schema Registry. Changes cannot be made to the schemas in the IMPORT mode.

  5. As needed, update, reset, pause, resume, or delete the schema exporter.

Confluent for Kubernetes (CFK) Blueprints provides a declarative API, the SchemaExporter custom resource definition (CRD), to support the entire workflow of creating and managing schema exporters.

Create a schema exporter

A schema exporter is created and managed in the source Schema Registry cluster.

Note

When RBAC is enabled in this Confluent Platform environment, the superuser you configured for Kafka (spec.authorization.superUsers in the ConfluentPlatformBlueprint CR) does not have access to resources in the Schema Registry cluster. If you want the superuser to be able to create schema exporters, grant the superuser permission on the Schema Registry cluster.

In the source Schema Registry cluster, create a SchemaExporter CR and apply the configuration with the kubectl apply -f command:

apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
spec:
  source:                        --- [1]
    schemaRegistryClusterRef:    --- [2]
       name:                     --- [3]
       namespace:                --- [4]
  destination:                   --- [5]
    schemaRegistryClusterRef:    --- [6]
       name:                     --- [7]
       namespace:                --- [8]
  subjects:                      --- [9]
  subjectRenameFormat:           --- [10]
  contextType:                   --- [11]
  contextName:                   --- [12]
  configs:                       --- [13]
  schemaExporterAction:          --- [14]

  • [1] Required. The source Schema Registry cluster. You can specify either the cluster name or the endpoint.

  • [2] Required. The reference to the source Schema Registry cluster.

  • [3] Required. The name of the source Schema Registry cluster.

  • [4] The namespace of the source Schema Registry cluster. If omitted, the namespace of this SchemaExporter CR is used.

  • [5] The destination Schema Registry cluster where the schemas will be exported. If not defined, the source cluster is used as the destination, and the schema exporter will be exporting schemas across contexts within the source cluster.

  • [6] Required. The reference to the destination Schema Registry cluster.

  • [7] Required. The name of the destination Schema Registry cluster.

  • [8] The namespace of the destination Schema Registry cluster. If omitted, the namespace of this SchemaExporter CR is used.

  • [9] The subjects to be exported to the destination. The default value is ["*"], which denotes all subjects in the default context.

  • [10] The rename format that defines how to rename the subject at the destination.

    For example, if the value is my-${subject}, subjects at the destination will become my-XXX where XXX is the original subject.

  • [11] Specify how to create the context at the destination when moving the subjects. The valid values are AUTO and NONE.

    The default value is AUTO, with which the exporter uses an auto-generated context in the destination cluster. The auto-generated context name is reported in the status.

    If set to NONE, the exporter copies the source schemas as-is.

  • [12] The name of the schema context on the destination Schema Registry cluster. If this is defined, spec.contextType is ignored. Schemas will be exported to the context in the destination based on its contextType.

  • [13] A map of string key and value pairs for additional configs not supported by the SchemaExporter CRD properties. For the list of properties, see Schema Linking properties.

  • [14] The type of action on the running schema exporter. Valid values are Pause, Reset, and Resume.
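To make items [5] and [12] concrete, here is a minimal sketch of an exporter that copies schemas between contexts inside a single Schema Registry cluster. It assumes that omitting spec.destination is enough to select the source cluster as the destination, as item [5] describes. The metadata name and the context name are hypothetical placeholders, and the cluster name is borrowed from the example in this topic.

```yaml
apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
metadata:
  name: schemaexporter-local      # hypothetical resource name
spec:
  source:
    schemaRegistryClusterRef:
      name: schemaregistry-ss     # namespace omitted: defaults to this CR's namespace
  # No destination block: per [5], the source cluster is used as the
  # destination and schemas are exported across contexts within it.
  contextName: backup             # hypothetical context name; per [12], contextType is ignored
```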

The following is a snippet of an example SchemaExporter CR:

apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
spec:
  schemaExporterAction: Auto
  source:
    schemaRegistryClusterRef:
      name: schemaregistry-ss
  destination:
    schemaRegistryClusterRef:
      name: schemaregistry-ss-dev
  contextName: control-plane-sr-exporter
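As a further illustration, the sketch below combines the subject-related fields from the reference list: it exports only selected subjects, renames them at the destination, and copies them as-is into the default context. The subject names and the dev- prefix are hypothetical. The metadata name matches the resource name used in the kubectl commands in this topic.

```yaml
apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
metadata:
  name: schemaexporter
spec:
  source:
    schemaRegistryClusterRef:
      name: schemaregistry-ss
  destination:
    schemaRegistryClusterRef:
      name: schemaregistry-ss-dev
  subjects:                                # hypothetical subjects; the default is ["*"]
    - orders-value
    - payments-value
  subjectRenameFormat: "dev-${subject}"    # orders-value would become dev-orders-value
  contextType: NONE                        # copy the source schemas as-is
```

Because contextName is not set here, contextType decides where the schemas land at the destination.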

Update a schema exporter

When you update the configuration of an existing exporter, CFK Blueprints pauses the exporter, updates the config, and resumes the exporter.

The following properties cannot be changed for an existing exporter. To change any of them, delete the exporter and re-create it:

  • Source Schema Registry
  • Destination Schema Registry
  • Name of the schema exporter

Edit the SchemaExporter CR with desired configs and apply it with the kubectl apply -f <Schema Exporter CR> command.

The context type (contextType) defaults to AUTO only during creation. If you created a schema exporter with a custom context and want to edit it to use an auto-generated context, explicitly set contextType to AUTO.
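A sketch of what such an edit might look like, assuming the exporter was originally created with a contextName. Removing contextName is an inference from the field reference, which says contextType is ignored while contextName is defined. The cluster names are taken from the example in this topic.

```yaml
apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
metadata:
  name: schemaexporter
spec:
  source:
    schemaRegistryClusterRef:
      name: schemaregistry-ss
  destination:
    schemaRegistryClusterRef:
      name: schemaregistry-ss-dev
  # contextName removed (assumption: while it is set, contextType is ignored)
  contextType: AUTO               # must be explicit on update; it defaults to AUTO only at creation
```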

If the context name (contextName) is edited, only new subjects and schemas are exported to the new context. Schemas synced before the update remain in the earlier context. To migrate all the old schemas to the new context, reset the exporter.

Similarly, if subjectRenameFormat is edited, only new schemas are migrated with the new name format. To re-migrate the already synced schemas with the new name format, reset the exporter.

Reset a schema exporter

A schema exporter is in one of the following states: STARTING, RUNNING, or PAUSED.

Reset a schema exporter to clear its saved offset.

To reset a schema exporter, add the reset exporter annotation to the SchemaExporter CR with the command:

kubectl annotate schemaexporter schemaexporter platform.confluent.io/reset-schema-exporter="true"

Pause a schema exporter

To pause a schema exporter, add the pause exporter annotation to the schema exporter CR with the command:

kubectl annotate schemaexporter schemaexporter platform.confluent.io/pause-schema-exporter="true"
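The schemaExporterAction field in the CR reference lists Pause as a valid value, so a declarative form of the same request might look like the sketch below. That setting the field has the same effect as the annotation is an assumption, not something this topic states. The metadata name is the resource name used in the kubectl command, and the cluster names come from the example CR.

```yaml
apiVersion: apps.cpc.platform.confluent.io/v1beta1
kind: SchemaExporter
metadata:
  name: schemaexporter            # same resource name as in the kubectl command
spec:
  schemaExporterAction: Pause     # assumed to be equivalent to the pause annotation
  source:
    schemaRegistryClusterRef:
      name: schemaregistry-ss
  destination:
    schemaRegistryClusterRef:
      name: schemaregistry-ss-dev
```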

Resume a schema exporter

To resume a schema exporter, add the resume exporter annotation to the schema exporter CR with the command:

kubectl annotate schemaexporter schemaexporter platform.confluent.io/resume-schema-exporter="true"

Delete a schema exporter

Deleting the schema exporter does not delete the schemas already exported to the destination. The schemas exported to the destination Schema Registry stay in the last synced state.

Once the schema link is broken, exported schemas can be moved out of IMPORT mode using migration as explained in Migrate Schemas.

After the schemas are moved out of the IMPORT mode, to manage those schemas on the destination Schema Registry, create Schema CRs for those schemas on the destination cluster.

To delete a schema exporter:

kubectl delete schemaexporter schemaexporter