Schema Registry Tutorials

Overview

These tutorials provide a step-by-step workflow for using Confluent Schema Registry on-premises and in Confluent Cloud. You will learn how to enable client applications to read and write Avro data and check compatibility as schemas evolve.

Schema Registry Benefits

Apache Kafka® producers write data to Kafka topics and Kafka consumers read data from Kafka topics. There is an implicit “contract” that producers write data with a schema that can be read by consumers, even as producers and consumers evolve their schemas. Schema Registry helps ensure that this contract is met with compatibility checks.

It is useful to think of schemas as APIs. Applications depend on APIs and expect that any changes made to APIs remain compatible so that the applications can still run. Similarly, streaming applications depend on schemas and expect that any changes made to schemas remain compatible so that they can still run. Schema evolution therefore requires compatibility checks to ensure that the producer-consumer contract is not broken. This is where Schema Registry helps: it provides centralized schema management and compatibility checks as schemas evolve.
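As a hedged illustration of what such a compatibility check looks like in code, the Java sketch below asks Schema Registry whether a candidate schema is compatible with the latest schema registered under a subject. The registry URL, subject name, and schema fields are placeholder assumptions, and the Avro-based testCompatibility overload is the one exposed by the Java client in this era of Confluent Platform; treat this as a sketch rather than a drop-in snippet.

    import org.apache.avro.Schema;
    import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
    import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;

    public class CompatibilityCheckSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder registry URL; 100 is the client-side schema cache capacity.
            SchemaRegistryClient client =
                new CachedSchemaRegistryClient("http://localhost:8081", 100);

            // A hypothetical "evolved" schema to validate before producing with it.
            Schema candidate = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\","
                + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"},"
                + "{\"name\":\"total\",\"type\":\"double\"}]}");

            // True only if the candidate passes the subject's compatibility
            // checks against the latest registered version.
            boolean ok = client.testCompatibility("orders-value", candidate);
            System.out.println("Compatible with orders-value: " + ok);
        }
    }

Running a check like this before registering a schema lets an application fail fast, instead of discovering an incompatibility when it first tries to produce.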

Target Audience

The target audience is a developer writing Kafka streaming applications who wants to build a robust application leveraging Avro data and Schema Registry. The principles in this tutorial apply to any Kafka client that interacts with Schema Registry.

Terminology Review

First, let us level set on terminology and answer the question: what is a topic versus a schema versus a subject?

A Kafka topic contains messages, and each message is a key-value pair. Either the message key or the message value, or both, can be serialized as Avro, JSON, or Protobuf. A schema defines the structure of the data format. The Kafka topic name can be independent of the schema name. Schema Registry defines a scope in which schemas can evolve, and that scope is the subject. The name of the subject depends on the configured subject name strategy, which by default is set to derive subject name from topic name.

Starting with Confluent Platform 5.5.0, you can modify the subject name strategy on a per-topic basis. See Change the subject naming strategy for a topic to learn more.
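To make the default concrete: with TopicNameStrategy, a value schema for topic transactions is registered under the subject transactions-value. The sketch below shows how a Java producer using the Confluent Avro serializer could override the value subject name strategy; the broker and registry addresses are placeholders, not settings from the tutorials themselves.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class SubjectStrategySketch {
        static Properties producerProps() {
            Properties props = new Properties();
            // Placeholder addresses for a local broker and Schema Registry.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put("schema.registry.url", "http://localhost:8081");

            // Default is TopicNameStrategy: subject = <topic>-value for values.
            // TopicRecordNameStrategy registers under <topic>-<record name>
            // instead, which allows multiple record types in one topic.
            props.put("value.subject.name.strategy",
                    "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy");
            return props;
        }
    }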

As a practical example, let’s say a retail business is streaming transactions to a Kafka topic called transactions. A producer is writing data with a schema Payment to that topic. If the producer is serializing the message value as Avro, then Schema Registry has a subject called transactions-value. If the producer were also serializing the message key as Avro, Schema Registry would have a subject called transactions-key as well; for simplicity, this tutorial considers only the message value. The subject transactions-value has at least one schema, called Payment, and defines the scope in which schemas for that subject can evolve; Schema Registry does compatibility checking within this scope. In this scenario, if developers evolve the schema Payment and produce new messages to the topic transactions, Schema Registry checks that the newly evolved schemas are compatible with older schemas in the subject transactions-value and adds those new schemas to the subject.
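To ground this scenario, here is a minimal producer sketch under the assumption that Payment has an id and an amount field (the actual tutorial schema may differ). With the default subject name strategy, the Avro serializer registers this value schema under the subject transactions-value on first use.

    import java.util.Properties;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class PaymentProducerSketch {
        public static void main(String[] args) {
            // Hypothetical Payment schema for illustration only.
            Schema paymentSchema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Payment\","
                + "\"namespace\":\"io.example\","
                + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"},"
                + "{\"name\":\"amount\",\"type\":\"double\"}]}");

            Properties props = new Properties();
            // Placeholder broker and Schema Registry addresses.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put("schema.registry.url", "http://localhost:8081");

            try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
                GenericRecord payment = new GenericData.Record(paymentSchema);
                payment.put("id", "pmt-0001");
                payment.put("amount", 42.0);
                // Sent to topic "transactions"; the Avro serializer registers the
                // schema and checks compatibility against subject transactions-value.
                producer.send(new ProducerRecord<>("transactions", "pmt-0001", payment));
            }
        }
    }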

Choose Your Deployment

There are two Schema Registry tutorials you can choose from:

  1. On-Premises Schema Registry Tutorial: for on-premises Kafka deployments and self-managed Confluent Schema Registry
  2. Confluent Cloud Schema Registry Tutorial: for Confluent Cloud deployments and fully managed Confluent Cloud Schema Registry

See also

For an example that shows this in action, see the Confluent Platform demo. Refer to the demo’s docker-compose.yml file for a configuration reference.