These tutorials provide a step-by-step workflow for using Confluent Schema Registry on-premises and in Confluent Cloud.
You will learn how to enable client applications to read and write Avro data and check compatibility as schemas evolve.
Schema Registry Benefits
Apache Kafka® producers write data to Kafka topics and Kafka consumers read data from Kafka topics.
There is an implicit “contract” that producers write data with a schema that can be read by consumers, even as producers and consumers evolve their schemas.
Schema Registry helps ensure that this contract is met with compatibility checks.
It is useful to think about schemas as APIs.
Applications depend on APIs and expect that any changes to those APIs remain compatible, so the applications can continue to run.
Similarly, streaming applications depend on schemas and expect that any changes to those schemas remain compatible, so the applications can continue to run.
Schema evolution requires compatibility checks to ensure that the producer-consumer contract is not broken.
This is where Schema Registry helps: it provides centralized schema management and compatibility checks as schemas evolve.
First, let's level-set on terminology and answer the question: what is a topic versus a schema versus a subject?
A Kafka topic contains messages, and each message is a key-value pair. The
message key, the message value, or both can be serialized as Avro, JSON,
or Protobuf. A schema defines the structure of the data format. The Kafka topic
name can be independent of the schema name. Schema Registry defines the scope in which
schemas can evolve, and that scope is the subject. The name of the subject
depends on the configured subject name strategy,
which by default derives the subject name from the topic name.
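The default strategy can be sketched as a simple naming rule. The following is a toy Python illustration of that rule, not the actual implementation:

```python
def subject_name(topic: str, is_key: bool = False) -> str:
    """Toy sketch of the default subject name strategy: the subject is
    the topic name plus a "-key" or "-value" suffix, depending on
    whether the key or the value is being serialized."""
    suffix = "key" if is_key else "value"
    return f"{topic}-{suffix}"

print(subject_name("transactions"))               # transactions-value
print(subject_name("transactions", is_key=True))  # transactions-key
```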
Starting with Confluent Platform 5.5.0, you can modify the subject name strategy on a per-topic basis. See Change the subject naming strategy for a topic to learn more.
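For example, a client can override the strategy on the serializer side. The snippet below is a hypothetical producer configuration fragment using the Confluent serializer property names:

```properties
# Hypothetical producer configuration: derive the subject for message
# values from the record (schema) name instead of the topic name.
value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy
```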
As a practical example, let's say a retail business is streaming transactions in a Kafka topic called
transactions. A producer is writing data with a schema
Payment to that Kafka topic
transactions.
If the producer is serializing the message value as Avro, then Schema Registry has a subject called
transactions-value.
If the producer is also serializing the message key as Avro, Schema Registry would have a subject called
transactions-key, but for simplicity, in this tutorial consider only the message value.
That Schema Registry subject
transactions-value has at least one schema called
Payment. The subject
transactions-value defines the scope in which schemas for that subject can evolve, and Schema Registry does compatibility checking within this scope.
In this scenario, if developers evolve the schema
Payment and produce new messages to the topic
transactions, Schema Registry checks that those newly evolved schemas are compatible with older schemas in the subject
transactions-value and adds those new schemas to the subject.
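The compatibility check in this scenario can be illustrated with a toy sketch (not Schema Registry's actual algorithm). Under backward compatibility, a consumer using the new Payment schema must still be able to read data written with the old one, so any field added in the new version needs a default value:

```python
def backward_compatible(old_fields: dict, new_fields: dict) -> bool:
    """Toy backward-compatibility rule: every field added in the new
    schema version must declare a default, so consumers on the new
    schema can still read records written with the old schema."""
    added = set(new_fields) - set(old_fields)
    return all("default" in new_fields[name] for name in added)

payment_v1 = {"id": {"type": "string"}, "amount": {"type": "double"}}

# Adding an optional field with a default is backward compatible.
payment_v2 = dict(payment_v1, region={"type": "string", "default": ""})
print(backward_compatible(payment_v1, payment_v2))  # True

# Adding a required field (no default) is not.
payment_v3 = dict(payment_v1, region={"type": "string"})
print(backward_compatible(payment_v1, payment_v3))  # False
```

A real registry applies the full Avro schema-resolution rules and checks against every schema version retained for the subject, but the principle is the same.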