.. _sr-maven-plugin:

|sr| Maven Plugin for |cp|
--------------------------

A Maven plugin for |sr-long| is available to help throughout the development process, with configuration options as listed below.

.. tip::
    There is no official out-of-the-box Gradle plugin available for |sr|. However, you can reference any of these plugins in your Maven ``pom.xml`` (`Project Object Model file <https://maven.apache.org/guides/introduction/introduction-to-the-pom.html>`__):

    - `Imflog Kafka Schema Registry Gradle Plugin <https://plugins.gradle.org/plugin/com.github.imflog.kafka-schema-registry-gradle-plugin>`__ (the GitHub repository for this plugin is `here <https://github.com/ImFlog/schema-registry-plugin>`__)
    - `com.commercehub.gradle.plugin.avro <https://plugins.gradle.org/plugin/com.commercehub.gradle.plugin.avro>`__ (the GitHub repository for this plugin is `here <https://github.com/commercehub-oss/gradle-avro-plugin>`__)
    - The |sr| Maven plugin itself (described on this page), used in the :devx-examples:`pom.xml|clients/avro/pom.xml` example file in the |sr| tutorial.

configs for all goals
=====================

Starting with |cp| 7.0.0, the ``configs`` option is available for the |sr| Maven plugin for all goals. You can use ``configs`` to add any valid configuration to the ``CachedSchemaRegistryClient``. The syntax details are:

``configs``
    * Type: Map
    * Required: false

For example, to set up SSL for the Maven plugin, specify the keystore/truststore location by adding the following configuration to the plugin:

.. codewithvars:: xml

    <configs>
        <schema.registry.ssl.keystore.location>path-to-keystore.jks</schema.registry.ssl.keystore.location>
        <schema.registry.ssl.keystore.password>password</schema.registry.ssl.keystore.password>
        <schema.registry.ssl.truststore.location>path-to-truststore.jks</schema.registry.ssl.truststore.location>
        <schema.registry.ssl.truststore.password>password</schema.registry.ssl.truststore.password>
    </configs>

Note that the ``schema.registry.`` prefix is needed, just like other :platform:`Schema Registry client configurations for HTTPS|schema-registry/security/index.html#sr-https-additional`.

.. _schema-registry-download:

schema-registry:download
========================

The ``download`` goal is used to pull down schemas from a |sr| server. It downloads the schemas for the requested subjects and writes them to a folder on the local file system. If the ``versions`` array is empty, the latest schemas are downloaded for all subjects. If the array is populated, its length must match the length of the ``subjectPatterns`` array. This allows you to compare a new schema against a specific schema version and to retrieve older versions.

.. tip::
    In |cp| version 7.2.0-0 and later, you can specify a version to download, which better supports :ref:`schema-registry-test-local-compatibility`. Prior to |cp| 7.2.0-0, you could only download the latest version.

``schemaRegistryUrls``
    |sr| URLs to connect to.

    * Type: String[]
    * Required: true

``userInfoConfig``
    User credentials for connecting to |sr|, of the form ``user:password``. This is required if connecting to |sr-ccloud|.

    * Type: String[]
    * Required: false
    * Default: null

``outputDirectory``
    Output directory to write the schemas to.

    * Type: File
    * Required: true

``schemaExtension``
    The file extension to use for the output file name. This must begin with a ``.`` character.

    * Type: File
    * Required: false
    * Default: .avsc

``subjectPatterns``
    The subject patterns to download. This is a list of regular expressions. Patterns must match the entire subject name.

    * Type: String[]
    * Required: true

``versions``
    The schema versions to download, one entry per subject pattern. Use ``latest`` to download the most recent version of a subject. If empty, the latest schemas are downloaded for all subjects.

    * Type: String[]
    * Required: false
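With the plugin bound in your ``pom.xml`` (as in the examples below), you can invoke the goal directly from the command line. A minimal sketch, assuming the plugin is already configured:

.. code:: bash

    # Runs the download goal using the configuration in pom.xml; schemas are
    # written to <outputDirectory> with the <schemaExtension> suffix.
    mvn schema-registry:download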
Example 1: Download the latest version of a schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following code uses the plugin to download the latest version of schemas with the subject pattern ``^TestSubject000-(key|value)$``:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>|release|</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://192.168.99.100:8081</param>
            </schemaRegistryUrls>
            <outputDirectory>src/main/avro</outputDirectory>
            <subjectPatterns>
                <param>^TestSubject000-(key|value)$</param>
            </subjectPatterns>
        </configuration>
    </plugin>

Example 2: Specify a schema version to download
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following code uses the plugin to download version ``1`` of schemas with the subject pattern ``^topic1-(key|value)$``, and the latest version of schemas with the subject pattern ``^topic2-(key|value)$``:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>|release|</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://127.0.0.1:8081</param>
            </schemaRegistryUrls>
            <outputDirectory>outputDir/target/</outputDirectory>
            <subjectPatterns>
                <param>^topic1-(key|value)$</param>
                <param>^topic2-(key|value)$</param>
            </subjectPatterns>
            <versions>
                <param>1</param>
                <param>latest</param>
            </versions>
        </configuration>
    </plugin>

.. _schema-registry-set-compatibility:

schema-registry:set-compatibility
==================================

The ``schema-registry:set-compatibility`` goal is available in |cp| version 7.2.0-0 and later. It is used to update the compatibility configuration of a subject, or the global configuration, directly from the plugin. This enables you to change compatibility levels as schemas evolve, and to centralize your subject and schema management.

``schemaRegistryUrls``
    |sr| URLs to connect to.

    * Type: String[]
    * Required: true

``compatibilityLevels``
    Map of subjects to the compatibility types to be set on them.

    * Type: Map (Subject, CompatibilityLevel)
    * Required: true

    If the subject is ``NULL`` or ``__GLOBAL``, the global-level configuration is updated. If ``CompatibilityLevel`` is ``NULL``, the configuration is deleted.

The following example uses the plugin to set the compatibility of the subject ``order`` to ``BACKWARD`` and ``product`` to ``FORWARD_TRANSITIVE``, delete the compatibility of ``customer``, and change the global compatibility to ``BACKWARD_TRANSITIVE``:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>|release|</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://192.168.99.100:8081</param>
            </schemaRegistryUrls>
            <compatibilityLevels>
                <order>BACKWARD</order>
                <product>FORWARD_TRANSITIVE</product>
                <customer>null</customer>
                <__GLOBAL>BACKWARD_TRANSITIVE</__GLOBAL>
            </compatibilityLevels>
        </configuration>
    </plugin>
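To confirm the result, you can read the configuration back through the |sr| REST API. A minimal sketch, assuming a locally reachable |sr| at ``localhost:8081`` and the subject names used above:

.. code:: bash

    # Apply the compatibility settings from pom.xml, then read back the
    # per-subject and global configuration (host and subject are assumptions).
    mvn schema-registry:set-compatibility
    curl http://localhost:8081/config/order
    curl http://localhost:8081/config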
Example Usage
~~~~~~~~~~~~~

Example usage of ``schema-registry:set-compatibility``:

* Against a local |sr|, see the |sr| tutorial.
* Against |sr-ccloud|, see the :devx-examples:`GitHub example|clients/cloud/java/README.md#schema-evolution-with-confluent-cloud-schema-registry`.

.. _schema-registry-test-local-compatibility:

schema-registry:test-local-compatibility
========================================

This goal, available in |cp| version 7.2.0-0 and later, tests the compatibility of a local schema against other existing local schemas during development and testing phases.

Before the addition of ``schema-registry:test-local-compatibility``, if you wanted to check the compatibility of a new schema you had to connect to |sr|. This meant registering all the schemas for which you wanted to perform compatibility checks, which reduced the number of free schemas available to you. This goal solves that problem and supports quick, efficient compatibility testing of local schemas, as appropriate for development phases.

.. tip::
    For examples of testing schema compatibility using `GitHub Actions <https://docs.github.com/en/actions>`__, see the :ref:`maven-example-workflows` provided below, and the `kafka-github-actions demo repo <https://github.com/confluentinc/kafka-github-actions>`__.

``schemas``
    Map of schema names to the location of the schema for which the compatibility test is performed.

    * Type: Map
    * Required: true

``previousSchemaPaths``
    Map of schema names to the location of previous schemas. The compatibility test is performed for each schema against the schemas in ``previousSchemaPaths``. The location can be a directory or a file name. If it is a directory, all files directly inside the folder are added; subdirectories are ignored.

    * Type: Map
    * Required: true

``compatibilityLevels``
    Map of schema names to the compatibility type for which the check is performed. ``CompatibilityLevel`` is an enum (one of ``NONE``, ``BACKWARD``, ``BACKWARD_TRANSITIVE``, ``FORWARD``, ``FORWARD_TRANSITIVE``, ``FULL``, or ``FULL_TRANSITIVE``). For compatibility level ``BACKWARD``, ``FORWARD``, or ``FULL``, exactly one previous schema is expected per schema.

    * Type: Map
    * Required: true

``schemaTypes``
    Map of schema names to the schema type of each schema.

    * Type: Map (values one of ``AVRO`` (default), ``JSON``, ``PROTOBUF``)
    * Required: false
    * Default: AVRO

The following example uses the plugin to configure three schemas (``order``, ``product``, and ``customer``) using schema type ``AVRO``:

.. codewithvars:: xml

    <configuration>
        <schemas>
            <order>src/main/avro/order.avsc</order>
            <product>src/main/avro/product.avsc</product>
            <customer>src/main/avro/customer.avsc</customer>
        </schemas>
        <schemaTypes>
            <order>AVRO</order>
            <product>AVRO</product>
            <customer>AVRO</customer>
        </schemaTypes>
        <compatibilityLevels>
            <order>BACKWARD</order>
            <product>FORWARD</product>
            <customer>NONE</customer>
        </compatibilityLevels>
        <previousSchemaPaths>
            <order>src/main/avro/order.avsc</order>
            <product>src/main/avro/products/</product>
            <customer>src/main/avro/customer.avsc</customer>
        </previousSchemaPaths>
    </configuration>
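When ``previousSchemaPaths`` points at a directory, such as ``src/main/avro/products/`` above, only the files directly inside it are used for the check. A minimal sketch of an assumed layout and the invocation (file names are hypothetical):

.. code:: bash

    # Assumed layout: both product files are compared against the new schema;
    # anything in a subdirectory would be ignored.
    #   src/main/avro/products/product_v1.avsc
    #   src/main/avro/products/product_v2.avsc
    mvn schema-registry:test-local-compatibility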
.. _schema-registry-test-compatibility:

schema-registry:test-compatibility
==================================

This goal is used to read schemas from the local file system and test them for compatibility against the |sr| server(s). It can be used in a continuous integration pipeline to ensure that schemas in the project are compatible with the schemas in another environment.

.. tip::
    For examples of testing schema compatibility using `GitHub Actions <https://docs.github.com/en/actions>`__, see the :ref:`maven-example-workflows` provided below, and the `kafka-github-actions demo repo <https://github.com/confluentinc/kafka-github-actions>`__.

``schemaRegistryUrls``
    |sr| URLs to connect to.

    * Type: String[]
    * Required: true

``userInfoConfig``
    User credentials for connecting to |sr|, of the form ``user:password``. This is required if connecting to |sr-ccloud|.

    * Type: String[]
    * Required: false
    * Default: null

``subjects``
    Map containing subject to schema path of the subjects to be tested.

    * Type: Map
    * Required: true

    .. tip::
        Starting with |cp| 5.5.5, you can specify a slash (``/``), and other special characters, in a subject name. To do so, first URL-encode the subject name, and then replace non-valid characters in the output. For example, if you have a subject name such as ``path/to/my.proto``, the URL encoding would produce something like ``%2Fpath%2Fto%2Fmy.proto``, which you can then revise by replacing ``%`` with ``_x`` as follows: ``_x2Fpath_x2Fto_x2Fmy.proto`` (because ``%`` is not valid in an XML name). The reasoning behind this is that the Maven plugin requires subject names to be specified as XML elements, but some characters, like slashes, are not valid in an XML name. You might want to use slashes to register a `Protobuf schema <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-protobuf.html>`__ that is referenced by another schema, such as ``/path/to/my.proto`` for the subject. This workaround enables you to do that.

``schemaTypes``
    String that specifies the schema type.

    * Type: String (one of ``AVRO`` (default), ``JSON``, ``PROTOBUF``)
    * Required: false
    * Default: AVRO

``references``
    Map containing a reference name and a subject.

    * Type: Map
    * Required: false

``metadata``
    Map containing a subject and a Metadata object.

    * Type: Map
    * Required: false

``ruleSet``
    Map containing a subject and a RuleSet object.

    * Type: Map
    * Required: false

``verbose``
    Include in the output the reason a schema fails the compatibility test, in cases where it fails.

    * Type: Boolean
    * Required: false
    * Default: true

The following example uses the plugin to configure three subjects (``order``, ``product``, and ``customer``) using schema type ``AVRO``:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>|release|</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://192.168.99.100:8081</param>
            </schemaRegistryUrls>
            <subjects>
                <order>src/main/avro/order.avsc</order>
                <product>src/main/avro/product.avsc</product>
                <customer>src/main/avro/customer.avsc</customer>
            </subjects>
            <schemaTypes>
                <order>AVRO</order>
                <product>AVRO</product>
                <customer>AVRO</customer>
            </schemaTypes>
            <references>
                <order>
                    <reference>
                        <name>com.acme.Product</name>
                        <subject>product</subject>
                    </reference>
                    <reference>
                        <name>com.acme.Customer</name>
                        <subject>customer</subject>
                    </reference>
                </order>
            </references>
        </configuration>
        <goals>
            <goal>test-compatibility</goal>
        </goals>
    </plugin>
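In a continuous integration pipeline, the goal's exit status is what gates the build. A minimal sketch, assuming the plugin configuration shown above:

.. code:: bash

    # Exits non-zero if any schema fails the compatibility check against the
    # registered versions; with <verbose> at its default (true), the reason
    # for a failure is printed.
    mvn schema-registry:test-compatibility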
Example Usage
~~~~~~~~~~~~~

Example usage of ``schema-registry:test-compatibility``:

* Against a local |sr|, see the |sr| tutorial.
* Against |sr-ccloud|, see the :devx-examples:`GitHub example|clients/cloud/java/README.md#schema-evolution-with-confluent-cloud-schema-registry`.

.. _schema-registry-derive-schema:

schema-registry:derive-schema
=============================

This goal is used to automatically generate a schema (Avro, JSON Schema, or Protobuf) from a given file containing messages in JSON format. The generated schema, provided as output of the ``derive-schema`` command, can be used as is, or as a starting point for developers to build on.

The ``derive-schema`` goal takes the following three parameters (inputs).

``messagePath``
    Location of the file containing messages. Each message must be in JSON format and on a new line.

    * Type: File
    * Required: true

``outputPath``
    Location of the file to which the result is written.

    * Type: File
    * Required: true

``schemaType``
    String that specifies the schema type.

    * Type: String (one of ``AVRO`` (default), ``JSON``, ``PROTOBUF``)
    * Required: false
    * Default: AVRO

The output of the ``derive-schema`` command is a JSON document with one or more generated schemas. The output includes each schema itself, along with the messages it matched, represented by line numbers in the input file starting from 0.

.. code:: json

    {
      "schemas": [
        {
          "schema": "SCHEMA",
          "messagesMatched": [0, 3, 5]
        },
        {
          "schema": "SCHEMA",
          "messagesMatched": [1, 2, 4]
        }
      ]
    }

Depending on the schema format (type), the output varies in the number of schemas returned, whether or not the format allows for optional fields, arrays with multiple data types, and so on. These `output rules`, along with examples of input messages and resulting schemas for each format, are shown below.

Avro rules and examples
~~~~~~~~~~~~~~~~~~~~~~~

Avro output rules are as follows:

- Multiple schemas can be returned.
- Arrays are expected to have a single data type. Any message not following this raises an error.
- Avro does not support optional fields, so records cannot be combined together.
- Unions are supported.
- Arrays containing multiple data types using unions are supported.

Example Avro messages are shown below:

.. code:: json

    {"name": "Foo", "Age": {"int": 12}}
    {"name": "Bar", "Age": {"string": "12"}}
    {"sport": "Football"}

Here is an example of a generated Avro schema based on the Avro message inputs. Message 0 and Message 1 both have the field ``Age`` of type union and can be merged into one schema. Message 2 has a different field, ``sport``, which cannot be merged with the other messages, so a separate schema is generated for Message 2.

.. code:: json

    {
      "schemas" : [ {
        "schema" : {
          "type" : "record",
          "name" : "Schema",
          "fields" : [ {
            "name" : "Age",
            "type" : [ "int", "string" ]
          }, {
            "name" : "name",
            "type" : "string"
          } ]
        },
        "messagesMatched" : [ 0, 1 ]
      }, {
        "schema" : {
          "type" : "record",
          "name" : "Schema",
          "fields" : [ {
            "name" : "sport",
            "type" : "string"
          } ]
        },
        "messagesMatched" : [ 2 ]
      } ]
    }

JSON Schema rules and examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JSON Schema output rules are as follows:

- Exactly one schema is returned for all inputs.
- The data type for an array is chosen by combining the data types of all its elements.
- If there are multiple data types for the same name, they are combined using ``oneOf``. For records with the same name, all their fields are combined into one record. For arrays with the same name, their data types are combined.

Example JSON messages are shown below:

.. code:: json

    {"name": "Foo", "Age": 12}
    {"name": "Bar", "Age": "12"}
    {"sport": "Football"}

Here is an example of a generated JSON Schema based on the JSON message inputs. In JSON Schema, all fields are optional, so the field ``sport`` is merged with the other messages. For the field ``Age``, its types string and number are combined using ``oneOf``.

.. code:: json

    {
      "schemas" : [ {
        "schema" : {
          "type" : "object",
          "properties" : {
            "Age" : {
              "oneOf" : [ {
                "type" : "number"
              }, {
                "type" : "string"
              } ]
            },
            "name" : {
              "type" : "string"
            },
            "sport" : {
              "type" : "string"
            }
          }
        },
        "messagesMatched" : [ 0, 1, 2 ]
      } ]
    }

Protobuf rules and examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Protobuf output rules are as follows:

- Multiple schemas can be returned.
- Arrays are expected to have a single data type. Any message not following this raises an error.
- For records with the same name, all their fields are combined into one record.

Example Protobuf messages are shown below:

.. code:: json

    {"name": "Foo", "Age": 12}
    {"name": "Bar", "Age": "12"}
    {"sport": "Football"}

Here is an example of generated Protobuf schemas based on the Protobuf message inputs. In ``proto3``, all fields are optional, so Message 0 and Message 2 can be merged, assuming the field ``sport`` is optional. The same applies to Message 1 and Message 2. Message 0 and Message 1 have conflicting data types for the field ``Age``, so two different schemas are generated.

.. code:: json

    {
      "schemas" : [ {
        "schema" : "syntax = \"proto3\";\n\nmessage Schema {\n int32 Age = 1;\n string name = 2;\n string sport = 3;\n}\n",
        "messagesMatched" : [ 0, 2 ]
      }, {
        "schema" : "syntax = \"proto3\";\n\nmessage Schema {\n string Age = 1;\n string name = 2;\n string sport = 3;\n}\n",
        "messagesMatched" : [ 1, 2 ]
      } ]
    }

Primitive Data Types Mapping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following table shows how data types are interpreted for each schema type during schema creation.

.. csv-table::
    :header: "Actual Class", "Avro Datatype Chosen", "JSON Datatype Chosen", "Protobuf Datatype Chosen"

    "Long", "long", "number", "int64"
    "Short/Integer", "int", "number", "int32"
    "Null", "null", "null", "google.protobuf.Any"
    "BigInteger/BigDecimal", "double", "number", "double"
    "Float/Double", "double", "number", "double"
    "Boolean", "boolean", "boolean", "boolean"
    "String", "string", "string", "string"

Workflow and pom.xml example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Configure the ``pom.xml`` for a schema, including all required configuration parameters. (This example specifies Avro.)

   .. codewithvars:: xml

       <plugin>
           <groupId>io.confluent</groupId>
           <artifactId>kafka-schema-registry-maven-plugin</artifactId>
           <version>|release|</version>
           <configuration>
               <messagePath>messages/my-message-01.txt</messagePath>
               <schemaType>avro</schemaType>
               <outputPath>new-schema-01.json</outputPath>
           </configuration>
       </plugin>

#. Run the ``derive-schema`` command.

   .. code:: bash

       mvn io.confluent:kafka-schema-registry-maven-plugin:derive-schema

#. View the generated schema.
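For a quick end-to-end check of the workflow above, you can create a small input file and derive a schema from it. A minimal sketch (the directory and file contents are hypothetical, matching the ``messagePath`` configured above):

.. code:: bash

    # One JSON message per line, as messagePath requires.
    mkdir -p messages
    printf '%s\n' '{"name": "Foo", "Age": 12}' '{"sport": "Football"}' > messages/my-message-01.txt
    mvn io.confluent:kafka-schema-registry-maven-plugin:derive-schema
    cat new-schema-01.json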
schema-registry:validate
========================

This goal is used to read schemas from the local file system and validate them locally, before registering them. If you find syntax errors, you can examine and correct them before submitting schemas to |sr| with ``schema-registry:register``.

``schemaRegistryUrls``
    |sr| URLs to connect to.

    * Type: String[]
    * Required: true

``userInfoConfig``
    User credentials for connecting to |sr|, of the form ``user:password``. This is required if connecting to |sr-ccloud|.

    * Type: String[]
    * Required: false
    * Default: null

``subjects``
    Map containing subject to schema path of the subjects to be validated.

    * Type: Map
    * Required: true

``schemaTypes``
    String that specifies the schema type.

    * Type: String (one of ``AVRO`` (default), ``JSON``, ``PROTOBUF``)
    * Required: false
    * Default: AVRO

``references``
    Map containing a reference name and a subject. (The referenced schema must be registered.)

    * Type: Map
    * Required: false

``metadata``
    Map containing a subject and a Metadata object.

    * Type: Map
    * Required: false

``ruleSet``
    Map containing a subject and a RuleSet object.

    * Type: Map
    * Required: false

.. _schema-registry-register:

schema-registry:register
========================

This goal is used to read schemas from the local file system and register them on the target |sr| server(s). It can be used in a continuous deployment pipeline to push schemas to a new environment.

``schemaRegistryUrls``
    |sr| URLs to connect to.

    * Type: String[]
    * Required: true

``userInfoConfig``
    User credentials for connecting to |sr|, of the form ``user:password``. This is required if connecting to |sr-ccloud|.

    * Type: String[]
    * Required: false
    * Default: null

``subjects``
    Map containing subject to schema path of the subjects to be registered.

    * Type: Map
    * Required: true

``schemaTypes``
    String that specifies the schema type.

    * Type: String (one of ``AVRO`` (default), ``JSON``, ``PROTOBUF``)
    * Required: false
    * Default: AVRO

``normalizeSchemas``
    Normalizes schemas based on semantic equivalence during registration or lookup. To learn more, see :ref:`schema-normalization` in `Formats, Serializers, and Deserializers <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html>`__.

    * Type: Boolean
    * Required: false
    * Default: false

``references``
    Map containing a reference name and a subject.

    * Type: Map
    * Required: false

``metadata``
    Map containing a subject and a Metadata object.

    * Type: Map
    * Required: false

``ruleSet``
    Map containing a subject and a RuleSet object.

    * Type: Map
    * Required: false

The following example uses the plugin to configure three subjects (``order``, ``product``, and ``customer``) using schema type ``AVRO``:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>|release|</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://192.168.99.100:8081</param>
            </schemaRegistryUrls>
            <subjects>
                <order>src/main/avro/order.avsc</order>
                <product>src/main/avro/product.avsc</product>
                <customer>src/main/avro/customer.avsc</customer>
            </subjects>
            <schemaTypes>
                <order>AVRO</order>
                <product>AVRO</product>
                <customer>AVRO</customer>
            </schemaTypes>
            <references>
                <order>
                    <reference>
                        <name>com.acme.Product</name>
                        <subject>product</subject>
                    </reference>
                    <reference>
                        <name>com.acme.Customer</name>
                        <subject>customer</subject>
                    </reference>
                </order>
            </references>
        </configuration>
        <goals>
            <goal>register</goal>
        </goals>
    </plugin>

The following plugin example uses the ``metadata`` and ``ruleSet`` parameters to add metadata and perform checks on the data as part of registering the schema:

.. codewithvars:: xml

    <plugin>
        <groupId>io.confluent</groupId>
        <artifactId>kafka-schema-registry-maven-plugin</artifactId>
        <version>7.4.0</version>
        <configuration>
            <schemaRegistryUrls>
                <param>http://192.168.99.100:8081</param>
            </schemaRegistryUrls>
            <subjects>
                <customer>src/main/avro/customer.avsc</customer>
            </subjects>
            <schemaTypes>
                <customer>AVRO</customer>
            </schemaTypes>
            <metadata>
                <customer>
                    <tags>
                        <ssn>PII</ssn>
                        <ssn>PHI</ssn>
                    </tags>
                    <properties>
                        <owner>Bob Jones</owner>
                        <email>bob@acme.com</email>
                    </properties>
                </customer>
            </metadata>
            <ruleSet>
                <customer>
                    <rules>
                        <rule>
                            <name>checkSsnLen</name>
                            <doc>Check the SSN length.</doc>
                            <kind>CONDITION</kind>
                            <mode>WRITE</mode>
                            <type>CEL</type>
                            <tags>
                                <tag>PII</tag>
                                <tag>PHI</tag>
                            </tags>
                            <params>
                                <key1>value1</key1>
                                <key2>value2</key2>
                            </params>
                            <expr>size(message.ssn) == 9</expr>
                            <onSuccess>NONE</onSuccess>
                            <onFailure>DLQ</onFailure>
                            <disabled>false</disabled>
                        </rule>
                    </rules>
                </customer>
            </ruleSet>
        </configuration>
        <goals>
            <goal>register</goal>
        </goals>
    </plugin>
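After registering, you can confirm what was stored by reading the subject back from |sr|. A minimal sketch, assuming a locally reachable |sr| and the ``customer`` subject from the example above:

.. code:: bash

    # Register the schemas from pom.xml, then fetch the latest stored version
    # (host and subject name are assumptions).
    mvn schema-registry:register
    curl http://localhost:8081/subjects/customer/versions/latest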
.. _maven-example-workflows:

Workflows and examples
======================

You can integrate Maven plugin goals with `GitHub Actions <https://docs.github.com/en/actions>`__ into a continuous integration/continuous deployment (CI/CD) pipeline to manage schemas on |sr|. A general example for developing and validating an |ak-tm| client application with a Python producer and consumer is provided in the `kafka-github-actions demo repo <https://github.com/confluentinc/kafka-github-actions>`__.

Here is an alternative sample ``pom.xml`` with project configurations for more detailed validate and register steps.

.. codewithvars:: xml

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <groupId>io.confluent</groupId>
        <artifactId>GitHub-Actions-Demo</artifactId>
        <version>1.0</version>

        <repositories>
            <repository>
                <id>confluent</id>
                <url>https://packages.confluent.io/maven/</url>
            </repository>
        </repositories>

        <properties>
            <schemaRegistryUrl><$CONFLUENT_SCHEMA_REGISTRY_URL></schemaRegistryUrl>
            <schemaRegistryBasicAuthUserInfo><$CONFLUENT_BASIC_AUTH_USER_INFO></schemaRegistryBasicAuthUserInfo>
            <confluent.version>|release|</confluent.version>
        </properties>

        <build>
            <plugins>
                <plugin>
                    <groupId>io.confluent</groupId>
                    <artifactId>kafka-schema-registry-maven-plugin</artifactId>
                    <version>${confluent.version}</version>
                    <configuration>
                        <schemaRegistryUrls>
                            <param>${schemaRegistryUrl}</param>
                        </schemaRegistryUrls>
                        <userInfoConfig>
                            <param>${schemaRegistryBasicAuthUserInfo}</param>
                        </userInfoConfig>
                    </configuration>
                    <executions>
                        <execution>
                            <id>validate</id>
                            <phase>validate</phase>
                            <goals>
                                <goal>validate</goal>
                            </goals>
                            <configuration>
                                <subjects>
                                    <order>src/main/resources/order.avsc</order>
                                    <flight>src/main/resources/flight.proto</flight>
                                </subjects>
                                <schemaTypes>
                                    <flight>PROTOBUF</flight>
                                </schemaTypes>
                            </configuration>
                        </execution>
                        <execution>
                            <id>set-compatibility</id>
                            <phase>validate</phase>
                            <goals>
                                <goal>set-compatibility</goal>
                            </goals>
                            <configuration>
                                <compatibilityLevels>
                                    <order>FORWARD_TRANSITIVE</order>
                                    <flight>FORWARD_TRANSITIVE</flight>
                                </compatibilityLevels>
                            </configuration>
                        </execution>
                        <execution>
                            <id>test-local</id>
                            <phase>validate</phase>
                            <goals>
                                <goal>test-local-compatibility</goal>
                            </goals>
                            <configuration>
                                <schemas>
                                    <order>src/main/resources/order.avsc</order>
                                    <flight>src/main/resources/flight.proto</flight>
                                </schemas>
                                <schemaTypes>
                                    <flight>PROTOBUF</flight>
                                </schemaTypes>
                                <previousSchemaPaths>
                                    <flight>src/main/resources/flightSchemas</flight>
                                    <order>src/main/resources/orderSchemas</order>
                                </previousSchemaPaths>
                                <compatibilityLevels>
                                    <order>FORWARD_TRANSITIVE</order>
                                    <flight>FORWARD_TRANSITIVE</flight>
                                </compatibilityLevels>
                            </configuration>
                        </execution>
                        <execution>
                            <id>test-compatibility</id>
                            <phase>validate</phase>
                            <goals>
                                <goal>test-compatibility</goal>
                            </goals>
                            <configuration>
                                <subjects>
                                    <order>src/main/resources/order.avsc</order>
                                    <flight>src/main/resources/flight.proto</flight>
                                </subjects>
                                <schemaTypes>
                                    <flight>PROTOBUF</flight>
                                </schemaTypes>
                            </configuration>
                        </execution>
                        <execution>
                            <id>register</id>
                            <goals>
                                <goal>register</goal>
                            </goals>
                            <configuration>
                                <subjects>
                                    <order>src/main/resources/order.avsc</order>
                                    <flight>src/main/resources/flight.proto</flight>
                                </subjects>
                                <schemaTypes>
                                    <flight>PROTOBUF</flight>
                                </schemaTypes>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>
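The ``pom.xml`` above leaves ``<$CONFLUENT_SCHEMA_REGISTRY_URL>`` and ``<$CONFLUENT_BASIC_AUTH_USER_INFO>`` as placeholders. One hypothetical way a CI job could fill them in before calling Maven is a simple substitution from environment variables or repository secrets:

.. code:: bash

    # Hypothetical CI step: replace the placeholders with real values held in
    # environment variables, then run the validate-phase goals.
    sed -i "s|<\$CONFLUENT_SCHEMA_REGISTRY_URL>|$SCHEMA_REGISTRY_URL|" pom.xml
    sed -i "s|<\$CONFLUENT_BASIC_AUTH_USER_INFO>|$BASIC_AUTH_USER_INFO|" pom.xml
    mvn validate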
The following workflows can be coded as GitHub Actions to accomplish CI/CD for schema management.

#. When a pull request is created to merge a new schema to master, validate the schema, check local schema compatibility, set the compatibility of the subject, and test schema compatibility with the subject.

   .. sourcecode:: yaml

       run: mvn validate

   The validate step would include:

   .. code:: bash

       mvn schema-registry:validate@validate
       mvn schema-registry:test-local-compatibility@test-local
       mvn schema-registry:set-compatibility@set-compatibility
       mvn schema-registry:test-compatibility@test-compatibility

   Integrated with GitHub Actions, the ``pull-request.yaml`` for this step might look like this:

   .. codewithvars:: yaml

       name: Testing branch for compatibility before merging

       on:
         pull_request:
           branches: [ master ]
           paths: [src/main/resources/*]

       jobs:
         validate:
           runs-on: ubuntu-latest
           steps:
             - uses: actions/checkout@v2
             - uses: actions/setup-java@v2
               with:
                 java-version: '11'
                 distribution: 'temurin'
                 cache: maven
             - name: Validate if schema is valid
               run: mvn schema-registry:validate@validate

         test-local-compatibility:
           needs: validate
           runs-on: ubuntu-latest
           steps:
             - uses: actions/checkout@v2
             - uses: actions/setup-java@v2
               with:
                 java-version: '11'
                 distribution: 'temurin'
                 cache: maven
             - name: Test schema with locally present schema
               run: mvn schema-registry:test-local-compatibility@test-local

         set-compatibility:
           needs: test-local-compatibility
           runs-on: ubuntu-latest
           steps:
             - uses: actions/checkout@v2
             - uses: actions/setup-java@v2
               with:
                 java-version: '11'
                 distribution: 'temurin'
                 cache: maven
             - name: Set compatibility of subject
               run: mvn schema-registry:set-compatibility@set-compatibility

         test-compatibility:
           needs: set-compatibility
           runs-on: ubuntu-latest
           steps:
             - uses: actions/checkout@v2
             - uses: actions/setup-java@v2
               with:
                 java-version: '11'
                 distribution: 'temurin'
                 cache: maven
             - name: Test schema with subject
               run: mvn schema-registry:test-compatibility@test-compatibility

   If compatibility checking passes, a new pull request is created for approval.

#. Register the schema when a pull request is approved and merged to master. Run the action to register the new schema on |sr|:

   .. code:: bash

       run: mvn schema-registry:register@register

   The ``push.yaml`` for this step would look like this:

   .. codewithvars:: yaml

       name: Registering Schema on merge of pull request

       on:
         push:
           branches: [ master ]
           paths: [src/main/resources/*]

       jobs:
         register-schema:
           runs-on: ubuntu-latest
           steps:
             - uses: actions/checkout@v2
             - uses: actions/setup-java@v2
               with:
                 java-version: '11'
                 distribution: 'temurin'
                 cache: maven
             - name: Register Schema
               run: mvn io.confluent:kafka-schema-registry-maven-plugin:register@register

Related content
===============

- Blog post: *4 Must-Have Tests for Your Apache Kafka CI/CD with GitHub Actions* on the Confluent blog
- `Formats, Serializers, and Deserializers <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html>`__
- `Avro Schema <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-avro.html>`__
- `JSON Schema <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-json.html>`__
- `Protobuf schema <https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-protobuf.html>`__