Secure Schema Registry for Confluent Platform

This page describes security considerations, configuration, and management details for Schema Registry on Confluent Platform.

Features

Confluent Schema Registry currently supports all Kafka security features.

For configuration details, check the configuration options.

See also

For a configuration example in which a security-enabled Schema Registry connects to a secure Kafka cluster, see the Confluent Platform demo.

Schema Registry to Kafka Cluster

Kafka Store

Kafka is used as the Schema Registry storage backend. The special Kafka topic <kafkastore.topic> (default _schemas), with a single partition, is used as a highly available write-ahead log. All schemas, subject/version and ID metadata, and compatibility settings are appended as messages to this log. A Schema Registry instance therefore both produces messages to and consumes messages from the _schemas topic. It produces messages to the log when, for example, new schemas are registered under a subject or compatibility settings are updated. Schema Registry consumes from the _schemas log in a background thread and updates its local caches as each new _schemas message arrives, to reflect the newly added schema or compatibility setting. Updating local state from the Kafka log in this manner ensures durability, ordering, and easy recoverability.
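
The replay behavior described above can be sketched with a small in-memory model. This is an illustration only, not Schema Registry's implementation: a List stands in for the _schemas topic, and the class name, keys, and values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: a List stands in for the single-partition _schemas topic.
final class SchemaLogReplay {

    // Replays entries in log order. A later entry for the same key replaces the
    // earlier one, so the cache ends up holding the latest value per key.
    static Map<String, String> replay(List<Map.Entry<String, String>> log) {
        Map<String, String> cache = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : log) {
            cache.put(entry.getKey(), entry.getValue());
        }
        return cache;
    }

    public static void main(String[] args) {
        // Hypothetical keys and values; the real message format is not shown here.
        List<Map.Entry<String, String>> log = List.of(
            Map.entry("schema/orders-value/1", "{\"type\":\"string\"}"),
            Map.entry("config/orders-value", "BACKWARD"),
            Map.entry("config/orders-value", "FULL"));
        System.out.println(replay(log));
    }
}
```

Because there is a single partition, every instance that replays the log sees the same order and therefore arrives at the same local state.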

Tip

The Schema Registry topic is compacted and therefore the latest value of every key is retained forever, regardless of the Kafka retention policy. You can validate this with kafka-configs:

kafka-configs --bootstrap-server localhost:9092 --entity-type topics --entity-name _schemas --describe

Your output should resemble:

Configs for topic '_schemas' are cleanup.policy=compact

All Kafka security features are supported by Schema Registry.

Relatively few services need access to Schema Registry, and they are likely internal, so you can restrict access to the Schema Registry itself via firewall rules and/or network segmentation.

ZooKeeper

Important

Schema Registry supports both unauthenticated access and SASL authentication to ZooKeeper.

Setting up ZooKeeper SASL authentication for Schema Registry is similar to Kafka’s setup. Namely, create a keytab for Schema Registry, create a JAAS configuration file, and set the appropriate JAAS Java properties.

In addition to the keytab and JAAS setup, be aware of the zookeeper.set.acl setting. When set to true, this setting enables ZooKeeper ACLs, which limit access to znodes.

Important: If zookeeper.set.acl is set to true, Schema Registry’s service name must be the same as Kafka’s, which is kafka by default. Otherwise, Schema Registry fails to create the _schemas topic, which causes a “leader not available” error in the DEBUG log. When Schema Registry sets ZooKeeper ACLs but Kafka does not, the Schema Registry log shows org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata. You can set Schema Registry’s service name either with kafkastore.sasl.kerberos.service.name or in the JAAS file.

If Schema Registry has a different service name than Kafka, zookeeper.set.acl must be set to false in both Schema Registry and Kafka.

Clients to Schema Registry

Configuring the REST API for HTTP or HTTPS

By default, Schema Registry allows clients to make REST API calls over HTTP. You can configure Schema Registry to allow HTTP, HTTPS, or both at the same time.

The following configuration determines the protocol used by Schema Registry:

listeners

Comma-separated list of listeners that listen for API requests over HTTP or HTTPS or both. If a listener uses HTTPS, the appropriate TLS/SSL configuration parameters need to be set as well.

On the clients, configure schema.registry.url to match the configured Schema Registry listener.
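
As a rough aid for checking which protocols a given listeners value enables, and therefore which scheme schema.registry.url may use, the following sketch splits the comma-separated list and collects the URI schemes. The class and method names are hypothetical, and this is not how Schema Registry itself parses the setting.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: reports the schemes present in a listeners value.
final class ListenerSchemes {

    static Set<String> schemes(String listeners) {
        Set<String> result = new LinkedHashSet<>();
        for (String listener : listeners.split(",")) {
            String trimmed = listener.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            String scheme = URI.create(trimmed).getScheme();
            if (scheme == null) {
                throw new IllegalArgumentException("Listener has no scheme: " + trimmed);
            }
            result.add(scheme.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        // Example value with both protocols enabled on different ports.
        System.out.println(schemes("http://0.0.0.0:8081,https://0.0.0.0:8082"));
    }
}
```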

Additional configurations for HTTPS

If you configure an HTTPS listener, there are several additional configurations for Schema Registry.

First, configure the appropriate TLS/SSL configurations for the keystore and optionally truststore for the Schema Registry cluster (for example, in schema-registry.properties). The truststore is required only when ssl.client.auth is set to true.

ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks
ssl.truststore.password=<password>
ssl.keystore.location=/etc/kafka/secrets/kafka.client.keystore.jks
ssl.keystore.password=<password>
ssl.key.password=<password>

You can specify which protocol Schema Registry instances use for calls between each other with the following configuration:

inter.instance.protocol

The protocol used while making calls between the instances of Schema Registry. The secondary to primary node calls for writes and deletes will use the specified protocol. The default value is http. When https is set, the ssl.keystore.* and ssl.truststore.* configs are used while making the call. The schema.registry.inter.instance.protocol name is deprecated; use inter.instance.protocol instead.

  • Type: string
  • Default: “http”
  • Importance: low

Starting with Confluent Platform 5.4, dedicated Schema Registry client configuration properties are available, as shown in the example.

Tip

Clients to Schema Registry include both:

  • Client applications created or used by developers.
  • Confluent Platform components such as Confluent Control Center, Kafka Connect, ksqlDB, and so forth.

To configure clients to use HTTPS to Schema Registry, set the following properties or environment variables:

  1. On the client, configure the schema.registry.url to match the configured listener for HTTPS.

    <client>.schema.registry.url: "<schema-registry-url>:<port>"
    
  2. On the client, configure the TLS/SSL keystore and truststore in one of two ways:

    • (Recommended) Use the Schema Registry dedicated properties to configure the client:

      <client>.schema.registry.ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks
      <client>.schema.registry.ssl.truststore.password=<password>
      <client>.schema.registry.ssl.keystore.location=/etc/kafka/secrets/kafka.client.keystore.jks
      <client>.schema.registry.ssl.keystore.password=<password>
      <client>.schema.registry.ssl.key.password=<password>
      

      The naming conventions for Confluent Control Center configuration differ slightly from the other clients. To configure Control Center as an HTTPS client to Schema Registry, specify these dedicated properties in the Control Center config file:

      confluent.controlcenter.schema.registry.schema.registry.ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks
      confluent.controlcenter.schema.registry.schema.registry.ssl.truststore.password=<password>
      confluent.controlcenter.schema.registry.schema.registry.ssl.keystore.location=/etc/kafka/secrets/kafka.client.keystore.jks
      confluent.controlcenter.schema.registry.schema.registry.ssl.keystore.password=<password>
      confluent.controlcenter.schema.registry.schema.registry.ssl.key.password=<password>
      

      See also

      Configure TLS proxy server access to Schema Registry under Configuring TLS/SSL for Control Center provides a detailed explanation of the naming conventions used in this configuration.

    • (Legacy, on client) Set environment variables depending on the client (one of KAFKA_OPTS, SCHEMA_REGISTRY_OPTS, KSQL_OPTS):

      # Use KAFKA_OPTS, SCHEMA_REGISTRY_OPTS, or KSQL_OPTS, depending on the client.
      export KAFKA_OPTS="-Djavax.net.ssl.trustStore=/etc/kafka/secrets/kafka.client.truststore.jks \
                  -Djavax.net.ssl.trustStorePassword=<password> \
                  -Djavax.net.ssl.keyStore=/etc/kafka/secrets/kafka.client.keystore.jks \
                  -Djavax.net.ssl.keyStorePassword=<password>"
      

Important

  • If you use the legacy method of defining TLS/SSL values in system environment variables, the TLS/SSL settings apply to every Java component running on this JVM. For example, on Connect, every connector uses the given truststore. Consider a scenario where you are using an Amazon Web Services (AWS) connector such as S3 or Kinesis, and do not have the AWS certificate chain in the given truststore. The connector will fail with the following error:

    com.amazonaws.SdkClientException: Unable to execute HTTP request:
    sun.security.validator.ValidatorException: PKIX path building failed
    

    This does not apply if you use the dedicated Schema Registry client configurations.

  • For the kafka-avro-console-producer and kafka-avro-console-consumer, you must pass the Schema Registry properties on the command line. Here is an example for the producer:

    ./kafka-avro-console-producer --broker-list localhost:9093 --topic myTopic \
    --producer.config ~/etc/kafka/producer.properties \
    --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' \
    --property schema.registry.url=https://localhost:8081 \
    --property schema.registry.ssl.truststore.location=/etc/kafka/security/schema.registry.client.truststore.jks \
    --property schema.registry.ssl.truststore.password=myTrustStorePassword
    

    For more examples of using the producer and consumer command line utilities, see Test drive Avro schema, Test drive JSON Schema, Test drive Protobuf schema, and the demo in Validate Schemas Broker-side in Confluent Platform.

See also

To learn more, see these demos and examples: Scripted Confluent Platform Demo, Kafka Client Application Examples, and Using Schema Registry over HTTPS in the API Usage Examples.
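
For a Java client, the dedicated properties can be assembled in code. The following is a minimal sketch, assuming a serializer that reads the schema.registry.ssl.* keys without a component prefix (as in the console producer example above). The class name, method name, and the password value in main are placeholders, and only the truststore settings are shown.

```java
import java.util.Properties;

// Hypothetical helper that collects the HTTPS-related client settings.
final class SchemaRegistryHttpsClientConfig {

    static Properties httpsProps(String url, String truststorePath, String truststorePassword) {
        if (!url.startsWith("https://")) {
            throw new IllegalArgumentException("Expected an https:// URL but got: " + url);
        }
        Properties props = new Properties();
        // Must match the HTTPS listener configured on Schema Registry.
        props.setProperty("schema.registry.url", url);
        // Dedicated properties: these affect only the Schema Registry client,
        // not every Java component running on the JVM.
        props.setProperty("schema.registry.ssl.truststore.location", truststorePath);
        props.setProperty("schema.registry.ssl.truststore.password", truststorePassword);
        return props;
    }

    public static void main(String[] args) {
        Properties props = httpsProps(
            "https://localhost:8081",
            "/etc/kafka/secrets/kafka.client.truststore.jks",
            "changeit"); // placeholder; load real secrets from a secure source
        props.stringPropertyNames().stream().sorted().forEach(System.out::println);
    }
}
```

Keystore settings (schema.registry.ssl.keystore.location and related properties) would be added the same way when Schema Registry requires client certificates.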

Migrating from HTTP to HTTPS

To upgrade Schema Registry to allow REST API calls over HTTPS in an existing cluster:

  • Add/Modify the listeners config to include HTTPS. For example: http://0.0.0.0:8081,https://0.0.0.0:8082
  • Configure Schema Registry with the appropriate TLS/SSL configurations to set up the keystore and, optionally, the truststore
  • Do a rolling bounce of the cluster

This process enables HTTPS, but still defaults to HTTP so Schema Registry instances can still communicate before all nodes have been restarted. They will continue to use HTTP as the default until configured not to. To switch to HTTPS as the default and disable HTTP support, perform the following steps:

  • Enable HTTPS as described in the steps above (both HTTP and HTTPS will be enabled)
  • Configure inter.instance.protocol to https in all the nodes
  • Do a rolling bounce of the cluster
  • Remove http listener from the listeners in all the nodes
  • Do a rolling bounce of the cluster

Configuring the REST API for Basic HTTP Authentication

Schema Registry can be configured to require users to authenticate using a username and password via the Basic HTTP authentication mechanism.

Note

If you’re using Basic authentication, we recommend that you configure Schema Registry to use HTTPS for secure communication, because the Basic protocol passes credentials in plain text.

Use the following settings to configure Schema Registry to require authentication:

authentication.method=BASIC
authentication.roles=<user-role1>,<user-role2>,...
authentication.realm=<section-in-jaas_config.conf>

The authentication.roles configuration defines a comma-separated list of user roles. To be authorized to access Schema Registry, an authenticated user must belong to at least one of these roles.

For example, if you define admin, developer, user, and sr-user roles, the following configuration assigns them for authentication:

authentication.roles=admin,developer,user,sr-user

The authentication.realm configuration must match a section within jaas_config.conf, which defines how the server authenticates users and should be passed as a JVM option during server start:

export SCHEMA_REGISTRY_OPTS=-Djava.security.auth.login.config=/path/to/the/jaas_config.conf
schema-registry-start ${CONFLUENT_HOME}/etc/schema-registry/schema-registry.properties

An example jaas_config.conf is:

SchemaRegistry-Props {
  org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required
  file="/path/to/password-file"
  debug="false";
};

Assign the SchemaRegistry-Props section to the authentication.realm configuration setting:

authentication.realm=SchemaRegistry-Props

The example jaas_config.conf above uses the Jetty PropertyFileLoginModule, which authenticates users by checking for their credentials in a password file.

You can also use other implementations of the standard Java LoginModule interface, such as the LdapLoginModule, or the JDBCLoginModule for reading credentials from a database.

The file parameter is the location of the password file. The format is:

<username>: <password-hash>,<role1>[,<role2>,...]

Here’s an example:

fred: OBF:1w8t1tvf1w261w8v1w1c1tvn1w8x,user,admin
barney: changeme,user,developer
betty: MD5:164c88b302622e17050af52c89945d44,user
wilma: CRYPT:adpexzg3FUZAk,admin,sr-user
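
To make the line format and the "at least one role" rule concrete, here is a small sketch that parses one password-file line and checks it against a list of allowed roles. It is not Jetty's parser: the class and method names are hypothetical, and it assumes the credential itself contains no comma.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical model of one line: <username>: <password-hash>,<role1>[,<role2>,...]
final class PasswordFileEntry {
    final String user;
    final String credential;
    final List<String> roles;

    private PasswordFileEntry(String user, String credential, List<String> roles) {
        this.user = user;
        this.credential = credential;
        this.roles = roles;
    }

    static PasswordFileEntry parse(String line) {
        // Split at the FIRST colon only; credentials such as "MD5:..." contain one too.
        int colon = line.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing ':' after username: " + line);
        }
        String user = line.substring(0, colon).trim();
        String[] fields = line.substring(colon + 1).trim().split(",");
        List<String> roles = new ArrayList<>();
        for (int i = 1; i < fields.length; i++) {
            roles.add(fields[i].trim());
        }
        return new PasswordFileEntry(user, fields[0].trim(), Collections.unmodifiableList(roles));
    }

    // True when the user holds at least one of the roles listed in authentication.roles.
    boolean hasAnyRole(Collection<String> allowedRoles) {
        return !Collections.disjoint(roles, allowedRoles);
    }

    public static void main(String[] args) {
        PasswordFileEntry wilma = parse("wilma: CRYPT:adpexzg3FUZAk,admin,sr-user");
        System.out.println(wilma.user + " " + wilma.roles + " " + wilma.hasAnyRole(List.of("sr-user")));
    }
}
```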

Get the password hash for a user by using the org.eclipse.jetty.util.security.Password utility:

schema-registry-run-class org.eclipse.jetty.util.security.Password fred letmein

Your output should resemble:

letmein
OBF:1w8t1tvf1w261w8v1w1c1tvn1w8x
MD5:0d107d09f5bbe40cade3de5c71e9e9b7
CRYPT:frd5btY/mvXo6

Each line of the output is the password encoded with a different mechanism, starting with plain text.
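
To see where the MD5: line comes from, the sketch below recomputes it with the JDK alone. It assumes that the value is the lowercase hex MD5 digest of the password, which is consistent with the sample output above; the class and method names are hypothetical. The OBF and CRYPT forms use other algorithms and are not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: builds an "MD5:<hex digest>" credential string.
final class Md5Credential {

    static String md5Credential(String password) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(password.getBytes(StandardCharsets.ISO_8859_1));
            StringBuilder out = new StringBuilder("MD5:");
            for (byte b : digest) {
                out.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return out.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Credential("letmein"));
    }
}
```

For the password letmein, the result should match the MD5: line shown in the sample output.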

Once Schema Registry is configured to use Basic authentication, clients must be configured with suitable valid credentials, for example:

basic.auth.credentials.source=USER_INFO
basic.auth.user.info=fred:letmein
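
With Basic authentication, the user info ends up in an HTTP Authorization header, Base64-encoded rather than encrypted. Client libraries build this header internally; the sketch below, with hypothetical class and method names, only shows what goes over the wire and why HTTPS is recommended.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: turns "user:password" into a Basic Authorization header value.
final class BasicAuthHeader {

    static String fromUserInfo(String userInfo) {
        if (userInfo.indexOf(':') < 0) {
            throw new IllegalArgumentException("Expected <user>:<password>");
        }
        byte[] raw = userInfo.getBytes(StandardCharsets.UTF_8);
        // Base64 is an encoding, not encryption: anyone who sees the header can decode it.
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        String header = fromUserInfo("fred:letmein");
        System.out.println("Authorization: " + header);
        // Decoding recovers the credentials, which is why plain HTTP is unsafe here.
        String decoded = new String(
            Base64.getDecoder().decode(header.substring("Basic ".length())),
            StandardCharsets.UTF_8);
        System.out.println(decoded);
    }
}
```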

Tip

The schema.registry prefixed versions of these properties were deprecated in Confluent Platform 5.0.

  • schema.registry.basic.auth.credentials.source is deprecated.
  • schema.registry.basic.auth.user.info is deprecated.

Governance

To provide data governance with the Confluent Schema Registry:

  1. Disable auto schema registration.
  2. Restrict access to the _schemas topic.
  3. Restrict access to Schema Registry operations.

Disabling Auto Schema Registration

By default, client applications automatically register new schemas. If they produce new messages to a new topic, then they will automatically try to register new schemas. This is convenient in development environments, but in production environments it’s recommended that client applications do not automatically register new schemas. Best practice is to register schemas outside of the client application to control when schemas are registered with Schema Registry and how they evolve.

Within the application, you can disable automatic schema registration by setting the configuration parameter auto.register.schemas=false, as shown in the following example.

props.put(AbstractKafkaAvroSerDeConfig.AUTO_REGISTER_SCHEMAS, false);

Tip

  • To enable use.latest.version for producers, set auto.register.schemas to false and use.latest.version to true (the opposite of their defaults). The use.latest.version option works only when auto.register.schemas is false.

    Setting auto.register.schemas to false disables auto-registration of the event type, so that it does not override the latest schema in the subject. Setting use.latest.version to true causes the serializer to look up the latest schema version in the subject and use that for serialization. If use.latest.version is set to false (which is the default), the serializer will look for the event type in the subject and fail to find it.

    You can also set use.latest.version on consumers, which causes the consumer to look up the latest schema version in the subject and use that as the target schema during deserialization.

  • See also Schema Registry Configuration Options for Kafka Connect.

  • The configuration option auto.register.schemas is a Confluent Platform feature and is not available in Apache Kafka®.

Once a client application disables automatic schema registration, it will no longer be able to dynamically register new schemas from within the application. However, it will still be able to retrieve existing schemas from the Schema Registry, assuming proper authorization.
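
The two serializer settings discussed in this section can be set together. The following sketch uses the plain string keys from the text so that it compiles without the Confluent serializer on the classpath; the class and method names are hypothetical, and the remaining producer settings (bootstrap servers, serializers, schema.registry.url) are omitted.

```java
import java.util.Properties;

// Hypothetical helper: serializer settings for a producer that must not register schemas.
final class GovernedProducerConfig {

    static Properties serializerProps() {
        Properties props = new Properties();
        // Do not register schemas from the application.
        props.setProperty("auto.register.schemas", "false");
        // Serialize with the latest schema version already registered under the subject.
        props.setProperty("use.latest.version", "true");
        return props;
    }

    public static void main(String[] args) {
        serializerProps().stringPropertyNames().stream().sorted().forEach(System.out::println);
    }
}
```

Setting use.latest.version is optional. A producer that only disables auto-registration would leave it at its default of false.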

Authorizing Access to the Schemas Topic

If you enable Kafka authorization, you must grant the Schema Registry service principal the ability to perform the following operations on the specified resources:

  • Read and Write access to the internal _schemas topic. This ensures that only authorized users can make changes to the topic.
  • DescribeConfigs on the _schemas topic, to verify that the topic exists.
  • Describe on the _schemas topic, which lets the Schema Registry service principal list the topic.
  • DescribeConfigs on the internal consumer offsets topic.
  • Access to the Schema Registry cluster (group).
  • Create permissions on the Kafka cluster.

export KAFKA_OPTS="-Djava.security.auth.login.config=<path to JAAS conf file>"

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --producer --consumer --topic _schemas --group schema-registry

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation DescribeConfigs --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Describe --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Read --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Write --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Describe --topic __consumer_offsets

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Create --cluster kafka-cluster

If you are using the Schema Registry ACL Authorizer for Confluent Platform, you also need permissions to Read, Write, and DescribeConfigs on the internal _schemas_acl topic:

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --producer --consumer --topic _schemas_acl --group schema-registry

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Read --topic _schemas_acl

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation Write --topic _schemas_acl

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \
               --allow-principal 'User:<sr-principal>' --allow-host '*' \
               --operation DescribeConfigs --topic _schemas_acl

Tip

  • The group, which serves as the Schema Registry cluster ID, defaults to schema-registry. To customize, specify a value for schema.registry.group.id.
  • The internal topic that holds schemas defaults to topic name _schemas. To customize, specify a value for kafkastore.topic.
  • The internal topic that holds ACL schemas defaults to topic name _schemas_acl (or, {{kafkastore.topic}}_acl). To customize, specify a value for confluent.schema.registry.acl.topic.

Note

  • Removing world-level permissions: In previous versions of Schema Registry, we recommended making the _schemas topic world readable and writable. Now that Schema Registry supports SASL, the world-level permissions can be dropped.

Authorizing Schema Registry Operations with the Security Plugin

The Schema Registry security plugin provides authorization for various Schema Registry operations. It authenticates incoming requests and authorizes them via the configured authorizer. This allows schema evolution management to be restricted to administrative users, while application users get read-only access.