Use Client-Side Field Level Encryption in Confluent Cloud

For steps on using Confluent Cloud Console to configure client-side field level encryption, see Manage Client-Side Field Level Encryption using Confluent Cloud Console.

Client-side field level encryption (CSFLE) uses a technique called envelope encryption, where a key encryption key (KEK) is used to encrypt data encryption keys (DEKs), which are the actual keys used to encrypt fields. The KEK is typically a key from an external Key Management System (KMS), such as AWS KMS, Azure Key Vault, or Google Cloud KMS.
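The envelope encryption idea described above can be illustrated with a minimal, self-contained sketch that uses only the JDK. This is not the Confluent client implementation: the KEK here is a locally generated AES key standing in for a key held by an external KMS, and the class and method names (EnvelopeEncryptionSketch, roundTrip) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class EnvelopeEncryptionSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int IV_BYTES = 12;   // commonly used IV length for AES-GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    static SecretKey newAes256Key() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Encrypts with AES-GCM and prepends the random IV to the ciphertext.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] out = new byte[IV_BYTES + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, IV_BYTES);
        System.arraycopy(ciphertext, 0, out, IV_BYTES, ciphertext.length);
        return out;
    }

    // Reads the IV from the first bytes, then decrypts and authenticates the rest.
    static byte[] decrypt(SecretKey key, byte[] ivAndCiphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, ivAndCiphertext, 0, IV_BYTES));
        return cipher.doFinal(ivAndCiphertext, IV_BYTES, ivAndCiphertext.length - IV_BYTES);
    }

    static String roundTrip(String fieldValue) {
        try {
            SecretKey kek = newAes256Key(); // assumption: stands in for the KMS-held KEK
            SecretKey dek = newAes256Key(); // the key that actually encrypts the field

            // Producer side: encrypt the field with the DEK, then wrap the DEK with the KEK.
            byte[] encryptedField = encrypt(dek, fieldValue.getBytes(StandardCharsets.UTF_8));
            byte[] encryptedDek = encrypt(kek, dek.getEncoded());

            // Consumer side: unwrap the DEK with the KEK, then decrypt the field.
            SecretKey unwrappedDek = new SecretKeySpec(decrypt(kek, encryptedDek), "AES");
            return new String(decrypt(unwrappedDek, encryptedField), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("123-45-6789")); // prints 123-45-6789
    }
}
```

Only the encrypted DEK needs to be stored alongside the data. The KEK never leaves the party that holds it, which is why access to the KMS (by either the client or the DEK Registry) is required to decrypt.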

Schema Registry exposes a subcomponent called the DEK Registry, which provides APIs for managing KEKs and DEKs. The key from the KMS is registered as a KEK to the DEK Registry, and the DEK Registry will also hold the encrypted DEKs used for CSFLE. DEKs are scoped by subject.

In Confluent Cloud, for a given KEK, the DEK Registry can optionally be given permission to make direct calls to the KMS in order to encrypt and decrypt DEKs. The main advantage of granting this permission is to enable other Confluent Cloud offerings, such as Flink and ksqlDB, to process data that was previously encrypted by CSFLE.

If permission is granted, then the DEK Registry will generate the DEK, encrypt it, and make the decrypted DEK available to clients that have the proper authorization. If permission is not granted, then the client will typically generate the DEK, encrypt it, and store it to the DEK Registry.

At a high level, CSFLE requires the following steps:

  • Creating a KEK in an external KMS and registering the KEK to the DEK Registry.
  • Specifying tags on fields.
  • Declaring rules that specify which tags should be encrypted.
  • Either authorizing the DEK Registry to access the KMS in Confluent Cloud, or configuring the client to access the KMS.

CSFLE works with the latest versions of the existing Java serializers and deserializers for Avro, JSON Schema, and Protobuf.

Requirements

To use client-side field level encryption (CSFLE) in Confluent Cloud, the following requirements must be met:

Configure client-side field level encryption

The following steps show how to configure client-side field level encryption.

Step 1 - Register KEKs and DEKs

The first step is to create a KEK in an external KMS, such as AWS KMS, Azure Key Vault, or Google Cloud KMS. Before using CSFLE, you can register the KEK with the DEK Registry using Confluent Cloud Console, Confluent CLI, or REST APIs. If the KEK is not registered beforehand, the client can register it on demand, provided that the client has the appropriate permissions with the DEK Registry.

KEK parameters

A KEK registered to the DEK Registry has the following parameters:

Parameter          Description
name               A meaningful name for the KEK. The name is used when referring to the KEK elsewhere, such as in RBAC.
kmsType            The type of KMS, typically one of "aws-kms", "azure-kms", or "gcp-kms".
kmsKeyId           The key identifier for the KEK. When using AWS KMS, this is the ARN.
kmsProps           Additional key-value properties used to access the KMS.
doc (optional)     A meaningful description for the KEK.
shared (optional)  Whether the DEK Registry has shared access to the KMS.
ts (optional)      The timestamp indicating when the KEK was registered or updated.
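As an illustration of how these parameters fit together, the following sketch assembles a JSON document from the KEK fields listed above. The field names come from the table. The exact shape of the REST request that the DEK Registry expects is an assumption here, and the class name, KEK name, and ARN are hypothetical placeholders.

```java
final class KekPayloadSketch {
    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a JSON object from the required KEK parameters plus the optional "shared" flag.
    static String toJson(String name, String kmsType, String kmsKeyId, boolean shared) {
        return "{"
                + "\"name\":" + quote(name) + ","
                + "\"kmsType\":" + quote(kmsType) + ","
                + "\"kmsKeyId\":" + quote(kmsKeyId) + ","
                + "\"shared\":" + shared
                + "}";
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        System.out.println(toJson(
                "my-kek",
                "aws-kms",
                "arn:aws:kms:us-west-2:111122223333:key/example-key-id",
                true));
    }
}
```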

DEK parameters

During encryption, the client asks the DEK Registry for an existing DEK for a specified KEK name and subject (when using automatic DEK rotation, the version is also required). If a DEK does not exist, then depending on whether the DEK Registry has access to the KMS, either the DEK Registry or the client will generate and encrypt the DEK, and then register it with the DEK Registry. If the DEK Registry generates the DEK, the decrypted DEK is sent to the client.
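The lookup-or-create flow described above can be modeled roughly as follows. This is an in-memory toy, not the DEK Registry or client code: the class, the map-based store, and the GeneratedBy marker are hypothetical and only show which party generates a missing DEK.

```java
import java.util.HashMap;
import java.util.Map;

final class DekLookupSketch {
    enum GeneratedBy { DEK_REGISTRY, CLIENT }

    // Hypothetical in-memory stand-in for the DEK Registry, keyed by KEK name, subject, and version.
    private final Map<String, GeneratedBy> deks = new HashMap<>();
    private final boolean registryHasKmsAccess;

    DekLookupSketch(boolean registryHasKmsAccess) {
        this.registryHasKmsAccess = registryHasKmsAccess;
    }

    // Returns the existing DEK entry, or registers a new one if none exists yet.
    GeneratedBy getOrCreate(String kekName, String subject, int version) {
        String id = kekName + "|" + subject + "|" + version;
        return deks.computeIfAbsent(id,
                k -> registryHasKmsAccess ? GeneratedBy.DEK_REGISTRY : GeneratedBy.CLIENT);
    }

    int size() {
        return deks.size();
    }

    public static void main(String[] args) {
        DekLookupSketch registry = new DekLookupSketch(false);
        System.out.println(registry.getOrCreate("my-kek", "mysubject", 1)); // prints CLIENT
        registry.getOrCreate("my-kek", "mysubject", 1);                     // reuses the existing DEK
        System.out.println(registry.size());                                // prints 1
    }
}
```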

A DEK registered to the DEK Registry has the following parameters:

Parameter             Description
kekName               The name of the KEK used to encrypt this DEK.
subject               The subject for the DEK.
version               The version of the DEK.
algorithm             The encryption algorithm being used. Valid values include: AES128_GCM, AES256_GCM (default), or AES256_SIV.
encryptedKeyMaterial  The encrypted key material for the DEK.
ts                    The timestamp indicating when the DEK was registered.

Step 2 - Add tags to the schema fields

Tags can either be inline or external. Here is an example of an inline tag.

{
  "type":"record",
  "name":"MyRecord",
  "fields":[{
    "name":"ssn",
    "type":"string",
    "confluent:tags": [ "PII", "PRIVATE" ]
  }]
}

Important

Before using the tags, you must add the tag definitions in Stream Catalog.

Step 3 - Define an encryption policy

After adding the tags, you need to define an encryption policy that specifies rules for which tags to use for encryption. The encryption policy is defined in a JSON file that is then uploaded to Confluent Cloud. Here is an example of an encryption policy:

{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
           "encrypt.kek.name": "<kekName>"
        }
      }
    ]
  }
}

The KEK name is the one you specified in step 1. If the KEK has not yet been registered, you can optionally specify the KMS key ID and KMS type in the rule. In that case, the client automatically registers the KEK before registering any DEKs.

{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
           "encrypt.kek.name": "<kekName>",
           "encrypt.kms.key.id": "<kmsKeyId>",
           "encrypt.kms.type": "aws-kms"
        }
      }
    ]
  }
}

During registration, if the schema is omitted, then the ruleset attaches to the latest schema in the subject.

After registration, include the following in the client:

  • auto.register.schemas=false
  • use.latest.version=true

If these properties are not set, the client attempts to register, or look up, a schema without any existing rules.
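A minimal sketch of applying the two settings above to an existing client configuration. The property names are taken from the list above. The class and method names are hypothetical, and the base properties are whatever your client already uses.

```java
import java.util.Properties;

final class CsfleClientConfigSketch {
    // Returns a copy of the base configuration that makes the serializer use the
    // latest registered schema (and its rules) instead of registering its own.
    static Properties withExistingRules(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("auto.register.schemas", "false");
        props.setProperty("use.latest.version", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = withExistingRules(new Properties());
        System.out.println(props.getProperty("auto.register.schemas")); // prints false
        System.out.println(props.getProperty("use.latest.version"));    // prints true
    }
}
```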

Encryption is supported for fields of type string or bytes. Type integer is currently not supported. If a client does not have a rule executor for the ENCRYPT rule type and attempts to consume a message, then the message is not decrypted and the client receives the encrypted values.

The following additional properties can be specified in the rule parameters:

Parameter                Description
encrypt.dek.algorithm    The encryption algorithm being used. Valid values include AES128_GCM, AES256_GCM (default), or AES256_SIV. You can use AES256_SIV for deterministic encryption of keys.
encrypt.dek.expiry.days  If specified, automatic DEK rotation occurs. When a DEK is older than the expiration period, a new DEK is generated and used for new messages, while previous DEKs are available to decrypt older messages. There is a limit to the number of existing DEKs (10,000) that can be retained in the DEK Registry, so use this property cautiously.
preserve.source.fields   For performance reasons, the fields of a message are updated during field-level transforms. For field-level encryption, this results in the field values being replaced with the encrypted field values. If the original field values should be retained in the message, then set this property to true.
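The expiry check behind encrypt.dek.expiry.days can be pictured with the following sketch. It only illustrates the "older than the expiration period" comparison. How the client actually evaluates expiry, including the exact boundary behavior, is an assumption, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

final class DekExpirySketch {
    // True when the DEK's age is strictly greater than the configured number of days.
    static boolean isExpired(Instant dekCreated, int expiryDays, Instant now) {
        return Duration.between(dekCreated, now).compareTo(Duration.ofDays(expiryDays)) > 0;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2024-01-01T00:00:00Z");
        // 14 days old with a 30-day expiry: keep using the current DEK.
        System.out.println(isExpired(created, 30, Instant.parse("2024-01-15T00:00:00Z"))); // prints false
        // 45 days old with a 30-day expiry: a new DEK version would be generated.
        System.out.println(isExpired(created, 30, Instant.parse("2024-02-15T00:00:00Z"))); // prints true
    }
}
```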

Step 4 - Configure the KMS Key Encryption Key

For Confluent Cloud, the DEK Registry can be granted permission to access your KMS.

Note: Currently, only AWS KMS is supported when allowing the DEK Registry to access your KMS.

To provide proper access for the DEK Registry, obtain the KMS key policy that needs to be added to the key, either through the Confluent Cloud Console or by using the following REST endpoint:

https://psrc-xxxxx.<region>.<provider>.confluent.cloud/dek-registry/v1/policy
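A request to this endpoint could be built along the following lines with the JDK HTTP client. This sketch only constructs the request and does not send it. Authenticating with a Schema Registry API key and secret through HTTP Basic authentication is an assumption, and the base URL in main is a hypothetical placeholder for your own Schema Registry endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DekRegistryPolicyRequestSketch {
    // Builds a GET request for the policy endpoint under the given Schema Registry base URL.
    static HttpRequest policyRequest(String schemaRegistryBaseUrl, String apiKey, String apiSecret) {
        String credentials = Base64.getEncoder()
                .encodeToString((apiKey + ":" + apiSecret).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(schemaRegistryBaseUrl + "/dek-registry/v1/policy"))
                .header("Authorization", "Basic " + credentials) // assumption: Basic auth
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical base URL and credentials, for illustration only.
        HttpRequest request = policyRequest("https://schema-registry.example.com", "key", "secret");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To send the request, pass it to java.net.http.HttpClient and read the response body, which should contain the key policy to add to the KMS key.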

If the DEK Registry has not been granted permission to access the KMS, or if you are using Confluent Platform instead of Confluent Cloud, then the credentials to access the KMS must be specified on the client.

The following KMS options are supported on the client:

  • AWS KMS
  • Azure Key Vault
  • Google Cloud KMS
  • HashiCorp Vault
  • Local key (for testing only)

For example, if you're using AWS KMS, add the following dependency to your Java client:

<dependency>
   <groupId>io.confluent</groupId>
   <artifactId>kafka-schema-registry-client-encryption-aws</artifactId>
   <version>7.6.0</version>
</dependency>

Next, you need to configure the following parameters on the clients.

rule.executors._default_.param.access.key.id=<AWS access key>
rule.executors._default_.param.secret.access.key=<AWS secret key>

Alternatively, the AWS access key and AWS secret key can be passed using environment variables, named AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

Now, whenever a message is sent, the ssn field is automatically encrypted before serialization and decrypted after deserialization.

Step 5 - Produce with Kafka serializers

To produce with Kafka serializers, you must add the following to your Kafka properties:

Add the appropriate serializer to the producer properties:

  • KafkaAvroSerializer
  • KafkaProtobufSerializer
  • KafkaJsonSchemaSerializer
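As a rough sketch, the producer properties for the Avro case might be assembled as follows. The serializer class names are the fully qualified names of the Confluent serializers. The helper class, the method name, and the connection values in main are hypothetical, and a real Confluent Cloud configuration also needs security and credential settings that are omitted here.

```java
import java.util.Properties;

final class CsfleProducerConfigSketch {
    // For Protobuf or JSON Schema, the value serializer would instead be
    // io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer or
    // io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer.
    static Properties avroProducerProperties(String bootstrapServers, String schemaRegistryUrl) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("schema.registry.url", schemaRegistryUrl);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical endpoints, for illustration only.
        Properties props = avroProducerProperties("localhost:9092", "http://localhost:8081");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```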

Handle errors

If the Schema Registry client cannot decrypt the fields that have been encrypted, then it will throw an error. You can configure the client to pass the encrypted data through in spite of the error, or you can send the records to a dead-letter queue.

To pass encrypted data through

To pass encrypted data through in spite of an error, on the client specify an onFailure property in the rule. Here is an example:

{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
           "encrypt.kek.name": "<name of KEK>"
        },
        "onFailure": "ERROR,NONE"
      }
    ]
  }
}

The onFailure value ERROR,NONE above consists of two comma-separated actions: the first applies to encryption and the second applies to decryption.

  • ERROR: If encryption fails, an error is thrown.
  • NONE: If decryption fails, the encrypted value is passed through without decryption.
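The two-part value can be read as in the following sketch, which models only the behavior described above. It is not the client's rule-handling code: the class and method names are hypothetical, and only the two-value form shown in the example is handled.

```java
final class OnFailureSketch {
    // Splits a two-part onFailure value into {encryptionAction, decryptionAction}.
    static String[] parse(String onFailure) {
        String[] parts = onFailure.split(",", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "Expected two comma-separated actions: " + onFailure);
        }
        return new String[] {parts[0].trim(), parts[1].trim()};
    }

    // Models a decryption failure: NONE passes the encrypted value through,
    // any other action is treated as an error in this sketch.
    static String onDecryptFailure(String decryptionAction, String encryptedValue) {
        if ("NONE".equals(decryptionAction)) {
            return encryptedValue;
        }
        throw new IllegalStateException("Decryption failed, action: " + decryptionAction);
    }

    public static void main(String[] args) {
        String[] actions = parse("ERROR,NONE");
        System.out.println(actions[0]);                                   // prints ERROR
        System.out.println(onDecryptFailure(actions[1], "<ciphertext>")); // prints <ciphertext>
    }
}
```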

Pre-registering keys

A DEK and its associated KEK can be registered before using the producer and consumer. For example, to register DEKs for the schema in subject mysubject for version 1, use the following command:

./bin/register-deks http://localhost:8081 mysubject 1

The detailed usage for the above command is below. The same properties that can be passed to the serdes can be passed to this command.

Usage: register-deks [-hV] [-X=<prop=val>]... <url> <subject> [<version>]
Register and/or auto-rotate DEKs according to a specified data contract.
      <url>                   SR (Schema Registry) URL
      <subject>               Subject
      [<version>]             Version, defaults to latest
  -X, --property=<prop=val>   Set configuration property.
  -h, --help                  Show this help message and exit.
  -V, --version               Print version information and exit.

Manual key rotation

Either the DEK or the KEK can be rotated manually.

To manually rotate the DEK, publish a new version of the schema with the same value for encrypt.kms.key.id and encrypt.kms.type, but use a different value for encrypt.kek.name.

To manually rotate the KEK, publish a new version of the schema with a different value for both encrypt.kms.key.id and encrypt.kek.name.

NIST rotation guidance

Periodic rotation of the encryption keys is recommended, even in the absence of compromise. For AES-GCM keys, rotation should occur before approximately 2^32 encryptions have been performed by a key version, following the guidelines of NIST publication 800-38D. For example, if the estimated encryption rate of a key is 40 million operations per day, the 2^32 (about 4.3 billion) limit is reached after roughly 107 days, so rotating the key every three months is sufficient.
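The rotation interval for a different encryption rate can be estimated with the same arithmetic. The sketch below divides the 2^32 limit by an assumed daily rate. The class and method names are hypothetical, and the rate is an input you would estimate for your own workload.

```java
final class RotationIntervalSketch {
    // Approximate number of encryptions per AES-GCM key version before rotation is advised.
    static final long MAX_GCM_ENCRYPTIONS = 1L << 32; // 4,294,967,296

    // Whole days until the limit is reached at a constant daily rate (rounded down).
    static long maxDaysBetweenRotations(long encryptionsPerDay) {
        if (encryptionsPerDay <= 0) {
            throw new IllegalArgumentException("encryptionsPerDay must be positive");
        }
        return MAX_GCM_ENCRYPTIONS / encryptionsPerDay;
    }

    public static void main(String[] args) {
        System.out.println(maxDaysBetweenRotations(40_000_000L)); // prints 107
    }
}
```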

Key deletion

Keep in mind the following when deleting keys:

  • When a DEK or KEK is soft-deleted, it can no longer be used by producers but can still be used by consumers.
  • When a DEK or KEK is hard-deleted, then any data encrypted with the DEK or KEK can no longer be decrypted. Hard-deletion is permanent, so use this operation with care, as it may render your data unreadable.
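The difference between the two deletion modes can be summarized in a small model. This sketch only encodes the two rules above. The enum and method names are hypothetical and do not correspond to any DEK Registry API.

```java
final class KeyDeletionSketch {
    enum KeyState { ACTIVE, SOFT_DELETED, HARD_DELETED }

    // Producers can encrypt only with keys that have not been deleted at all.
    static boolean usableByProducer(KeyState state) {
        return state == KeyState.ACTIVE;
    }

    // Consumers can still decrypt with soft-deleted keys, but not with hard-deleted ones.
    static boolean usableByConsumer(KeyState state) {
        return state != KeyState.HARD_DELETED;
    }

    public static void main(String[] args) {
        for (KeyState state : KeyState.values()) {
            System.out.println(state + ": producer=" + usableByProducer(state)
                    + ", consumer=" + usableByConsumer(state));
        }
    }
}
```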