Use Client-Side Field Level Encryption in Confluent Cloud¶
For steps on using Confluent Cloud Console to configure client-side field level encryption, see Manage Client-Side Field Level Encryption using Confluent Cloud Console.
Client-side field level encryption (CSFLE) uses a technique called envelope encryption, where a key encryption key (KEK) is used to encrypt data encryption keys (DEKs), which are the actual keys used to encrypt fields. The KEK is typically a key from an external Key Management System (KMS), such as AWS KMS, Azure Key Vault, or Google Cloud KMS.
Schema Registry exposes a subcomponent called the DEK Registry, which provides APIs for managing KEKs and DEKs. The key from the KMS is registered as a KEK to the DEK Registry, and the DEK Registry will also hold the encrypted DEKs used for CSFLE. DEKs are scoped by subject.
In Confluent Cloud, for a given KEK, the DEK Registry can optionally be given permission to make direct calls to the KMS in order to encrypt and decrypt DEKs. The main advantage of granting this permission is to enable other Confluent Cloud offerings, such as Flink and ksqlDB, to process data that was previously encrypted by CSFLE.
If permission is granted, then the DEK Registry will generate the DEK, encrypt it, and make the decrypted DEK available to clients that have the proper authorization. If permission is not granted, then the client will typically generate the DEK, encrypt it, and store it to the DEK Registry.
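The envelope structure described above can be illustrated with a small, self-contained sketch. It is not how the Confluent clients are implemented: the cipher below is a toy keystream built from SHA-256 so that the example runs with the standard library only, and it stands in for the real algorithms (such as AES256_GCM) and for the KMS call. All function and variable names are hypothetical.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; a stand-in for AES-GCM."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR the data with the keystream; zip() stops at len(data).
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt plaintext under key, prefixing a random 12-byte nonce."""
    nonce = secrets.token_bytes(12)
    return nonce + keystream_xor(key, nonce, plaintext)


def unwrap(key: bytes, blob: bytes) -> bytes:
    """Reverse wrap(): split off the nonce and decrypt the remainder."""
    nonce, ciphertext = blob[:12], blob[12:]
    return keystream_xor(key, nonce, ciphertext)


# The KEK normally lives in the external KMS; it is generated locally here
# only so that the sketch is runnable.
kek = secrets.token_bytes(32)

# The DEK is the key that actually encrypts field values.
dek = secrets.token_bytes(32)

# Only the encrypted DEK is stored (in CSFLE, by the DEK Registry).
encrypted_dek = wrap(kek, dek)

# A field value is encrypted with the DEK, not with the KEK.
encrypted_ssn = wrap(dek, b"123-45-6789")

# A reader first recovers the DEK using the KEK, then decrypts the field.
recovered_dek = unwrap(kek, encrypted_dek)
recovered_ssn = unwrap(recovered_dek, encrypted_ssn)
```

The point of the two layers is that the KEK never has to touch message data: rotating or revoking access to the KEK controls access to every DEK wrapped by it.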
At a high level, CSFLE requires the following steps:
- Creating a KEK in an external KMS and registering the KEK to the DEK Registry.
- Specifying tags on fields.
- Declaring rules that specify which tags should be encrypted.
- Either authorizing the DEK Registry to access the KMS in Confluent Cloud, or configuring the client to access the KMS.
CSFLE works with the latest versions of the existing Java serializers and deserializers for Avro, JSON Schema, and Protobuf.
Requirements¶
To use client-side field level encryption (CSFLE) in Confluent Cloud, the following requirements must be met:
- Stream Governance Advanced package. For more information, see Data Contracts for Schema Registry.
- Confluent Platform versions supported for clients include:
- 7.4.5 or later
- 7.5.4 or later
- 7.6.1 or later
- Kafka serializers and deserializers for Avro, JSON Schema, and Protobuf. For more information, see Formats, Serializers, and Deserializers for Schema Registry.
- Key management service (KMS) for managing encryption keys. Supported key management options include:
Configure client-side field level encryption¶
The following steps show how to configure client-side field level encryption.
Step 1 - Registering KEKs and DEKs¶
The first step is to create a KEK in an external KMS, such as AWS KMS, Azure Key Vault, or Google Cloud KMS. Before using CSFLE, you can register the KEK with the DEK Registry using Confluent Cloud Console, Confluent CLI, or REST APIs. If the KEK is not registered beforehand, the client can register it on demand, provided that the client has the appropriate permissions with the DEK Registry.
KEK parameters¶
A KEK registered to the DEK Registry has the following parameters:
Parameter | Description |
---|---|
name | A meaningful name for the KEK. The name is used when referring to the KEK elsewhere, such as in RBAC. |
kmsType | The type of KMS, typically one of “aws-kms”, “azure-kms”, and “gcp-kms”. |
kmsKeyId | The key identifier for the KEK. When using AWS KMS, this is the ARN. |
kmsProps | Additional key-value properties used to access the KMS. |
doc | (optional) A meaningful description for the KEK. |
shared | (optional) Whether the DEK Registry has shared access to the KMS. |
ts | (optional) The timestamp indicating when the KEK was registered or updated. |
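As a rough illustration of how the parameters in the table fit together, the following sketch assembles a KEK definition as a JSON document. The helper name `build_kek_payload` is hypothetical, the key ARN is a placeholder, and the defaults chosen here (an empty `kmsProps`, `shared` set to false, `ts` left for the registry to fill in) are assumptions of this sketch. The REST endpoint and authentication needed to submit the document are not shown.

```python
import json


def build_kek_payload(name, kms_type, kms_key_id, kms_props=None, doc=None, shared=False):
    """Assemble a KEK definition using the parameter names from the table above."""
    payload = {
        "name": name,
        "kmsType": kms_type,          # typically "aws-kms", "azure-kms", or "gcp-kms"
        "kmsKeyId": kms_key_id,       # for AWS KMS, the key ARN
        "kmsProps": kms_props or {},  # extra key-value properties for KMS access
        "shared": shared,             # whether the DEK Registry may call the KMS
    }
    if doc is not None:
        payload["doc"] = doc          # optional description
    return payload


kek_payload = build_kek_payload(
    name="pii-kek",
    kms_type="aws-kms",
    # Placeholder ARN, not a real key.
    kms_key_id="arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
    doc="KEK for PII fields",
)
kek_json = json.dumps(kek_payload, indent=2)
```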
DEK parameters¶
During encryption, the client asks the DEK Registry for an existing DEK for a specified KEK name and subject (when using automatic DEK rotation, the version is also required). If a DEK does not exist, then depending on whether the DEK Registry has access to the KMS, either the DEK Registry or the client generates and encrypts the DEK, and then registers it with the DEK Registry. If the DEK Registry generates the DEK, the decrypted DEK is sent to the client.
A DEK registered to the DEK Registry has the following parameters:
Parameter | Description |
---|---|
kekName | The name of the KEK used to encrypt this DEK. |
subject | The subject for the DEK. |
version | The version of the DEK. |
algorithm | The encryption algorithm being used. Valid values are AES128_GCM, AES256_GCM (default), and AES256_SIV. |
encryptedKeyMaterial | The encrypted key material for the DEK. |
ts | The timestamp indicating when the DEK was registered. |
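The lookup-or-create flow described above can be sketched with an in-memory stand-in for the DEK Registry. This models only the case where the client generates the DEK (the registry has no KMS access). The class `InMemoryDekRegistry`, the function `get_or_create_dek`, and the `wrap_with_kek` callable (which stands in for a KMS encrypt call) are all hypothetical, and storing `ts` as epoch milliseconds is an assumption.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class Dek:
    kek_name: str
    subject: str
    version: int
    algorithm: str
    encrypted_key_material: bytes
    ts: int  # assumed here to be epoch milliseconds


class InMemoryDekRegistry:
    """Hypothetical stand-in for the DEK Registry, keyed by (KEK name, subject, version)."""

    def __init__(self):
        self._deks = {}

    def get(self, kek_name, subject, version=1):
        return self._deks.get((kek_name, subject, version))

    def register(self, dek):
        self._deks[(dek.kek_name, dek.subject, dek.version)] = dek
        return dek


def get_or_create_dek(registry, kek_name, subject, wrap_with_kek,
                      version=1, algorithm="AES256_GCM"):
    """Return the existing DEK, or generate, encrypt, and register a new one."""
    existing = registry.get(kek_name, subject, version)
    if existing is not None:
        return existing
    raw_key = secrets.token_bytes(32)  # 256-bit key for the default algorithm
    dek = Dek(kek_name, subject, version, algorithm,
              wrap_with_kek(raw_key), int(time.time() * 1000))
    return registry.register(dek)


def fake_wrap(raw_key: bytes) -> bytes:
    """Placeholder only: a real client would ask the KMS to encrypt the key."""
    return b"wrapped:" + raw_key


registry = InMemoryDekRegistry()
first = get_or_create_dek(registry, "pii-kek", "orders-value", fake_wrap)
second = get_or_create_dek(registry, "pii-kek", "orders-value", fake_wrap)
other = get_or_create_dek(registry, "pii-kek", "payments-value", fake_wrap)
```

Because DEKs are scoped by subject, the second call reuses the DEK created by the first, while the call for a different subject creates a new one.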
Step 2 – Add tags to the schema fields¶
Tags can either be inline or external. Here is an example of an inline tag.
{
  "type": "record",
  "name": "MyRecord",
  "fields": [{
    "name": "ssn",
    "type": "string",
    "confluent:tags": [ "PII", "PRIVATE" ]
  }]
}
Important
Before using the tags, you must add the tag definitions in Stream Catalog.
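To check which fields a tag-based rule would match, the inline tags can be read straight from the schema document. The sketch below is a minimal illustration: the helper `tagged_fields` is hypothetical, and it inspects only top-level fields with inline `confluent:tags` (nested records and external tags are not handled).

```python
import json

SCHEMA_JSON = """
{
  "type": "record",
  "name": "MyRecord",
  "fields": [{
    "name": "ssn",
    "type": "string",
    "confluent:tags": ["PII", "PRIVATE"]
  }]
}
"""


def tagged_fields(schema: dict, tag: str) -> list:
    """Return the names of top-level fields that carry the given inline tag."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        if tag in field.get("confluent:tags", [])
    ]


schema = json.loads(SCHEMA_JSON)
pii_fields = tagged_fields(schema, "PII")
```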
Step 3 - Define an encryption policy¶
After adding the tags, you need to define an encryption policy that specifies rules for which tags to use for encryption. The encryption policy is defined in a JSON file that is then uploaded to Confluent Cloud. Here is an example of an encryption policy:
{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
          "encrypt.kek.name": "<kekName>"
        }
      }
    ]
  }
}
Note that you specified the name of the KEK in step 1. If the KEK has not yet been registered, you can optionally specify the KMS key ID and KMS type in the rule. The client automatically registers the KEK before registering any DEKs.
{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
          "encrypt.kek.name": "<kekName>",
          "encrypt.kms.key.id": "<kmsKeyId>",
          "encrypt.kms.type": "aws-kms"
        }
      }
    ]
  }
}
During registration, if the schema is omitted, then the ruleset attaches to the latest schema in the subject.
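If the policy file is generated rather than written by hand, the two variants shown above differ only in the rule parameters. The following sketch builds the `ruleSet` portion of the document; the helper names `build_encrypt_rule` and `build_rule_set` and the KEK name are hypothetical, and the `schema` and `metadata` entries are left out, which relies on the behavior described above for an omitted schema.

```python
import json


def build_encrypt_rule(name, tags, kek_name, kms_key_id=None, kms_type=None):
    """Build one ENCRYPT domain rule with the structure shown in the examples above."""
    params = {"encrypt.kek.name": kek_name}
    # The KMS key ID and type are only needed when the KEK is not yet registered.
    if kms_key_id is not None:
        params["encrypt.kms.key.id"] = kms_key_id
    if kms_type is not None:
        params["encrypt.kms.type"] = kms_type
    return {
        "name": name,
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": list(tags),
        "params": params,
    }


def build_rule_set(rules):
    """Wrap the rules in the ruleSet/domainRules envelope."""
    return {"ruleSet": {"domainRules": list(rules)}}


registered_kek_rule = build_encrypt_rule("encryptPII", ["PII"], "pii-kek")
unregistered_kek_rule = build_encrypt_rule(
    "encryptPII", ["PII"], "pii-kek",
    kms_key_id="<kmsKeyId>", kms_type="aws-kms",
)
policy_json = json.dumps(build_rule_set([registered_kek_rule]), indent=2)
```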
After registration, include the following in the client:
auto.register.schemas=false
use.latest.version=true
If not included, the client attempts to register, or look up, a schema without any existing rules.
Encryption is supported for fields of type string or bytes. Type integer is currently not supported. If a client does not have a rule executor for the ENCRYPT rule type and attempts to consume a message, then the message is not decrypted and the client receives the encrypted values.
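Because only string and bytes fields can be encrypted, it can help to check a schema for tagged fields of other types before publishing a rule. The sketch below is one possible pre-flight check, not part of any Confluent tool. The helper `unsupported_tagged_fields` is hypothetical; it looks only at top-level fields and treats any type that is not the plain name `string` or `bytes` (including unions and nested records) as needing manual review.

```python
ENCRYPTABLE_TYPES = {"string", "bytes"}


def unsupported_tagged_fields(schema: dict, tag: str) -> list:
    """Return (name, type) pairs for tagged fields whose type is not plainly encryptable."""
    flagged = []
    for field in schema.get("fields", []):
        if tag not in field.get("confluent:tags", []):
            continue
        field_type = field.get("type")
        # Only plain type names are recognized; anything else is flagged for review.
        if not (isinstance(field_type, str) and field_type in ENCRYPTABLE_TYPES):
            flagged.append((field["name"], field_type))
    return flagged


example_schema = {
    "type": "record",
    "name": "MyRecord",
    "fields": [
        {"name": "ssn", "type": "string", "confluent:tags": ["PII"]},
        {"name": "age", "type": "int", "confluent:tags": ["PII"]},
        {"name": "note", "type": "string"},
    ],
}
flagged = unsupported_tagged_fields(example_schema, "PII")
```

In this example, `ssn` passes, `age` is flagged because of its type, and `note` is ignored because it has no tag.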
The following additional properties can be specified in the rule parameters:
Parameter | Description |
---|---|
encrypt.dek.algorithm | The encryption algorithm being used. Valid values are AES128_GCM, AES256_GCM (default), and AES256_SIV. You can use AES256_SIV for deterministic encryption of keys. |
encrypt.dek.expiry.days | If specified, automatic DEK rotation occurs. When a DEK is older than the expiration period, a new DEK is generated and used for new messages, while previous DEKs are available to decrypt older messages. There is a limit to the number of old DEKs that can be retained in the DEK Registry, so use this property cautiously. |
preserve.source.fields | For performance reasons, the fields of a message are updated during field-level transforms. For field-level encryption, this results in the field values being replaced with the encrypted field values. If the original field values should be retained in the message, then set this property to true. |
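The rotation condition for `encrypt.dek.expiry.days` amounts to an age comparison. The sketch below shows that comparison only; it is not the client's implementation. The function name `dek_is_expired` is hypothetical, and it assumes that the DEK's `ts` value is expressed in epoch milliseconds.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def dek_is_expired(dek_ts_ms: int, expiry_days: int, now_ms: int) -> bool:
    """Return True when the DEK is older than the configured expiry period."""
    return now_ms - dek_ts_ms > expiry_days * MS_PER_DAY


registered_at = 1_700_000_000_000            # example registration time (ms)
after_29_days = registered_at + 29 * MS_PER_DAY
after_31_days = registered_at + 31 * MS_PER_DAY

still_valid = dek_is_expired(registered_at, 30, after_29_days)   # False
needs_rotation = dek_is_expired(registered_at, 30, after_31_days)  # True
```

When the check is true, a new DEK version would be generated for new messages, while the older versions remain available for decryption.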
Step 4 – Configure the KMS Key Encryption Key¶
For Confluent Cloud, the DEK Registry can be granted permission to access your KMS.
Note: Currently, only AWS KMS is supported when allowing the DEK Registry to access your KMS.
To provide proper access for the DEK Registry, obtain the KMS key policy that needs to be added to the key, either through the Confluent Cloud Console or by using the following REST endpoint:
https://psrc-xxxxx.<region>.<provider>.confluent.cloud/dek-registry/v1/policy
If the DEK Registry has not been granted permission to access the KMS, or if you are using Confluent Platform instead of Confluent Cloud, then the credentials to access the KMS must be specified on the client.
The following KMS options are supported on the client:
- AWS KMS
- Azure Key Vault
- Google Cloud KMS
- Hashicorp Vault
- Local key (for testing only)
For example, if you’re using the AWS KMS, you would configure the Java clients with the following dependency:
<dependency>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-schema-registry-client-encryption-aws</artifactId>
  <version>7.6.0</version>
</dependency>
Next, you need to configure the following parameters on the clients.
rule.executors._default_.param.access.key.id=<AWS access key>
rule.executors._default_.param.secret.access.key=<AWS secret key>
Alternatively, the AWS access key and AWS secret key can be passed using the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
Now, whenever a message is sent, the ssn field is automatically encrypted before serialization and decrypted after deserialization.
Step 5 – Produce with Kafka serializers¶
To produce with Kafka serializers, add the appropriate serializer to the producer properties:
- KafkaAvroSerializer
- KafkaProtobufSerializer
- KafkaJsonSchemaSerializer
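A producer configuration that combines the serializer choice with the client settings from Step 3 might be assembled as follows. This is a sketch under stated assumptions: the fully qualified class names, the `value.serializer` and `schema.registry.url` property names, and the helper names are not given on this page and should be checked against the serializer documentation for your client version. The URL is the placeholder used earlier on this page.

```python
# Fully qualified serializer class names (assumed; verify against your client version).
SERIALIZERS = {
    "avro": "io.confluent.kafka.serializers.KafkaAvroSerializer",
    "protobuf": "io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer",
    "jsonschema": "io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer",
}


def producer_properties(fmt: str, schema_registry_url: str) -> dict:
    """Collect the producer settings relevant to CSFLE for the chosen format."""
    return {
        "value.serializer": SERIALIZERS[fmt],
        "schema.registry.url": schema_registry_url,
        # From Step 3: use the registered schema and its rules instead of
        # registering a new schema without rules.
        "auto.register.schemas": "false",
        "use.latest.version": "true",
    }


def to_properties_text(props: dict) -> str:
    """Render the settings in Java .properties syntax."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


props = producer_properties("avro", "https://psrc-xxxxx.<region>.<provider>.confluent.cloud")
properties_text = to_properties_text(props)
```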
Handle errors¶
If the Schema Registry client cannot decrypt the fields that have been encrypted, then it will throw an error. You can configure the client to pass the encrypted data through in spite of the error, or you can send the records to a dead-letter queue.
To pass encrypted data through¶
To pass encrypted data through in spite of an error, specify an onFailure property in the rule. Here is an example:
{
  "schema": "...",
  "metadata": {...},
  "ruleSet": {
    "domainRules": [
      {
        "name": "encryptPII",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII"],
        "params": {
          "encrypt.kek.name": "<name of KEK>"
        },
        "onFailure": "ERROR,NONE"
      }
    ]
  }
}
The onFailure value of ERROR,NONE above contains two comma-separated values: the first applies to encryption and the second to decryption.
- ERROR: If encryption fails, an error is thrown.
- NONE: If decryption fails, the encrypted value is passed through without decryption.
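The two-part onFailure value can be read as a pair of actions. The sketch below shows that interpretation for the two-value form described above; the function name `parse_on_failure` is hypothetical, and other forms of the setting are rejected here rather than guessed at.

```python
def parse_on_failure(value: str) -> dict:
    """Split an onFailure value such as "ERROR,NONE" into per-direction actions."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected two comma-separated actions, got {value!r}")
    # First value: action when encryption fails; second: when decryption fails.
    return {"encryption": parts[0], "decryption": parts[1]}


actions = parse_on_failure("ERROR,NONE")
```

Here `actions["encryption"]` is `"ERROR"` and `actions["decryption"]` is `"NONE"`, matching the behavior described in the list above.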
Pre-registering keys¶
A DEK and its associated KEK can be registered before using the producer and consumer.
For example, to register DEKs for the schema in subject mysubject for version 1, use the following command:
./bin/register-deks http://localhost:8081 mysubject 1
The detailed usage for the above command is below. The same properties that can be passed to the serdes can be passed to this command.
Usage: register-deks [-hV] [-X=<prop=val>]... <url> <subject> [<version>]
Register and/or auto-rotate DEKs according to a specified data contract.
<url> SR (Schema Registry) URL
<subject> Subject
[<version>] Version, defaults to latest
-X, --property=<prop=val> Set configuration property.
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Manual key rotation¶
Either the DEK or the KEK can be rotated manually.
To manually rotate the DEK, publish a new version of the schema with the same value for encrypt.kms.key.id and encrypt.kms.type, but use a different value for encrypt.kek.name.
To manually rotate the KEK, publish a new version of the schema with a different value for both encrypt.kms.key.id and encrypt.kek.name.
NIST rotation guidance¶
Periodic rotation of the encryption keys is recommended, even in the absence of compromise. For AES-GCM keys, rotation should occur before approximately 2^32 encryptions have been performed by a key version, following the guidelines of NIST publication 800-38D. For example, if one determines that the estimated encryption rate of a key is 40 million operations per day, then rotating a key every three months is sufficient.
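The arithmetic behind this example can be checked directly. The sketch below uses the figures from the paragraph above (a limit of about 2^32 encryptions and a rate of 40 million operations per day); the function name `days_until_limit` is hypothetical.

```python
MAX_ENCRYPTIONS_PER_KEY = 2 ** 32  # approximate limit cited above for AES-GCM keys


def days_until_limit(encryptions_per_day: float) -> float:
    """Number of days before a key version reaches the encryption limit."""
    return MAX_ENCRYPTIONS_PER_KEY / encryptions_per_day


days = days_until_limit(40_000_000)       # roughly 107 days
encryptions_in_92_days = 92 * 40_000_000  # a three-month period at this rate
```

At 40 million operations per day the limit is reached after roughly 107 days, so a three-month rotation interval (at most 92 days) stays below it.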
Key deletion¶
Keep in mind the following when deleting keys:
- When a DEK or KEK is soft-deleted, it can no longer be used by producers but can still be used by consumers.
- When a DEK or KEK is hard-deleted, then any data encrypted with the DEK or KEK can no longer be decrypted. Hard-deletion is permanent, so use this operation with care, as it may render your data unreadable.