Encrypt Sensitive Fields with Tableflow in Confluent Cloud
You control whether sensitive fields in Apache Kafka® topics appear as plaintext or stay encrypted when Tableflow materializes them into Apache Iceberg™ or Delta Lake tables. With client-side field level encryption (CSFLE), you tag the personally identifiable information (PII) and payment card industry (PCI) fields in a topic on AWS, encrypt them on the producer, and enable Tableflow to either decrypt them into plaintext columns that analytics engines query directly or preserve them as ciphertext.
You own and manage the key encryption key (KEK) in your own key management service (KMS), and Confluent never stores the plaintext key.
Note
CSFLE with Tableflow is available to some Confluent customers as a Limited Availability feature. For those with access, Confluent fully supports CSFLE with Tableflow and recommends it for production use. To request access, contact Confluent Support.
How CSFLE works with Tableflow
With CSFLE, the producer encrypts the tagged fields before it sends a record, so the data arrives in Kafka as ciphertext. You configure this by tagging the sensitive fields in your Confluent Cloud Schema Registry schema and attaching an encryption rule that references your KEK. When you enable Tableflow on the topic, whether Tableflow decrypts those fields depends only on whether you share key access with Confluent Cloud. You set that choice when you register the KEK, and the enablement flow has no separate encryption option.
Tableflow supports two outcomes:
- Decrypt and store as plaintext
When you share access to your KEK with Confluent Cloud, Tableflow decrypts the tagged fields during materialization and writes them as plaintext columns. Analytics engines read the columns directly: you can query them from Amazon Athena, Databricks Unity Catalog, Snowflake, and Confluent Cloud for Apache Flink® snapshot queries. Schema Registry unwraps the data encryption key (DEK) by calling your KMS on behalf of Tableflow. This outcome corresponds to CSFLE with shared Confluent access to the KEK.
- Store in encrypted format
When you keep your KEK private, Confluent Cloud cannot access it. The producer wraps the DEK locally with its own KMS credentials, and Tableflow preserves the tagged fields as base64 ciphertext in the table. Each downstream consumer decrypts the values when it needs them, which maintains end-to-end encryption. This outcome corresponds to CSFLE without Confluent access to the KEK.
Requirements
Setting up CSFLE with Tableflow requires the following:
- The standard Tableflow setup described in the Tableflow quick start, including a Confluent Cloud cluster, a provider integration and storage bucket, and Tableflow enabled for your environment.
- The Stream Governance Advanced package, which CSFLE rules require.
- A schema, in Avro, Protobuf, or JSON Schema format, with tags on the fields that you want to encrypt.
- Access to AWS KMS, where you create the KEK in Step 2.
To enable Tableflow on the topic, you need the following roles:
- ResourceOwner on the topic.
- DeveloperRead on the schema subject.
- Assigner on the provider integration, if you use your own storage.
Limitations and known behavior
The following limitations and known behavior apply to CSFLE with Tableflow.
One ENCRYPT rule per subject
Tableflow supports one encryption rule for each schema subject, and a single KEK encrypts every tagged field on that subject. You can apply one rule to multiple tag classes (for example, PII and PCI). To encrypt different tag classes with different KEKs, split the fields across separate subjects or topics.
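The one-rule constraint can be checked before you register a rule set. The following is a minimal Python sketch, assuming the rule set uses the same JSON shape as the REST request in Step 7. The helper name count_encrypt_rules is hypothetical and is not part of any Confluent library.

```python
def count_encrypt_rules(rule_set):
    """Count the ENCRYPT rules in a ruleSet dict (hypothetical helper)."""
    rules = rule_set.get("domainRules", [])
    return sum(1 for rule in rules if rule.get("type") == "ENCRYPT")


# One rule that covers two tag classes with a single KEK.
rule_set = {
    "domainRules": [{
        "name": "encrypt-sensitive",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII", "PCI"],
        "params": {"encrypt.kek.name": "csfle-pii-kek"},
    }]
}

encrypt_rule_count = count_encrypt_rules(rule_set)
if encrypt_rule_count > 1:
    raise ValueError("Tableflow supports one ENCRYPT rule per subject")
```

A rule set with a second ENCRYPT rule (for example, one rule per tag class with different KEKs) would fail this check, which is the case where you split the fields across subjects or topics instead.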
Schema evolution: additive optional fields only
Tableflow supports additive schema evolution for topics that use CSFLE. You can add a field if you declare it as optional, with a nullable union and a default of null. Tableflow re-materializes the table and sets the new column to null for older records. Other changes, including adding required fields, narrowing types, and changing tag classes, require you to recreate the destination table. For more information about schema evolution, see Schema compatibility and evolution.
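As an illustration of the supported change, the following Python sketch checks whether a new Avro field is declared as optional. The field names loyalty_tier and phone and the helper name are hypothetical. The sketch expects null to be the first branch of the union, which is an assumption based on the long-standing Avro convention that a default matches the first branch.

```python
def is_optional_with_null_default(field):
    """Return True if an Avro field is a null-first union with a default of null."""
    field_type = field.get("type")
    is_nullable_union = (
        isinstance(field_type, list)
        and len(field_type) > 1
        and field_type[0] == "null"
    )
    has_null_default = "default" in field and field["default"] is None
    return is_nullable_union and has_null_default


# Additive and optional: a nullable union with a default of null.
loyalty_tier = {"name": "loyalty_tier", "type": ["null", "string"], "default": None}

# Required field: no union and no default, so adding it is not an additive optional change.
phone = {"name": "phone", "type": "string"}

for candidate in (loyalty_tier, phone):
    status = "optional" if is_optional_with_null_default(candidate) else "not optional"
    print(candidate["name"], status)
```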
Loss of key access during materialization
If Tableflow loses access to your KEK while it materializes a topic, the topic stays in the RUNNING state. Records that arrive after the cached DEK expires are written to the table with the tagged columns as ciphertext.
Decrypt and store sensitive fields as plaintext
This procedure sets up CSFLE so that Tableflow decrypts the tagged fields and stores them as plaintext columns in your destination table. Decrypting fields during materialization requires that you share key access with Confluent Cloud. To keep the fields encrypted instead, see Store sensitive fields in encrypted format.
The steps use the example topic customer-records, the subject customer-records-value, and a KEK named csfle-pii-kek.
Step 1: Register tag definitions
Register the PII and PCI tags once per environment. After you register them, the tags are reusable across all subjects in that environment.
Sign in to the Confluent Cloud Console at https://confluent.cloud/.
In the navigation menu, click Environments, and then select the environment that hosts your cluster.
In the navigation menu, click Catalog management.
On the Tags tab, click Create tags. If tags already exist, click Add tag.
For Tag name, enter PII, add a description, and click Create.
Repeat to register the PCI tag.
Alternatively, use the REST API. Set SR_URL, SR_API_KEY, and SR_API_SECRET for your Schema Registry cluster, then create the tag definitions:
curl -s -X POST -u "$SR_API_KEY:$SR_API_SECRET" \
  -H "Content-Type: application/json" \
  "$SR_URL/catalog/v1/types/tagdefs" \
  -d '[{"name": "PII", "description": "Personally identifiable information"},
       {"name": "PCI", "description": "Payment card industry data"}]'
Step 2: Create a customer-managed KEK in AWS KMS
In AWS KMS, create a symmetric key with encrypt and decrypt usage, and copy its key Amazon Resource Name (ARN). You reference this ARN when you register the key in Confluent in Step 3. For instructions, see Creating keys in the AWS KMS documentation.
Step 3: Register the KEK in Confluent
Register the KEK in Confluent and share key access, so that Tableflow can decrypt the tagged fields.
In the navigation menu, click Schema Registry, and then open the Encryption keys tab.
Click Add encryption key.
For Name, enter csfle-pii-kek. This is the name your rule references.
For Key management system provider, select AWS.
For Amazon resource name (key ID), enter the key ARN from Step 2.
Turn on Share key access with Confluent Cloud.
Copy the AWS KMS policy statement that Confluent Cloud displays. You add it to your key policy in Step 4.
Click Add.
Alternatively, use the REST API to register the KEK with shared set to true, which shares key access:
curl -s -X POST -u "$SR_API_KEY:$SR_API_SECRET" \
  -H "Content-Type: application/json" \
  "$SR_URL/dek-registry/v1/keks" \
  -d '{
    "name": "csfle-pii-kek",
    "kmsType": "aws-kms",
    "kmsKeyId": "<kms_key_arn>",
    "shared": true
  }'
Sharing key access is required for this outcome. If you do not share key access, Tableflow cannot decrypt the fields, and the materialized table contains ciphertext.
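The relationship between the shared setting and the Tableflow outcome can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, the request body mirrors the REST request shown above, and treating an omitted shared value as false follows the guidance in Store sensitive fields in encrypted format.

```python
import json


def build_kek_request(name, kms_key_arn, shared):
    """Build the JSON body for registering a KEK (mirrors the curl request above)."""
    return {
        "name": name,
        "kmsType": "aws-kms",
        "kmsKeyId": kms_key_arn,
        "shared": shared,
    }


def tableflow_outcome(kek):
    """Return how Tableflow materializes tagged fields for this KEK registration."""
    # An omitted "shared" value is treated the same as false.
    return "plaintext" if kek.get("shared", False) else "ciphertext"


# "<kms_key_arn>" is a placeholder for the key ARN that you copied in Step 2.
shared_kek = build_kek_request("csfle-pii-kek", "<kms_key_arn>", shared=True)
private_kek = build_kek_request("csfle-pii-kek", "<kms_key_arn>", shared=False)

print(json.dumps(shared_kek, indent=2))
print(tableflow_outcome(shared_kek))   # prints "plaintext"
print(tableflow_outcome(private_kek))  # prints "ciphertext"
```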
Step 4: Grant Confluent access in your AWS KMS key policy
Add the policy statement from Step 3 to your AWS KMS key policy, alongside the existing statement that lets your account administer the key. The statement authorizes the Confluent Schema Registry role to call kms:Encrypt and kms:Decrypt under a TenantID condition for your Schema Registry cluster. For instructions, see Key policies in AWS KMS.
Step 5: Create the topic
Create the Kafka topic that you materialize with Tableflow.
In the navigation menu, click Clusters, and then select your cluster.
In the navigation menu, click Topics, and then click Add topic.
For Topic name, enter customer-records.
For Partitions, enter the number of partitions that you want.
Click Create with defaults.
When Confluent Cloud prompts you to create a data contract, click Skip. You create the schema in Step 6.
Step 6: Create the schema with tagged fields
Create the schema for the topic, and tag the sensitive fields with the PII and PCI tags. The confluent:tags attributes mark the fields that the encryption rule encrypts.
On the customer-records topic page, click the Data contract tab.
On the Value subtab, click Create data contract.
For the schema type, select Avro.
Paste the following schema:
{
  "type": "record",
  "name": "CustomerRecord",
  "namespace": "io.confluent.tableflow.csfle",
  "fields": [
    {"name": "customer_id", "type": "string"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string", "confluent:tags": ["PII"]},
    {"name": "ssn", "type": "string", "confluent:tags": ["PII"]},
    {"name": "card_number", "type": "string", "confluent:tags": ["PCI"]},
    {"name": "created_at", "type": "string"}
  ]
}
Click Create. The subject customer-records-value appears at version 1.
Alternatively, use the REST API to register the schema for the customer-records-value subject. You add the encryption rule in Step 7.
curl -s -X POST -u "$SR_API_KEY:$SR_API_SECRET" \
  -H "Content-Type: application/json" \
  "$SR_URL/subjects/customer-records-value/versions" \
  -d '{
    "schemaType": "AVRO",
    "schema": "{\"type\":\"record\",\"name\":\"CustomerRecord\",\"namespace\":\"io.confluent.tableflow.csfle\",\"fields\":[{\"name\":\"customer_id\",\"type\":\"string\"},{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"email\",\"type\":\"string\",\"confluent:tags\":[\"PII\"]},{\"name\":\"ssn\",\"type\":\"string\",\"confluent:tags\":[\"PII\"]},{\"name\":\"card_number\",\"type\":\"string\",\"confluent:tags\":[\"PCI\"]},{\"name\":\"created_at\",\"type\":\"string\"}]}"
  }'
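To confirm which fields the tags cover before you add the rule, you can inspect the schema locally. The following Python sketch uses only the standard library; the helper name tagged_fields is hypothetical, and the sketch looks at top-level fields only, so it would not find tags on nested records.

```python
import json

# The example schema from this step.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "CustomerRecord",
  "namespace": "io.confluent.tableflow.csfle",
  "fields": [
    {"name": "customer_id", "type": "string"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string", "confluent:tags": ["PII"]},
    {"name": "ssn", "type": "string", "confluent:tags": ["PII"]},
    {"name": "card_number", "type": "string", "confluent:tags": ["PCI"]},
    {"name": "created_at", "type": "string"}
  ]
}
"""


def tagged_fields(schema, tags):
    """Return the names of top-level fields that carry at least one of the given tags."""
    wanted = set(tags)
    return [
        field["name"]
        for field in schema["fields"]
        if wanted & set(field.get("confluent:tags", []))
    ]


schema = json.loads(SCHEMA_JSON)
print(tagged_fields(schema, ["PII", "PCI"]))  # prints ['email', 'ssn', 'card_number']
print(tagged_fields(schema, ["PCI"]))         # prints ['card_number']
```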
Step 7: Add the encryption rule
Add an ENCRYPT rule that encrypts the tagged fields with your KEK.
On the data contract for customer-records, open the Rules tab and click Add rule.
For Category, select Field level encryption rule.
For Rule name, enter encrypt-sensitive. Kind is set to TRANSFORM and Type is set to ENCRYPT.
Under Encrypt fields with, for Tags, select PII and PCI.
Under using, for Encryption key, select csfle-pii-kek.
For the mode, select WRITE and READ.
Under Apply action on, leave On failure set to ERROR.
Click Save.
Alternatively, use the REST API to register the schema with a ruleSet that adds the ENCRYPT rule on the PII and PCI tags:
curl -s -X POST -u "$SR_API_KEY:$SR_API_SECRET" \
  -H "Content-Type: application/json" \
  "$SR_URL/subjects/customer-records-value/versions" \
  -d '{
    "schemaType": "AVRO",
    "schema": "<the_schema_from_step_6>",
    "ruleSet": {
      "domainRules": [{
        "name": "encrypt-sensitive",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII", "PCI"],
        "params": {"encrypt.kek.name": "csfle-pii-kek"},
        "onFailure": "ERROR"
      }]
    }
  }'
After you save the rule, producers that use a CSFLE-enabled serializer encrypt the email, ssn, and card_number fields the next time they produce a record.
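The REST request expects the schema as a JSON-escaped string, which is tedious to write by hand. The following Python sketch shows one way to generate the full request body with the standard library, assuming the Step 6 schema and the rule shown in this step. It only builds the body and does not send a request.

```python
import json

# The Avro schema from Step 6, as a Python dict.
schema = {
    "type": "record",
    "name": "CustomerRecord",
    "namespace": "io.confluent.tableflow.csfle",
    "fields": [
        {"name": "customer_id", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "email", "type": "string", "confluent:tags": ["PII"]},
        {"name": "ssn", "type": "string", "confluent:tags": ["PII"]},
        {"name": "card_number", "type": "string", "confluent:tags": ["PCI"]},
        {"name": "created_at", "type": "string"},
    ],
}

# The ENCRYPT rule from this step.
rule_set = {
    "domainRules": [{
        "name": "encrypt-sensitive",
        "kind": "TRANSFORM",
        "type": "ENCRYPT",
        "mode": "WRITEREAD",
        "tags": ["PII", "PCI"],
        "params": {"encrypt.kek.name": "csfle-pii-kek"},
        "onFailure": "ERROR",
    }]
}

body = {
    "schemaType": "AVRO",
    # The schema travels as a string, so it is serialized once here and
    # escaped a second time when the whole body is serialized below.
    "schema": json.dumps(schema, separators=(",", ":")),
    "ruleSet": rule_set,
}

request_body = json.dumps(body, indent=2)
print(request_body)
```

You could write request_body to a file and pass it to curl with -d @<file> instead of an inline string.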
Step 8: Enable Tableflow on the topic
Because you shared key access in Step 3, the decrypt-and-store outcome applies. Your storage and provider integration are already in place from the Requirements, so this step only enables Tableflow on the topic. Enable Tableflow on customer-records as described in the Tableflow quick start.
Step 9: Produce records
Produce records with any CSFLE-enabled Kafka client. Because the encryption rule lives on the subject, the serializer encrypts the tagged fields automatically: your producer code matches a standard Avro producer, with the CSFLE rule-executor libraries for your KMS provider added. For ready-to-run examples, see Code Examples for Client-Side Field Level Encryption in Confluent Cloud and the Quick Start - Use CSFLE to Protect Sensitive Data.
When you share key access with Confluent Cloud, the producer does not need AWS credentials, because Schema Registry brokers the DEK. If you do not share key access, the producer needs AWS credentials with kms:Encrypt on the KEK.
Step 10: Verify the results
On the wire, the tagged fields are encrypted. In the Confluent Cloud Console, go to Topics > customer-records > Messages. The customer_id, name, and created_at fields appear as plaintext, and email, ssn, and card_number appear as base64 ciphertext.
In the materialized table, the tagged fields are decrypted. Query the customer-records table from an analytics engine, such as Amazon Athena, Databricks Unity Catalog, Snowflake, or Flink. The tagged fields appear as plaintext columns:
customer_id | name | email | ssn | card_number
CUST-000001 | Jane Doe | jane@example.com | 123-45-6789 | 4111111111111111
Store sensitive fields in encrypted format
To keep the tagged fields encrypted at rest instead of decrypting them, follow the decrypt-and-store procedure with two changes:
- When you register the KEK (Step 3), leave Share key access with Confluent Cloud turned off. For the REST API, omit "shared": true or set it to false.
- Skip the AWS KMS key policy grant (Step 4). Do not add the Confluent statement to your key policy.
Because Schema Registry cannot broker the DEK without key access, the producer needs its own AWS credentials with kms:Encrypt on the KEK so that it can wrap the data encryption key itself. Everything else stays the same. Because Tableflow cannot access your key, it materializes the tagged fields as base64 ciphertext in your Iceberg or Delta Lake table, and downstream consumers decrypt them when needed.
