Use Self-Managed Encryption Keys with Tableflow

Tableflow supports self-managed encryption keys (also known as BYOK) to enable organizations in sensitive industries to take advantage of Tableflow while meeting compliance requirements for data-at-rest encryption. With self-managed encryption key support, Tableflow can read from clusters with self-managed encryption keys and write to storage options with self-managed encryption keys, including Confluent Managed Storage and Bring Your Own Storage (BYOS).

This topic covers encryption behavior, limitations, and validation procedures for Tableflow with self-managed encryption keys.

Confluent Managed Storage encryption

When you use Confluent Managed Storage with Tableflow, the self-managed encryption key used for your Kafka cluster is automatically reused for Tableflow Confluent Managed Storage. This ensures consistent encryption across your Kafka data and Tableflow tables.

For an overview of storage behavior and capabilities, see Confluent Managed Storage.

Important

  • The encryption key cannot be changed separately for Tableflow Confluent Managed Storage after it’s configured.

  • Key rotations impact both Kafka data and Tableflow data stored in Confluent Managed Storage.

  • If you revoke access to the encryption key, both Kafka and Tableflow data in Confluent Managed Storage become inaccessible.

User-managed AWS S3 storage encryption

You can use your own AWS S3 buckets with AWS KMS encryption to bring self-managed encryption keys to Tableflow. This approach gives you full control over both the storage infrastructure and the encryption keys, which helps organizations in sensitive industries meet compliance requirements.

This corresponds to the Bring Your Own Storage (BYOS) option for Tableflow. You can choose between two AWS encryption modes:

  • SSE-KMS (Server-Side Encryption with KMS): Standard single-layer encryption using AWS KMS keys.

  • DSSE-KMS (Dual-layer Server-Side Encryption with KMS): Enhanced security with two independent layers of encryption for highly sensitive data.

For detailed information on these encryption modes, see SSE-KMS and DSSE-KMS in the AWS documentation. For setup steps and required IAM and KMS policies, see Configure Storage for Tableflow.

To configure self-managed encryption keys with user-managed S3 buckets, you must add the required KMS permissions to the KMS key used to encrypt your S3 bucket. For more information about configuring KMS permissions with provider integrations, see Create an AWS Provider Integration in Confluent Cloud.
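The key policy addition described above can be sketched as follows. This is an illustrative statement only: the role ARN is a placeholder for the IAM role created by your Confluent provider integration, and the exact policy your setup requires is given in the AWS setup guide.

```python
import json

# Sketch of a KMS key policy statement granting the provider-integration
# IAM role the actions Tableflow needs. The role ARN is a placeholder;
# substitute the role from your Confluent provider integration.
confluent_role_arn = "arn:aws:iam::111122223333:role/confluent-provider-integration"

statement = {
    "Sid": "AllowTableflowUseOfTheKey",
    "Effect": "Allow",
    "Principal": {"AWS": confluent_role_arn},
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey",
    ],
    "Resource": "*",  # in a key policy, "*" refers to this key itself
}

# In practice you append this statement to the key's existing policy
# document rather than replacing it.
policy = {"Version": "2012-10-17", "Statement": [statement]}
print(json.dumps(policy, indent=2))
```

For multi-account setups, the same actions must also appear in the IAM role's permission policy, not only in the key policy.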

Important

For multi-account setups, you must update both the IAM role permissions and the KMS key policy to grant the necessary KMS actions to the Confluent Cloud role.

User-managed Azure Data Lake Storage encryption

You can use your own Azure Data Lake Storage (ADLS) Gen2 account with customer-managed keys in Azure Key Vault to bring self-managed encryption keys to Tableflow. This approach gives you full control over both the storage infrastructure and the encryption keys, which helps organizations in sensitive industries meet compliance requirements.

This corresponds to the Bring Your Own Storage (BYOS) option for Tableflow. ADLS uses server-side encryption with customer-managed keys stored in Azure Key Vault.

ADLS does not require explicit permissions to be added to the encryption key used for storage. The storage account’s managed identity handles access to the encryption key automatically through Azure RBAC. For more information about configuring Azure provider integrations, see Create an Azure Provider Integration in Confluent Cloud.

Important

For Azure Dedicated clusters using Tableflow, you must update the firewall settings on the Azure Key Vault used for the Kafka cluster encryption key. This is a separate key from the one used for ADLS encryption.

In the Azure Key Vault Networking settings, change the firewall configuration from Disable public access to Allow public access from all networks. Ensure Allow Microsoft Trusted services is also selected. This enables Tableflow to access the Kafka cluster’s encryption key. For more, see Configure network security for Azure Key Vault.
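The required networking state can be expressed as a small check. The property names below mirror the Azure portal settings described above rather than any specific Azure SDK schema, so treat them as illustrative:

```python
# Desired Key Vault networking state for Tableflow access. Property names
# here are illustrative, not a specific Azure SDK schema.
DESIRED = {
    "public_network_access": "Enabled",  # "Allow public access from all networks"
    "bypass": "AzureServices",           # "Allow Microsoft Trusted services"
}

def vault_needs_update(current: dict) -> list:
    """Return the settings that differ from what Tableflow requires."""
    return [k for k, v in DESIRED.items() if current.get(k) != v]

# Example: a vault set to "Disable public access" fails both checks.
locked_down = {"public_network_access": "Disabled", "bypass": "None"}
print(vault_needs_update(locked_down))
```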

Self-managed encryption key workflows for Tableflow

Tableflow supports two workflows for using self-managed encryption keys: one with Confluent Managed Storage and one with user-managed storage:

Confluent Managed Storage with self-managed encryption keys

When your Kafka cluster uses self-managed encryption keys, Tableflow with Confluent Managed Storage automatically inherits the same encryption key with no additional configuration required:

  1. Navigate to your Kafka topic in the Confluent Cloud Console.

  2. Click Enable Tableflow.

  3. Select Confluent Managed Storage as the destination storage.

  4. Complete the Tableflow setup.

All data at rest is automatically encrypted using the self-managed encryption key associated with your Kafka cluster.

User-managed storage with self-managed encryption

To set up storage with your own AWS S3 bucket or ADLS and apply the required IAM and KMS policies, follow the steps in Configure Storage for Tableflow. That guide covers BYOS setup and includes the KMS key policy update.

Validation and error handling

Tableflow performs validation to ensure proper access to encryption keys:

Key access validation

During setup, Tableflow validates that it can successfully read from and write to encrypted storage using the configured encryption keys. If validation fails, Tableflow returns an error message; take the corresponding action:

  • Missing KMS permissions: Verify that your IAM role has the required KMS actions (kms:Encrypt, kms:Decrypt, kms:ReEncrypt*, kms:GenerateDataKey*, kms:DescribeKey) and that the KMS key policy allows access from your IAM role.

  • Invalid or disabled encryption keys: Check that the KMS key exists, is enabled, and is in the correct AWS or Azure region.

  • Network connectivity issues: Ensure your AWS or Azure account has access to KMS services and that no network policies are blocking the connection.
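The second check above can be sketched as a function over the metadata that `kms:DescribeKey` returns. The field names mirror the AWS response shape, and the expected region is an assumption about where your S3 bucket lives:

```python
def key_validation_errors(key_metadata: dict, expected_region: str) -> list:
    """Check DescribeKey-style metadata for the failure modes listed above."""
    errors = []
    if not key_metadata.get("Enabled", False):
        errors.append(
            "key is %s, not Enabled" % key_metadata.get("KeyState", "unknown")
        )
    # KMS ARNs embed the region: arn:aws:kms:<region>:<account>:key/<id>
    arn = key_metadata.get("Arn", "")
    if arn.split(":")[3:4] != [expected_region]:
        errors.append("key is not in the bucket's region")
    return errors
```

With a real boto3 client, you would pass `kms.describe_key(KeyId=...)["KeyMetadata"]` as the first argument.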

Key unavailability scenarios

If encryption keys become unavailable after Tableflow is configured:

Cluster encryption key issues

  • Tableflow cannot read new data from Kafka topics.

  • Existing Tableflow data in Confluent Managed Storage becomes inaccessible.

  • Tableflow processing is suspended until key access is restored.

AWS S3 bucket key issues

  • Tableflow cannot write new data to the S3 bucket.

  • Tableflow processing is suspended until key access is restored.

  • Existing data in S3 remains encrypted but inaccessible without the key.

Recovery

When using AWS KMS, follow these steps to restore Tableflow operations after key-related issues:

  1. Identify the issue: Check AWS CloudTrail logs for KMS access denials or key usage patterns.

  2. Restore key access: Re-enable disabled keys, update IAM policies, or restore KMS key policies as needed.

  3. Verify permissions: Test key access using the AWS CLI or console before re-enabling Tableflow.

  4. Monitor resumption: Tableflow should automatically resume processing after key access is restored. This might take a few minutes.

If issues persist or if you need assistance with complex key recovery scenarios, contact Confluent Support.
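Steps 2 and 3 above can be sketched with boto3-style KMS calls. This is a minimal sketch, not a complete recovery tool: in practice you would pass a real client (`kms = boto3.client("kms")`) and your own key ID, and states such as PendingDeletion require manual recovery first.

```python
def restore_and_verify(kms, key_id: str) -> bool:
    """Re-enable a disabled key, then prove access with a data-key round trip."""
    state = kms.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"]
    if state == "Disabled":
        kms.enable_key(KeyId=key_id)  # step 2: restore key access
    elif state != "Enabled":
        # PendingDeletion and similar states need manual intervention
        # (cancel the deletion, restore the key policy) first.
        return False
    # Step 3: verify permissions before expecting Tableflow to resume.
    kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    return True
```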

Azure Data Lake Storage key issues

  • Tableflow cannot write new data to ADLS.

  • Tableflow processing is suspended until key access is restored.

  • Existing data in ADLS remains encrypted but inaccessible without the key.

  • Check that the storage account’s managed identity has access to the customer-managed key in Azure Key Vault.

Recovery

When using Azure Key Vault, follow these steps to restore Tableflow operations after key-related issues:

  1. Identify the issue: Check Azure Monitor logs and Key Vault diagnostics for access denials or key usage patterns.

  2. Restore key access: Re-enable disabled keys in Azure Key Vault, update Azure RBAC permissions, or verify the storage account’s managed identity has appropriate access to the customer-managed key.

  3. Verify permissions: Test key access using the Azure CLI or Azure Portal. Verify that the managed identity has the required permissions (Get, Unwrap Key, Wrap Key) on the Key Vault.

  4. Verify network access: For Dedicated clusters, ensure the Key Vault firewall is configured to allow public access from all networks.

  5. Monitor resumption: Tableflow should automatically resume processing after key access is restored. This might take a few minutes.

If issues persist or if you need assistance with complex key recovery scenarios, contact Confluent Support.
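The permission check in step 3 can be sketched as a set comparison. The permission names follow Key Vault access-policy conventions for the Get, Unwrap Key, and Wrap Key operations named above; adjust the names if you grant access through Azure RBAC roles instead:

```python
# Key permissions the storage account's managed identity needs on the
# Key Vault, per the recovery steps above. Names follow access-policy
# conventions; Azure RBAC role assignments use different identifiers.
REQUIRED_KEY_PERMISSIONS = {"get", "unwrapKey", "wrapKey"}

def missing_key_permissions(granted: set) -> set:
    """Return the required permissions the identity does not yet have."""
    return REQUIRED_KEY_PERMISSIONS - granted
```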

Security considerations for Tableflow

  • Data caching and key revocation: Confluent internal systems may locally cache decrypted data and keys for up to six hours for performance and cost efficiency. Confluent proactively evicts cached data and keys on a best-effort basis when key revocation is detected.

  • Data in transit: All data transfers use TLS 1.2 encryption by default when Tableflow writes to destination storage.

  • Key rotation: Automatic key rotation is supported and applied to both new and existing Tableflow data.

  • Catalog integration: When using catalogs with Tableflow and self-managed keys, you must grant the catalog access to the encryption key that encrypts the Tableflow data. For AWS, this is the KMS key. For Azure, this is the customer-managed key in Azure Key Vault. This applies to Confluent Managed Storage and Bring Your Own Storage (BYOS). For detailed instructions, see Integrate Catalogs with Tableflow in Confluent Cloud.

  • Azure Key Vault network security: For Azure Dedicated clusters, ensure proper Key Vault firewall configuration to allow Tableflow access to encryption keys.

  • Compliance: Tableflow with self-managed encryption keys helps organizations in sensitive and regulated industries meet compliance requirements for data-at-rest encryption while leveraging the benefits of open table formats and real-time data processing.