Tiered Storage in Confluent Platform

Tiered Storage is a Confluent Platform feature that offloads older log segments from broker disk to cost-effective object storage, enabling infinite data retention without increasing broker storage capacity.

Confluent Tiered Storage makes storing huge volumes of data in Apache Kafka® manageable by reducing operational burden and cost. Tiered Storage helps separate data storage from the concerns of data processing, which enables each to scale independently. With Confluent Tiered Storage, you can send warm data to cost-effective object storage, and scale brokers only when you need more compute resources.

Important

This section describes Confluent Tiered Storage and how to configure it. This is a different feature than Kafka Tiered Storage. When you run Confluent Platform, you should use Confluent Tiered Storage.
Compacted topics are supported by Confluent Tiered Storage starting with Confluent Platform 7.6. To learn how to set up and use this feature, see Configuring Tiered Storage to support compacted topics.
The same bucket must be used across all brokers within a Tiered Storage enabled cluster. This applies to all supported platforms.
If you are using Tiered Storage with Cluster Linking bidirectional links as a part of a disaster recovery (DR) failback strategy on Confluent Platform versions 7.9.0 - 7.9.2 or 8.0.0, upgrade to a newer version before running the truncate-and-restore. For details, see Restore mirroring after a failover with truncate-and-restore.

Known limitations

Tiered Storage has specific requirements for object storage platforms and storage configurations:

Tiered Storage uses the Amazon S3 API; however, non-certified object stores are not supported. Currently, the supported object stores are Amazon S3, Google Cloud Storage (GCS), Azure Blob Storage, Pure Storage FlashBlade, Nutanix Objects, NetApp Object Storage, Dell EMC ECS, MinIO, Cloudian, Ceph, and Scality.
Just a Bunch of Disks (JBOD) is not supported because Tiered Storage does not currently support multiple log directories, an inherent requirement of JBOD.

How to enable Tiered Storage on a broker

Enabling Tiered Storage on a broker configures the broker to offload log segments to object storage and fetch tiered data when consumers request it. To enable Confluent Tiered Storage on a cluster running Confluent Platform, specify the configurations for your cloud provider as described in the sections that follow. After you update these configurations, restart the brokers enabled for Tiered Storage. This restart can be done in a rolling fashion.

You can also begin this procedure from Confluent Control Center, as described in Configure Tiered Storage.

If you later disable Tiered Storage, keep confluent.tier.feature=true so that previously tiered data can still be fetched. For more information, see Disabling Tiered Storage.

Note

Starting with Confluent Platform 7.9.1, Internet Protocol version 6 (IPv6) is supported. IPv6 support is subject to the following constraints:

It can be enabled only on clusters using Java 11 or above.
It is only supported on clusters running in KRaft mode. It may work on clusters running in ZooKeeper mode, but it is not supported.
Tiered Storage: To enable IPv6 for Tiered Storage, set the following JVM property:
- -Djava.net.preferIPv6Addresses=true
In addition, the following property was added with a default of true. You can set it to false to disable IPv6 for Tiered Storage:
- confluent.tier.s3.ipv6.enabled=true

Amazon S3 (AWS)

Tiered Storage on AWS requires an Amazon Simple Storage Service (S3) bucket and credentials with read and write access to it.

To enable Tiered Storage on AWS with S3:

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.prefix=<DIRECTORY-PATH>
confluent.tier.s3.region=<REGION>
# confluent.tier.metadata.replication.factor=1
```
Tip
- The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
- Supported object storage configuration on AWS S3 is standard class.
Adding the preceding properties enables the Tiered Storage components on AWS with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Amazon S3, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data.
- confluent.tier.s3.prefix indicates a directory inside the S3 bucket, for example: confluent.tier.s3.prefix=confluent-tiered-storage-dir/
For example, a bucket named tiered-storage-test-aws located in the us-west-2 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-aws
confluent.tier.s3.region=us-west-2
```
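As a quick sanity check before a rolling restart, the S3 settings above can be verified with a short script. The following is a minimal sketch, not a Confluent tool: parse_properties and missing_s3_settings are hypothetical helper names, and the list of required keys is an assumption drawn from the properties shown in this section (confluent.tier.enable and confluent.tier.s3.prefix are treated as optional).

```python
# Hypothetical pre-flight check for the S3 Tiered Storage settings shown above.
# Assumption: these four keys are the minimum; the list is not taken from a Confluent API.
REQUIRED_S3_KEYS = (
    "confluent.tier.feature",
    "confluent.tier.backend",
    "confluent.tier.s3.bucket",
    "confluent.tier.s3.region",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_s3_settings(props):
    """Return the required keys that are absent or empty, in a stable order."""
    return [key for key in REQUIRED_S3_KEYS if not props.get(key)]


sample = """
confluent.tier.feature=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=tiered-storage-test-aws
# confluent.tier.metadata.replication.factor=1
"""
# The sample omits the region, so that key is reported as missing.
print(missing_s3_settings(parse_properties(sample)))
```

An empty list means every key in the assumed minimum set is present; it says nothing about whether the bucket or credentials are valid.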
Optionally, configure S3 server-side encryption. By default, Confluent Tiered Storage uses AES256 encryption. If you need to use AWS KMS encryption (SSE-KMS) to meet compliance requirements, add the following properties to your broker.properties file:
```
confluent.tier.s3.sse.algorithm=aws:kms
confluent.tier.s3.sse.customer.encryption.key=<KMS_KEY_ARN>
```
- confluent.tier.s3.sse.algorithm Specifies the S3 server-side encryption algorithm. Valid values are:
  - AES256 (default) - Amazon S3-managed encryption keys (SSE-S3)
  - aws:kms - AWS Key Management Service keys (SSE-KMS)
  - none - Disable server-side encryption
- confluent.tier.s3.sse.customer.encryption.key The ARN of your AWS KMS key. This parameter is required when confluent.tier.s3.sse.algorithm is set to aws:kms.
  Example: arn:aws:kms:us-west-2:123456789012:key/12345678-1234-1234-1234-123456789012
Important
- Tiered Storage does not automatically use the encryption settings configured on your S3 bucket. You must explicitly configure these broker properties to use SSE-KMS encryption.
- When using aws:kms, ensure the AWS principal (IAM role or user) used by your brokers has the necessary KMS permissions (kms:Decrypt, kms:Encrypt, kms:GenerateDataKey) to access the encryption key.
- Setting the encryption algorithm to none results in unencrypted objects in S3.
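The encryption rules above can be expressed as a small check. This is a sketch only: check_sse_settings is a hypothetical helper, not part of Confluent Platform, and it encodes just the three documented algorithm values and the requirement that aws:kms comes with a key ARN.

```python
# Hypothetical validator for the SSE settings described above; not a Confluent API.
VALID_SSE_ALGORITHMS = {"AES256", "aws:kms", "none"}


def check_sse_settings(props):
    """Return a list of problems found in the S3 server-side encryption settings."""
    problems = []
    # AES256 is the documented default when the property is not set.
    algorithm = props.get("confluent.tier.s3.sse.algorithm", "AES256")
    key_arn = props.get("confluent.tier.s3.sse.customer.encryption.key", "")
    if algorithm not in VALID_SSE_ALGORITHMS:
        problems.append(f"unsupported algorithm: {algorithm}")
    if algorithm == "aws:kms" and not key_arn:
        problems.append("aws:kms requires confluent.tier.s3.sse.customer.encryption.key")
    if algorithm == "none":
        problems.append("objects will be stored unencrypted")
    return problems


# aws:kms without a key ARN is flagged; adding the ARN clears the problem.
print(check_sse_settings({"confluent.tier.s3.sse.algorithm": "aws:kms"}))
```

The check does not verify KMS permissions on the AWS principal; those still need to be confirmed in IAM.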
Provide AWS credentials to connect to the S3 bucket. You can set these through broker.properties, through environment variables, or through a short-lived credentials file in ~/.aws/credentials. Any of these methods is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables or the credentials file instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the path of the file that contains your AWS credentials.
  This field is hidden from the server log files.
- Environment Variables - Specify AWS credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
```
  If broker.properties does not specify a credentials file, the broker uses the preceding environment variables to connect to the S3 bucket.
- AWS credentials file - As another alternative, you can save a credentials file with short-lived credentials in ~/.aws/credentials.
- AWS credentials chain - Tiered Storage supports AWS AssumeRole and your configured EC2 instance profile credentials through the default AWS credentials provider chain. If an instance profile is attached to your EC2 instance, the S3 client used by Tiered Storage looks for its credentials.
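The lookup order described above can be modeled in a few lines. This is an illustration of the documented precedence, not broker code: credential_source is a hypothetical function, and checking environment variables before the ~/.aws/credentials file is an assumption, since the text only says that both are fallbacks.

```python
# Illustrative model of the credential lookup order described above; not broker code.
def credential_source(props, env, credentials_file_exists):
    """Return a label for the first credential source that is available."""
    # Credentials supplied through broker.properties take priority.
    if props.get("confluent.tier.s3.cred.file.path"):
        return "broker.properties"
    # Assumption: environment variables are consulted before the credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    if credentials_file_exists:
        return "~/.aws/credentials"
    # Otherwise the rest of the provider chain applies, such as an instance profile.
    return "provider chain"


print(credential_source({}, {"AWS_ACCESS_KEY_ID": "id", "AWS_SECRET_ACCESS_KEY": "secret"}, True))
```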
The S3 bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage.
```
s3:DeleteObject
s3:GetObject
s3:PutObject
s3:GetBucketLocation
s3:ListBucket
```
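One way to grant these actions is an IAM policy scoped to the bucket. The sketch below builds such a policy document as JSON; tiered_storage_policy is a hypothetical helper, and splitting the actions into an object-level statement and a bucket-level statement is a common IAM convention rather than something this document prescribes.

```python
import json

# Actions listed above, split by the resource type they apply to in IAM.
OBJECT_ACTIONS = ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"]
BUCKET_ACTIONS = ["s3:GetBucketLocation", "s3:ListBucket"]


def tiered_storage_policy(bucket):
    """Build an IAM policy document allowing the listed actions on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": OBJECT_ACTIONS,
                "Resource": f"arn:aws:s3:::{bucket}/*",  # objects in the bucket
            },
            {
                "Effect": "Allow",
                "Action": BUCKET_ACTIONS,
                "Resource": f"arn:aws:s3:::{bucket}",  # the bucket itself
            },
        ],
    }


print(json.dumps(tiered_storage_policy("tiered-storage-test-aws"), indent=2))
```

If you use SSE-KMS, the KMS permissions mentioned earlier are granted separately on the key, not in this bucket policy sketch.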

Google Cloud Storage (GCS)

To enable Tiered Storage on Google Cloud Platform (GCP) with GCS, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=GCS
confluent.tier.gcs.bucket=<BUCKET_NAME>
confluent.tier.gcs.region=<REGION>
# confluent.tier.metadata.replication.factor=1
```
Tip
- The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
- Supported object storage configuration on GCS is standard class.
Adding the preceding properties enables the Tiered Storage components on GCS with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service a broker connects to. For Google Cloud Storage, set this to GCS as shown in the preceding configuration.
- BUCKET_NAME and REGION are the GCS bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data.
For example, a bucket named tiered-storage-test-gcs located in the us-central1 region would have these properties:
```
confluent.tier.gcs.bucket=tiered-storage-test-gcs
confluent.tier.gcs.region=us-central1
```
Provide GCS credentials to connect to the GCS bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.gcs.cred.file.path=<PATH_TO_GCS_CREDENTIALS_FILE>
```
  This field is hidden from the server log files.
- Environment Variables - Specify GCS credentials with this local environment variable:
```
export GOOGLE_APPLICATION_CREDENTIALS=<PATH_TO_GCS_CREDENTIALS_FILE>
```
  If broker.properties does not contain the property with the path to the credentials file, the broker uses the preceding environment variable to connect to the GCS bucket.
See the GCS documentation for more information.
The GCS bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage.
```
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

Troubleshooting certificates

If the brokers fail to start due to Tiered Storage errors such as inability to access buckets and security certificate issues, make sure that you have the needed Google CA certificate(s). To troubleshoot:

Go to Google Trust Services repository, scroll down to the section Download CA certificates, and click Expand all.
Choose a certificate suitable for your cluster (for example, GlobalSign R4) that is currently valid (not yet expired), click the Action drop-down next to it, and download the Certificate (PEM) file to all the brokers in the cluster.

Import the certificate by running the following command:

keytool -import -trustcacerts -keystore /var/ssl/private/kafka_broker.truststore.jks -alias root -file <certificate.pem file>

Azure Blob Storage (Azure)

To enable Tiered Storage on Microsoft Azure using Azure Blob Storage, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=AzureBlockBlob
confluent.tier.azure.block.blob.container=your-container-name
confluent.tier.azure.block.blob.cred.file.path=<path to az-cred.json>
# confluent.tier.metadata.replication.factor=1
```
Tip
- The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
- The Hot tier is the default, hardcoded tier for Azure. If you want to override this, use an Azure lifecycle policy instead.
- Storage account must be of type Standard general-purpose v2.
- Azure Blob Storage access tier is recommended as the Hot tier.
Adding the preceding properties enables the Tiered Storage components on Azure with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Azure, set this to AzureBlockBlob as shown in the preceding configuration.
- confluent.tier.azure.block.blob.container is the name of your container in the Azure portal.
- confluent.tier.azure.block.blob.cred.file.path is the path to the credentials file used to access Azure. If the brokers do not find credentials in broker.properties, they use the order and location specified in DefaultAzureCredential.
For example, a container named tiered-storage-test-azure would have this property:
```
confluent.tier.azure.block.blob.container=tiered-storage-test-azure
```
Provide Azure credentials to connect to the container. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.azure.block.blob.cred.file.path=<PATH_TO_AZURE_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AZURE_CREDENTIALS_FILE> with the path of the file that contains your Azure Blob credentials. To learn more, see Azure Blob storage documentation.
  This field is hidden from the server log files.
  Here are examples of Azure credentials files; cred.file.path is looking for either of these:
  - JSON file with one field for a connectionString
    { "connectionString": "DefaultEndpointsProtocol=https;AccountName=tieredstoragesystemtest;AccountKey=xxxxxxxxx;EndpointSuffix=core.windows.net" }
  - JSON file with three fields for azureClientId, azureTenantId, and azureClientSecret
    { "azureClientId": "xxxxxx", "azureTenantId": "xxxxxx", "azureClientSecret": "xxxxxx" }
- Environment Variables - Specify Azure credentials with these environment variables:
```
export AZURE_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AZURE_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
```
  If broker.properties does not specify a credentials file, the broker uses the preceding environment variables to connect to the container.
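Because the credentials file must be valid JSON in one of the two shapes described above, it can help to check it before restarting brokers. The following is a minimal sketch; azure_credential_kind is a hypothetical helper, and the returned labels are illustrative names, not values used by Confluent Platform.

```python
import json

CLIENT_FIELDS = {"azureClientId", "azureTenantId", "azureClientSecret"}


def azure_credential_kind(text):
    """Classify an Azure credentials file by the fields it contains."""
    # json.loads raises ValueError on invalid JSON, including trailing commas.
    data = json.loads(text)
    if "connectionString" in data:
        return "connection-string"
    if CLIENT_FIELDS.issubset(data):
        return "client-secret"
    raise ValueError("expected a connectionString field or the three azureClient* fields")


print(azure_credential_kind('{ "azureClientId": "x", "azureTenantId": "y", "azureClientSecret": "z" }'))
```

A file that passes this check is only well-formed; whether the credentials grant access to the container still has to be confirmed against Azure.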
The container should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage.
```
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
```

Pure Storage FlashBlade

To enable Tiered Storage on Pure Storage FlashBlade through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<FLASHBLADE ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPv6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on Pure Storage FlashBlade with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Pure Storage FlashBlade, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. Set the region to any valid AWS region; these are all equivalent in FlashBlade. The region cannot be null or empty. For example, a bucket named tiered-storage-test-aws located in the us-west-2 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-aws
confluent.tier.s3.region=us-west-2
```
- confluent.tier.s3.aws.endpoint.override refers to the Pure Storage FlashBlade connection point.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
```
confluent.tier.s3.aws.endpoint.override=https://flashblade.example.com
# For Confluent Platform 7.9.1 or later, IPv6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
```
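The interaction between the endpoint override and the IPv6 setting can be checked mechanically. This is a sketch under stated assumptions: ipv6_override_problem and parse_version are hypothetical helpers, the version is passed in as a plain dotted string, and the default of true for confluent.tier.s3.ipv6.enabled follows the IPv6 note earlier in this document.

```python
def parse_version(version):
    """Turn '7.9.1' into (7, 9, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def ipv6_override_problem(props, cp_version):
    """Return a message if the IPv6 setting conflicts with an endpoint override, else None."""
    has_override = bool(props.get("confluent.tier.s3.aws.endpoint.override"))
    # The property defaults to true when it is not set explicitly.
    ipv6_enabled = props.get("confluent.tier.s3.ipv6.enabled", "true").lower() == "true"
    if has_override and parse_version(cp_version) >= (7, 9, 1) and ipv6_enabled:
        return "set confluent.tier.s3.ipv6.enabled=false when using an endpoint override"
    return None


props = {"confluent.tier.s3.aws.endpoint.override": "https://flashblade.example.com"}
print(ipv6_override_problem(props, "7.9.1"))
```

The same check applies to any S3-compatible store configured through an endpoint override, not only FlashBlade.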
Provide credentials generated by the Pure Storage CLI to connect to the FlashBlade S3 bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the file path to the file that contains your AWS credentials in Java property format, like the following:
```
accessKey=ABCDYXXXXXXXXXXXXXX
secretKey=Q+LSASDSDASDdssbbbbb+pG
```
  This field is hidden from the server log files.
- Environment Variables - Specify Pure Storage FlashBlade credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
```
  If broker.properties does not specify a credentials file, the broker uses the preceding environment variables to connect to the S3 bucket.

Nutanix Objects

To enable Tiered Storage on Nutanix Objects through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<NUTANIX OBJECTS ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPv6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on Nutanix Objects with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Nutanix Objects, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. Set the region to any valid AWS region. The region cannot be null or empty. For example, a bucket named tiered-storage-test-confluent located in the us-east-1 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-confluent
confluent.tier.s3.region=us-east-1
```
- confluent.tier.s3.aws.endpoint.override refers to the Nutanix Objects endpoint fully qualified domain name. If ntnx-obj is the object store name and nutanix.com is the domain name, then endpoint override should be similar to the code example that follows.
```
confluent.tier.s3.aws.endpoint.override=http://ntnx-obj.nutanix.com
```
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
Provide credentials generated from the Objects page of Nutanix Prism Central to connect to the Nutanix Objects S3 bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the file path of the file that contains your Nutanix credentials.
  This field is hidden from the server log files.
- Environment Variables - Specify Nutanix Objects credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<NUTANIX_OBJECTS_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<NUTANIX_OBJECTS_ACCESS_KEY>
```
  If broker.properties does not specify a credentials file, the broker uses the preceding environment variables to connect to the S3 bucket.
The Nutanix Objects bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage. These operations can be allowed by providing read/write access to the user on the bucket.
```
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

NetApp Object Storage

To enable Tiered Storage on NetApp Object Storage through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<NETAPP OBJECT STORAGE ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPv6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
confluent.tier.s3.force.path.style.access=<BOOLEAN>
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on NetApp Object Storage with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For NetApp Object Storage, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. Set the region to any valid AWS region. The region cannot be null or empty. For example, a bucket named tiered-storage-test-confluent located in the us-east-1 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-confluent
confluent.tier.s3.region=us-east-1
```
- confluent.tier.s3.aws.endpoint.override refers to the NetApp Object Storage endpoint.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
- confluent.tier.s3.force.path.style.access configures the client to use path-style access for all requests. This flag is not enabled by default; the default behavior is to detect which access style to use based on the configured endpoint and the bucket being accessed. NetApp supports both virtual-hosted-style and path-style access. Set this parameter to true to force path-style access.
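The difference between the two S3 addressing styles is where the bucket name appears in the request URL. The sketch below illustrates this with a hypothetical object_url helper; the endpoint s3.example.com and the object key are placeholders, and the function only models URL construction, not what the broker's S3 client does internally.

```python
from urllib.parse import urlsplit, urlunsplit


def object_url(endpoint, bucket, key, path_style):
    """Build the request URL for an object under either S3 addressing style."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: the bucket is the first segment of the URL path.
        return urlunsplit((parts.scheme, parts.netloc, f"/{bucket}/{key}", "", ""))
    # Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
    return urlunsplit((parts.scheme, f"{bucket}.{parts.netloc}", f"/{key}", "", ""))


endpoint = "https://s3.example.com"  # placeholder endpoint
print(object_url(endpoint, "tiered-storage-test-confluent", "segment-0001", True))
print(object_url(endpoint, "tiered-storage-test-confluent", "segment-0001", False))
```

Path-style access keeps every request on the endpoint's own hostname, which is often simpler for on-premises object stores where per-bucket DNS names are not set up.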
Provide credentials generated by NetApp Object Storage to connect to the NetApp Object Storage S3 bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the file path of the file that contains your NetApp Object Storage credentials.
  This field is hidden from the server log files.
- Environment Variables - Specify NetApp Object Storage credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<NETAPP_OBJECT_STORAGE_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<NETAPP_OBJECT_STORAGE_ACCESS_KEY>
```
  If broker.properties does not specify a credentials file, the broker uses the preceding environment variables to connect to the S3 bucket.
The NetApp Object Storage bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage. These operations can be allowed by providing read/write access to the user on the bucket.
```
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

Dell EMC ECS

To enable Tiered Storage on Dell EMC ECS through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<DELL EMC ECS ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPv6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
confluent.tier.s3.force.path.style.access=<BOOLEAN>
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on Dell EMC ECS with default values for all other configuration parameters.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Dell EMC ECS, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. Set the region to any valid AWS region. The region cannot be null or empty. For example, a bucket named tiered-storage-test-confluent located in the us-east-1 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-confluent
confluent.tier.s3.region=us-east-1
```
- confluent.tier.s3.aws.endpoint.override refers to the Dell EMC ECS endpoint.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
- confluent.tier.s3.force.path.style.access configures the client to use path-style access for all requests. This flag is not enabled by default; the default behavior is to detect which access style to use based on the configured endpoint and the bucket being accessed. Set this flag to true to force path-style access for all requests.
Provide credentials generated by Dell EMC ECS to connect to the ECS Bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the file path of the file that contains your Dell EMC ECS credentials.
  This field is hidden from the server log files.
- Environment Variables - Specify Dell EMC ECS credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<DELL_EMC_ECS_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<DELL_EMC_ECS_ACCESS_KEY>
```
  If broker.properties does not contain the two properties for credentials, the broker uses the preceding environment variables to connect to the S3 bucket.
The Dell EMC ECS bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage. These operations can be allowed by providing read/write access to the user on the bucket.
```
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

MinIO

To enable Tiered Storage on MinIO through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<MINIO_ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPV6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
confluent.tier.s3.force.path.style.access=true
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single-broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on MinIO, with default values for all other configuration options.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For MinIO Object Storage, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. MinIO defaults to us-east-1 for REGION unless explicitly started with a different region value. Replace REGION with the appropriate value based on your MinIO configuration. You can retrieve this value using the mc command-line tool:
```
mc admin info --json ALIAS | jq ".info.region"
```
  Create the bucket before configuring Tiered Storage. See mc mb for details on creating a bucket.
- confluent.tier.s3.aws.endpoint.override refers to the fully qualified domain name of the MinIO Object Storage endpoint. MinIO recommends using a load balancer configured for round-robin selection of all hosts in the MinIO deployment. If minio is the load balancer hostname and example.net is the domain name, then the endpoint override should be as follows:
```
confluent.tier.s3.aws.endpoint.override=http://minio.example.net
confluent.tier.s3.ipv6.enabled=false
confluent.tier.s3.force.path.style.access=true
```
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
- confluent.tier.s3.force.path.style.access configures the client to use path-style access for all requests. This flag is not enabled by default. The default behavior is to detect which access style to use based on the configured endpoint and the bucket being accessed. Setting this flag results in path-style access being forced for all requests. MinIO deployments typically support path-style access only. There are no advantages to using other access styles.
Provide credentials generated by MinIO to connect to the MinIO Bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_MINIO_CREDENTIALS_FILE>
```
  Replace <PATH_TO_MINIO_CREDENTIALS_FILE> with the path of the file that contains your MinIO Object Storage credentials. The user must have read/write/list permissions on the BUCKET_NAME bucket.
  This field is hidden from the server log files.
- Environment Variables - Specify MinIO credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<MINIO_OBJECT_STORAGE_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<MINIO_OBJECT_STORAGE_ACCESS_KEY>
```
  If broker.properties does not contain the two properties for credentials, the broker uses the preceding environment variables to connect to the S3 bucket.

The MinIO credentials must allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage. These operations can be allowed by providing read/write access to the user on the bucket.

```
{
  "PolicyName": "ConfluentTieredStorage",
  "Policy": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "s3:DeleteObject",
          "s3:GetObject",
          "s3:PutObject",
          "s3:GetBucketLocation"
        ],
        "Resource": [
          "arn:aws:s3:::BUCKET_NAME",
          "arn:aws:s3:::BUCKET_NAME/*"
        ]
      }
    ]
  }
}
```

Use mc admin policy set to assign this policy to the user credentials…

Cloudian HyperStore Object Storage

To enable Tiered Storage on Cloudian HyperStore Object Storage through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<CLOUDIAN_ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPV6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
```
Adding the preceding properties enables the Tiered Storage components on HyperStore Object Storage, with default values for all other configuration options.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Cloudian HyperStore, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the HyperStore bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data.
  For example, a bucket named tiered-storage-test-aws located in the us-west-2 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-aws
confluent.tier.s3.region=us-west-2
```
- confluent.tier.s3.aws.endpoint.override refers to the HyperStore Object Storage connection point.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
Provide credentials generated by Cloudian to connect to the HyperStore bucket. You can set these through broker.properties or through environment variables. Either method is sufficient. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_CREDENTIALS_FILE>
```
  Replace <PATH_TO_CREDENTIALS_FILE> with the path of the file that contains the credentials supplied by Cloudian HyperStore. The file must be in the following format:
```
accessKey=<YOUR_ACCESS_KEY_ID>
secretKey=<YOUR_SECRET_ACCESS_KEY>
```
  This field is hidden from the server log files.
- Environment Variables - Specify HyperStore credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
```
  If broker.properties does not contain the two properties for credentials, the broker uses the preceding environment variables to connect to the bucket.
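
As an illustration of the credentials file format shown above, the following shell sketch writes the two keys to a file that only its owner can read. The function name write_tier_credentials, the example file location, and the restrictive umask are assumptions made for this example rather than Confluent requirements.

```shell
# Hypothetical helper: write a Tiered Storage credentials file in the
# accessKey/secretKey format and keep it readable by the owner only.
write_tier_credentials() {
  cred_path=$1
  access_key=$2
  secret_key=$3
  # The subshell limits the umask change to this one write.
  (
    umask 077
    printf 'accessKey=%s\nsecretKey=%s\n' "$access_key" "$secret_key" > "$cred_path"
  )
}

# Example with placeholder values; substitute the keys issued by your object store.
example_cred_file="${TMPDIR:-/tmp}/tier-credentials.example"
write_tier_credentials "$example_cred_file" EXAMPLE_ACCESS_KEY EXAMPLE_SECRET_KEY
cat "$example_cred_file"
```

The resulting path is the value you would then supply for confluent.tier.s3.cred.file.path.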

Ceph

Using Tiered Storage on Ceph requires that Ceph is configured as an S3 endpoint with a working S3 gateway bucket. You must have credentials to access the bucket and you must be able to read and write to the bucket with those credentials.

To enable Tiered Storage on Ceph:

You must already have Ceph running as the S3 storage provider. You must run the bucket in the us-east-1 region.
Modify the broker.properties file for your broker as follows:
```
confluent.tier.enable=true
confluent.tier.feature=true
confluent.tier.backend=S3
confluent.tier.s3.cred.file.path=<PATH_TO_CREDENTIALS_FILE>
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=us-east-1
confluent.tier.s3.aws.endpoint.override=<CEPH_ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPV6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
# Optional: Configure S3 server-side encryption
# Supported values: AES256 (default), aws:kms, none
confluent.tier.s3.sse.algorithm=AES256
# Required only when using aws:kms
# confluent.tier.s3.sse.customer.encryption.key=<KMS_KEY_ARN>
confluent.tier.metadata.replication.factor=1
```
Description of the settings:
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Ceph, set this to S3 as shown in the preceding configuration.
- confluent.tier.s3.cred.file.path provides the path to the file that contains credentials to access the bucket. This field is hidden from the server log files.
- confluent.tier.s3.bucket and confluent.tier.s3.region are the Ceph bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data.
- confluent.tier.s3.aws.endpoint.override provides the URL for the Ceph bucket.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
- confluent.tier.s3.sse.algorithm specifies the S3 server-side encryption algorithm. Supported values are AES256 (default), aws:kms, or none. When using aws:kms, you must also set confluent.tier.s3.sse.customer.encryption.key to the ARN of your KMS key.
- confluent.tier.metadata.replication.factor sets the replication factor of the Tiered Storage internal topic. The value of 1 shown here is intended for a single-broker cluster; the default is 3.

Scality

To enable Tiered Storage on Scality through the Amazon S3 API, complete the following steps.

Add the following properties in your broker.properties file:
```
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=S3
confluent.tier.s3.bucket=<BUCKET_NAME>
confluent.tier.s3.region=<REGION>
confluent.tier.s3.aws.endpoint.override=<SCALITY_ENDPOINT>
# For Confluent Platform 7.9.1 or later, IPV6 must be disabled with an endpoint override.
confluent.tier.s3.ipv6.enabled=false
# confluent.tier.metadata.replication.factor=1
```
Tip
The Tiered Storage internal topic defaults to a replication factor of 3. If you use confluent local services start to run a single-broker cluster, uncomment the last line in the preceding configuration to override the default replication factor for the Tiered Storage topic.
Adding the preceding properties enables the Tiered Storage components on Scality, with default values for all other configuration options.
- confluent.tier.feature enables Tiered Storage for a broker. Setting this to true allows a broker to use Tiered Storage.
- confluent.tier.enable sets the default value for created topics. Setting this to true causes all non-compacted topics to be tiered. When set to true, this causes all existing, non-compacted topics to have this configuration set to true as well. Only topics explicitly set to false do not use Tiered Storage. It is not required to set confluent.tier.enable=true to enable Tiered Storage.
- confluent.tier.backend refers to the cloud storage service to which a broker connects. For Scality, set this to S3 as shown in the preceding configuration.
- BUCKET_NAME and REGION are the S3 bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. Set the region to any valid AWS region. The region cannot be null or empty. For example, a bucket named tiered-storage-test-confluent located in the us-east-1 region would have these properties:
```
confluent.tier.s3.bucket=tiered-storage-test-confluent
confluent.tier.s3.region=us-east-1
```
- confluent.tier.s3.aws.endpoint.override refers to the Scality endpoint including the fully qualified domain name. If s3.isv is the object store name and scality.com is the domain name, the endpoint would look like the following code example.
- confluent.tier.s3.ipv6.enabled=false is required for Confluent Platform 7.9.1 or later when an endpoint override is used. This setting is not required for earlier versions of Confluent Platform.
```
confluent.tier.s3.aws.endpoint.override=https://s3.isv.scality.com
confluent.tier.s3.ipv6.enabled=false
```
Provide credentials for the broker to access the Scality S3 Bucket. You can set these through broker.properties or through environment variables. The brokers prioritize using the credentials supplied through broker.properties. If the brokers do not find credentials in broker.properties, they use environment variables instead.
- Broker Properties - Add the following property to your broker.properties file:
```
confluent.tier.s3.cred.file.path=<PATH_TO_AWS_CREDENTIALS_FILE>
```
  Replace <PATH_TO_AWS_CREDENTIALS_FILE> with the file path of the file that contains your Scality credentials.
  This field is hidden from the server log files.
- Environment Variables - Specify Scality credentials with these environment variables:
```
export AWS_ACCESS_KEY_ID=<SCALITY_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<SCALITY_ACCESS_KEY>
```
  If broker.properties does not contain the two properties for credentials, the broker uses the preceding environment variables to connect to the S3 bucket.
The Scality bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage. These operations can be allowed by providing read/write access to the user on the bucket.
```
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

Configuring Tiered Storage to support compacted topics

Tiered Storage supports compacted topics starting with Confluent Platform 7.6.

Tiered Storage support for compacted topics enables log compaction to work with segments stored in object storage, maintaining only the latest value for each key. The properties are set the same way, regardless of the cloud provider you are using. Enabling Tiered Storage for compacted topics requires configuration changes and cluster rolls to take effect, and some configurations are not reversible, as described in the following steps.

To enable Tiered Storage for compacted topics:

Set both confluent.tier.feature on the broker and confluent.tier.enable on the topic to true.
Set confluent.tier.cleaner.feature.enable to true. This flag enables the tiering component for compacted topics. This configuration is not reversible.
Roll (restart) the cluster. If you do not roll the cluster, compacted topics do not use Tiered Storage.
Set confluent.tier.cleaner.enable to true. This flag activates tiering for compacted topics.
Roll (restart) the cluster.
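
The two-phase order of these steps can be sketched as a small shell helper that prints the broker properties to have in place before each cluster roll. The function name tier_compaction_props is hypothetical; the property names come from the steps above, and the topic-level confluent.tier.enable setting is still applied separately on each topic.

```shell
# Hypothetical helper: print the broker properties for each phase of
# enabling Tiered Storage for compacted topics.
tier_compaction_props() {
  case "$1" in
    1)
      # Phase 1: enable the tiering component (not reversible), then roll.
      printf '%s\n' \
        'confluent.tier.feature=true' \
        'confluent.tier.cleaner.feature.enable=true'
      ;;
    2)
      # Phase 2: activate tiering for compacted topics, then roll again.
      printf '%s\n' \
        'confluent.tier.feature=true' \
        'confluent.tier.cleaner.feature.enable=true' \
        'confluent.tier.cleaner.enable=true'
      ;;
    *)
      echo "usage: tier_compaction_props 1|2" >&2
      return 1
      ;;
  esac
}

tier_compaction_props 1   # properties for the first roll
tier_compaction_props 2   # properties for the second roll
```

Keeping the phases separate makes it harder to set confluent.tier.cleaner.enable before the first roll has completed.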

Creating a topic with Tiered Storage

To create a topic with Tiered Storage enabled, use the kafka-topics tool with tier-specific configurations. This configures the topic to automatically offload older segments to object storage based on retention and hotset policies. The hotset is the set of data segments retained on broker-local storage for fast access.

You can create a topic using the kafka-topics tool, found in the $CONFLUENT_HOME/bin directory.

```
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic trades \
  --partitions 6 \
  --replication-factor 3 \
  --config confluent.tier.enable=true \
  --config confluent.tier.local.hotset.ms=3600000 \
  --config retention.ms=604800000
```

- confluent.tier.local.hotset.ms controls the maximum time that non-active segments are retained on broker-local storage before being discarded to free up space. Segments deleted from local disks still exist in object storage and remain available according to the retention policy. If set to -1, no time limit is applied.
- retention.ms works the same way for tiered topics as for non-tiered topics, except that it expires segments from both object storage and local storage according to the retention policy.
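
Both time-based settings in the preceding command are expressed in milliseconds. As a quick check of those values, the following shell arithmetic converts one hour and seven days into milliseconds. The helper names are made up for this illustration.

```shell
# Hypothetical helpers: convert hours or days to milliseconds.
hours_to_ms() { echo $(( $1 * 60 * 60 * 1000 )); }
days_to_ms()  { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

echo "confluent.tier.local.hotset.ms=$(hours_to_ms 1)"   # 3600000, a one-hour hotset
echo "retention.ms=$(days_to_ms 7)"                      # 604800000, seven days of retention
```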

Tip

As a recommended best practice, do not set a retention policy on the cloud storage (such as an AWS S3 bucket) because this may conflict with the Kafka topic retention policy.
You can also configure topics from Control Center. To do so, log on to Control Center and navigate to a topic (<environment> -> <cluster> -> <topic>). To learn more, see Tiered Storage in the Control Center User Guide. For a local deployment, Control Center is available at http://localhost:9021/ in your web browser.
For more on using Kafka commands, see Use the command line tools in Learn More About Confluent Products and Kafka.

Sending test messages to experiment with data storage

You can use the topic you created in the previous section as a test case, or create a new one. To speed up the rate at which data is transferred to storage for this example, lower the segment.bytes setting on the topic from its default, as shown in the following example. You can either update the configuration of an existing topic from the Control Center expert mode settings for that topic, or create a new topic, as shown.

```
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic hot-topic \
  --partitions 6 \
  --replication-factor 3 \
  --config confluent.tier.enable=true \
  --config confluent.tier.local.hotset.ms=3600000 \
  --config retention.ms=604800000 \
  --config segment.bytes=10485760
```

After you have Tiered Storage configured, you can send test data to one or more topics, and run a consumer to read the messages. For example, use the following command to produce messages:

```
kafka-producer-perf-test \
  --producer-props bootstrap.servers=localhost:9092 \
  --topic hot-topic \
  --record-size 1000 \
  --throughput 1000 \
  --num-records 3600000
```

Let this run for five or 10 minutes, and then run a consumer, for example:

```
kafka-console-consumer --bootstrap-server localhost:9092 --from-beginning --topic hot-topic
```

After 10 or 20 minutes, check the Control Center UI, and you should see the data saved off to your storage container. For a local deployment, Control Center is available at http://localhost:9021/ in your web browser.
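
To see why a smaller segment.bytes value speeds up this experiment, consider how long one partition takes to fill a segment at the producer rate used above. The sketch below assumes that records are spread evenly across the six partitions and ignores batching overhead and compression, so treat the results as rough estimates. The function name seconds_to_fill_segment is hypothetical.

```shell
# Rough estimate: seconds for one partition to fill (and close) a segment.
# Args: segment size in bytes, partition count, record size in bytes, records per second.
seconds_to_fill_segment() {
  segment_size=$1
  partitions=$2
  record_size=$3
  records_per_sec=$4
  echo $(( segment_size * partitions / (record_size * records_per_sec) ))
}

# 10 MiB segments, as configured on hot-topic: about one minute per segment.
seconds_to_fill_segment 10485760 6 1000 1000     # prints 62
# Default 1 GiB segments: well over an hour at the same rate.
seconds_to_fill_segment 1073741824 6 1000 1000   # prints 6442
```

Because the archiver uploads only closed segments, the shorter fill time is what lets tiered data appear in object storage within the first few minutes of the test.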

To learn more about using Control Center to configure and manage Tiered Storage, see Tiered Storage.

Best practices and recommendations

Follow these best practices to optimize Tiered Storage performance, security, and reliability.

Tuning

To improve the performance of Tiered Storage, you can increase the number of fetcher threads (TierFetcherNumThreads) and archiver threads (TierArchiverNumThreads). As a general guideline, set TierFetcherNumThreads to the number of physical CPU cores and TierArchiverNumThreads to half the number of physical CPU cores. For example, on a machine with eight physical cores, set confluent.tier.fetcher.num.threads=8 and confluent.tier.archiver.num.threads=4 in broker.properties.
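
The guideline above can be written as a short shell sketch that prints the two thread settings for a given number of physical cores. The function name tier_thread_props is hypothetical, and the floor of one archiver thread is an assumption added so that a single-core input still yields a usable value.

```shell
# Hypothetical helper: derive Tiered Storage thread settings from the
# number of physical CPU cores, following the guideline in the text.
tier_thread_props() {
  physical_cores=$1
  archiver_threads=$(( physical_cores / 2 ))
  # Assumption: never go below one archiver thread.
  if [ "$archiver_threads" -lt 1 ]; then
    archiver_threads=1
  fi
  printf 'confluent.tier.fetcher.num.threads=%d\n' "$physical_cores"
  printf 'confluent.tier.archiver.num.threads=%d\n' "$archiver_threads"
}

# Eight physical cores: 8 fetcher threads and 4 archiver threads.
tier_thread_props 8
```

Pass the physical core count explicitly, because tools such as nproc report logical CPUs, which may be double the physical count when hyper-threading is enabled.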

Time interval for topic deletes

When a topic is deleted, the deletion of the log segment files in object storage does not begin immediately. The deletion takes place on a timer interval, specified by the confluent.tier.topic.delete.check.interval.ms property. The default interval is five minutes. After a topic or cluster is deleted, you can manually delete the objects in the respective bucket.

Log segment sizes

If tiering is enabled, you should decrease the segment size from the default of 1 gibibyte specified in the log.segment.bytes property. The archiver waits for a log segment to close before attempting to upload the segment to object storage. Using a smaller segment size, such as 100 MB, enables segments to close more frequently. Smaller segment sizes also help with page cache behavior, which improves the performance of the archiver.
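
The log.segment.bytes property takes a value in bytes. The following is a minimal sketch of the conversion, treating 100 MB as 100 x 1024 x 1024 bytes. The helper name mib_to_bytes is hypothetical.

```shell
# Hypothetical helper: convert mebibytes to bytes for log.segment.bytes.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }

echo "log.segment.bytes=$(mib_to_bytes 100)"   # prints log.segment.bytes=104857600
```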

ACLs on Tiered Storage internal topics

A recommended best practice for on-premises deployments is to enable an ACL authorizer on the internal topics used for Tiered Storage. Set ACL rules to limit access on this data to the broker user only. This secures the internal topics, and prevents unauthorized access to Tiered Storage data and metadata.

For example, the following command sets ACLs on the internal topic for Tiered Storage, _confluent-tier-state. (Currently, there is only a single internal topic related to Tiered Storage.) The example creates an ACL that provides the principal kafka permission for all operations on the internal topic.

```
kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal User:<kafka> --operation All --topic "_confluent-tier-state"
```

Tip

In practice, replace the User:<kafka> with the actual broker principal in your deployment.

To set ACLs on any Tiered Storage internal topic (if others are added in future releases), specify the topic name with _confluent-tier as the prefix. For example, the following command sets ACLs on any internal topic related to Tiered Storage.

```
kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal User:<kafka> --operation All --topic "_confluent-tier-" --resource-pattern-type PREFIXED
```

TLS settings and troubleshooting certificates

To prevent TLS certificate errors when enabling Tiered Storage, take the following steps.

Import the correct storage server certificate into the truststore on Kafka broker.
Set truststore related configurations so that the correct truststore file is picked up by the client library for your storage type. Credential file paths are documented in Kafka Broker and Controller Configuration Reference for Confluent Platform.
For example, you can use the following properties in broker settings along with other confluent.tier settings to specify the truststore for the AWS S3 client to verify the server certificate correctly.
```
confluent.tier.s3.ssl.truststore.location = ...
confluent.tier.s3.ssl.truststore.password = ...
confluent.tier.s3.ssl.truststore.type = JKS
```
For another example, see the “Troubleshooting Certificates” section in the Google Cloud Storage (GCS) steps.

Sizing brokers with Tiered Storage

Sizing brokers with Tiered Storage focuses on determining local disk capacity based on the hotset retention period and leaving headroom for object storage outages.

The confluent.tier.local.hotset.ms property controls how long segments are retained on broker-local storage before being discarded to free up space. Tuning this value is ultimately a business decision, but a few practices help with sizing.

If consumers are lagging while fetching data from object storage, consider increasing the hotset retention period by increasing the value of confluent.tier.local.hotset.ms.

When planning disk sizes on your brokers, also leave headroom to accommodate potential issues communicating with object storage. Cloud object storage outages are extremely rare, but they can happen, and Tiered Storage continues to store segments at the broker until it can successfully tier them to object storage.
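
As a rough illustration of this sizing reasoning, the following shell sketch estimates per-broker disk usage for the hotset. The formula (ingest rate x hotset duration x replication factor, divided evenly across brokers, plus a headroom percentage) is an assumption for this example, not a Confluent sizing rule. It ignores active segments, indexes, and non-tiered topics. The function name and the input values are hypothetical.

```shell
# Rough estimate of local disk (in MB) needed per broker for the hotset.
# Args: cluster ingest in MB/s, hotset in ms, replication factor,
#       broker count, headroom percentage for object storage outages.
hotset_disk_mb() {
  ingest_mb_per_sec=$1
  hotset_ms=$2
  replication_factor=$3
  brokers=$4
  headroom_pct=$5
  base_mb=$(( ingest_mb_per_sec * (hotset_ms / 1000) * replication_factor / brokers ))
  echo $(( base_mb * (100 + headroom_pct) / 100 ))
}

# 120 MB/s ingest, one-hour hotset, replication factor 3, 6 brokers, 50% headroom.
hotset_disk_mb 120 3600000 3 6 50   # prints 324000 (about 324 GB per broker)
```

Raising confluent.tier.local.hotset.ms scales the estimate linearly, which is the trade-off to weigh when increasing the hotset for lagging consumers.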

Tiered Storage metrics

Tiered Storage exposes JMX metrics for the archiver and fetcher components and for per-partition tier size. Use these metrics to monitor tiering throughput, lag, and errors.

Tier archiver metrics

The archiver is a component of Tiered Storage that is responsible for uploading non-active segments to cloud storage.

kafka.tier.tasks.archive:type=TierArchiver,name=BytesPerSec: Rate of bytes per second that is being uploaded by the archiver to cloud storage.
kafka.tier.tasks.archive:type=TierArchiver,name=TotalLag: Number of bytes in non-active segments not yet uploaded by the archiver to cloud storage. As the archiver steadily uploads to cloud storage, the total lag decreases toward 0.
kafka.tier.tasks.archive:type=TierArchiver,name=RetriesPerSec: Number of times the archiver has reattempted to upload a non-active segment to cloud storage.
kafka.tier.tasks:type=TierTasks,name=NumPartitionsInError: Number of partitions that the archiver was unable to upload that are in an error state. The partitions in this state are skipped by the archiver and not uploaded to cloud storage.

Tier fetcher metrics

The fetcher is a component of Tiered Storage that is responsible for retrieving data from cloud storage.

kafka.server:type=TierFetcher,name=BytesFetchedRate: Number of bytes fetched per second from Tiered Storage.
kafka.server:type=TierFetcher,name=BytesFetchedTotal: Total number of bytes fetched from Tiered Storage.
kafka.server:type=TierFetcher,name=QueueSize: Number of elements in the fetcher executor queue. A queue that grows and stays high indicates that fetch requests are arriving faster than the fetcher can service them.
kafka.server:type=TierFetcher,name=QueuedTotalTimeMs50Percentile, kafka.server:type=TierFetcher,name=QueuedTotalTimeMs90Percentile, kafka.server:type=TierFetcher,name=QueuedTotalTimeMs99Percentile: Time, in milliseconds, that a fetch request waits in the fetcher thread pool queue before processing, reported at the 50th, 90th, and 99th percentiles.
kafka.server:type=TierFetcher,name=FetchTotalTimeMs50Percentile, kafka.server:type=TierFetcher,name=FetchTotalTimeMs90Percentile, kafka.server:type=TierFetcher,name=FetchTotalTimeMs99Percentile: Total time, in milliseconds, for the GET calls made to object storage to complete a fetch, reported at the 50th, 90th, and 99th percentiles.

Kafka log of tier size per partition

Use kafka.log to find the total tier size for a partition.

kafka.log:type=Log,name=TierSize,topic=TieredStorage,partition=0: Total tier size for the given partition.

Example performance test

This reference performance test shows the throughput recorded for a 6-broker Kafka cluster with Tiered Storage activated, including producer, consumer, archiver, and fetcher rates. Use these values as a reference for expected metrics. The high-level details and configurations of the cluster and environment are listed below.

Cluster Configurations:

6 brokers
r5.xlarge AWS instances
GP2 disks, 2 TB
104857600 (100 MB) segment.bytes
A single topic with 24 partitions and replication factor 3

Producer and Consumer Details:

6 producers each with a target rate of 20 MB/s (120 MB/s total)
12 total consumers
6 consumers read records from the end of the topic with a target rate of 20 MB/s
6 consumers read records from the beginning of the topic from S3 through the fetcher with an uncapped rate

Values may differ based on the configurations of producers, consumers, brokers, and topics. For example, using a log segment size of 1 GB instead of 100 MB may result in lower archiver throughput.

Throughput Metric	Value (MB/s)
Producer (cluster total)	115
Producer (average across brokers)	19
Consumer (cluster total)	623
Archiver (average across brokers)	19
Fetcher (average across brokers)	104

Supported platforms and features

AWS with Amazon S3, GCP with Google Cloud Storage (GCS), Azure with Azure Blob Storage, Pure Storage FlashBlade, Nutanix Objects, NetApp Object Storage, Dell EMC ECS, MinIO, Cloudian, Ceph, and Scality through the Amazon S3 API are supported with Tiered Storage.
The Kubernetes Confluent Operator supports Tiered Storage, as described in Tiered Storage under Configure Storage with Confluent Operator.

Configuration options

To learn more about configuration options for Tiered Storage, see the following sections:

Kafka Broker and Controller Configuration Reference for Confluent Platform - Search the page for options prefixed with confluent.tier.
Configure and Manage Tiered Storage from Confluent Control Center

Disabling Tiered Storage

Disabling Tiered Storage stops offloading new segments to object storage while preserving access to previously tiered data. To disable Tiered Storage, set confluent.tier.enable=false. This stops any further tiering: data that has not yet been offloaded to object storage, whether new or existing, is not offloaded. Previously tiered data remains in object storage.

If you disable Tiered Storage, make sure that confluent.tier.feature remains enabled (confluent.tier.feature=true) so that tiering-related components like the tier fetcher remain enabled and previously tiered data can be fetched.
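
As a sanity check before restarting brokers with tiering disabled, the following shell sketch inspects a properties file for the combination described above. The function name check_tier_disable is hypothetical, and the sketch assumes that each property appears on its own line without spaces around the equals sign.

```shell
# Hypothetical check: when tiering is disabled, confirm that the tier
# feature stays enabled so previously tiered data can still be fetched.
check_tier_disable() {
  props_file=$1
  if ! grep -Fqx 'confluent.tier.enable=false' "$props_file"; then
    echo "info: tiering is not disabled in $props_file"
  elif grep -Fqx 'confluent.tier.feature=true' "$props_file"; then
    echo "ok: tiering disabled, tiered data remains fetchable"
  else
    echo "warning: set confluent.tier.feature=true to keep tiered data fetchable"
  fi
}

# Example against a throwaway file.
example_props=$(mktemp)
printf '%s\n' 'confluent.tier.feature=true' 'confluent.tier.enable=false' > "$example_props"
check_tier_disable "$example_props"
rm -f "$example_props"
```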