Configure Tableflow in Confluent Cloud

Snapshot retention (retention_ms)

Snapshot retention controls how long Tableflow keeps the snapshot metadata that enables you to query a previous state of your table, also known as “time-travel queries”. Tableflow creates a snapshot every time it commits a change to your table. This includes any time Tableflow adds data to or updates data in your table, and any time it performs maintenance tasks, like compaction.

Tableflow always maintains a minimum number of snapshots, but you can configure how long additional snapshots are retained before they expire by setting the retention_ms configuration. You can set this value to infinity or to a specific length of time. When a snapshot is past its expiration time, Tableflow asynchronously removes the snapshot from the table, along with any data files that are no longer necessary. This operation doesn’t remove data that is still in use by your table, regardless of when that data was added to the table.
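
The retention rule above can be modeled in a few lines of Python. This is a minimal sketch for building intuition, not Tableflow's implementation: the Snapshot class, the snapshots_to_expire function, and the min_snapshots_to_keep parameter are hypothetical names, the value of the minimum is an assumption, and infinite retention is represented here as None. Removal of unreferenced data files is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one day


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for one table snapshot."""
    snapshot_id: int
    committed_at_ms: int  # commit time in epoch milliseconds


def snapshots_to_expire(
    snapshots: List[Snapshot],
    now_ms: int,
    retention_ms: Optional[int],
    min_snapshots_to_keep: int = 1,  # assumed value, for illustration only
) -> List[Snapshot]:
    """Return the snapshots that are eligible for expiration."""
    # None models "infinity": nothing ever expires.
    if retention_ms is None:
        return []
    # The newest snapshots are always kept, whatever their age.
    newest_first = sorted(
        snapshots, key=lambda s: s.committed_at_ms, reverse=True
    )
    candidates = newest_first[min_snapshots_to_keep:]
    # Of the rest, only those older than the retention window expire.
    return [s for s in candidates if now_ms - s.committed_at_ms > retention_ms]
```

For example, a seven-day retention window corresponds to retention_ms = 7 * DAY_MS in this model. Snapshots older than that are returned for expiration, except the newest ones protected by the minimum.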

Failure strategy (record_failure_strategy)

Tableflow offers two modes for handling per-record materialization failures: suspend and skip. The default mode, suspend, causes Tableflow to enter the Pause state whenever a record can’t be materialized and added to the table. For example, if your topic ingests a corrupted record, Tableflow pauses processing at that record.

When record_failure_strategy is set to skip, Tableflow skips records that fail to materialize and reports the number of skipped records in the rows_skipped metric.

Failures that are not record-specific always cause Tableflow to enter the Pause state, regardless of the configured record_failure_strategy. This includes, but is not limited to, catalog-access errors, storage-access errors, and illegal schema changes.
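
The interaction between the two strategies and the two kinds of failure can be summarized with a toy model. The sketch below is an assumption-laden illustration, not Tableflow code: RecordFailure, NonRecordFailure, Outcome, and run are hypothetical names, the materialize callback stands in for whatever writes a record to the table, and the "running" and "paused" labels are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


class RecordFailure(Exception):
    """Hypothetical: a failure tied to one record, such as a corrupted payload."""


class NonRecordFailure(Exception):
    """Hypothetical: a catalog, storage, or schema failure not tied to one record."""


@dataclass
class Outcome:
    state: str         # "running" or "paused" (illustrative labels)
    materialized: int  # records successfully added to the table
    rows_skipped: int  # models the rows_skipped metric


def run(
    records: Iterable[str],
    materialize: Callable[[str], None],
    record_failure_strategy: str = "suspend",
) -> Outcome:
    if record_failure_strategy not in ("suspend", "skip"):
        raise ValueError(f"unknown strategy: {record_failure_strategy!r}")
    materialized = 0
    rows_skipped = 0
    for record in records:
        try:
            materialize(record)
        except RecordFailure:
            if record_failure_strategy == "skip":
                # skip: count the record and move on to the next one.
                rows_skipped += 1
                continue
            # suspend: stop at the failing record.
            return Outcome("paused", materialized, rows_skipped)
        except NonRecordFailure:
            # Not record-specific: pause regardless of the strategy.
            return Outcome("paused", materialized, rows_skipped)
        materialized += 1
    return Outcome("running", materialized, rows_skipped)
```

In this model, a corrupted record pauses processing under suspend but only increments rows_skipped under skip, while a storage or catalog failure pauses processing under either setting.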