Snapshot Queries in Confluent Cloud for Apache Flink
In Confluent Cloud for Apache Flink®, a snapshot query is a query that reads data from a table at a specific point in time. In contrast with a streaming query, which runs continuously and returns results incrementally, a snapshot query runs, returns results, and then exits. Snapshot queries are also known as point-in-time or pull queries.
You can run snapshot queries on Kafka topics and, by using Confluent Tableflow, on Apache Iceberg™ tables.
Note
Snapshot query is an Early Access Program feature in Confluent Cloud for Apache Flink.
An Early Access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions.
Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. Early Access Program features are provided: (a) without support; (b) “AS IS”; and (c) without indemnification, warranty, or condition of any kind. No service level commitment will apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion.
Snapshot query uses
A snapshot query returns a consistent view of your data at the current point in time, similar to taking a photograph of your data at that moment. This is particularly useful when you need to:
Generate reports that reflect your data’s state at a specific time
Analyze historical data for auditing or compliance purposes
Compare data states across different points in time
Debug or investigate issues by examining past data states
For example, if you want to know the total number of orders in your system at the current time, you can use a snapshot query.
Snapshot mode
A snapshot query is an ordinary Flink SQL statement that has one additional property, named sql.snapshot.mode.
To enable snapshot queries, set the sql.snapshot.mode property to now. You can set this property in the following ways:
SQL Workspace: Toggle the Mode dropdown to Snapshot.
Flink SQL: Prepend your query with SET 'sql.snapshot.mode' = 'now';.
Table API: In the Cloud.Properties project file, add sql.snapshot.mode = now.
REST API: In the statement’s spec.properties map, add "sql.snapshot.mode": "now".
Terraform: In the statement properties, add "sql.snapshot.mode" = "now".
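As an illustration of the REST API option, the following Python sketch builds a statement body whose spec.properties map carries the snapshot property. Only the spec.properties map and the "sql.snapshot.mode": "now" entry come from the description above. The helper name snapshot_statement_body, the name and spec.statement fields, and the orders table are assumptions made for this example; check the REST API reference for the exact request shape.

```python
import json
from typing import Dict, Optional

SNAPSHOT_PROPERTY = "sql.snapshot.mode"


def snapshot_statement_body(
    name: str,
    sql: str,
    extra_properties: Optional[Dict[str, str]] = None,
) -> dict:
    """Build a statement body that requests snapshot mode.

    The "name" and "spec.statement" fields are assumed for illustration;
    only "spec.properties" is taken from the documentation text.
    """
    # Copy caller-supplied properties so the input dict is not mutated.
    properties = dict(extra_properties or {})
    # The one property that turns an ordinary statement into a snapshot query.
    properties[SNAPSHOT_PROPERTY] = "now"
    return {"name": name, "spec": {"statement": sql, "properties": properties}}


# "orders" is a hypothetical table name.
body = snapshot_statement_body("count-orders", "SELECT COUNT(*) FROM orders;")
payload = json.dumps(body, indent=2)
```

The helper copies any other properties unchanged, so the snapshot setting can be combined with whatever properties a statement already uses.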
Snapshot queries use Flink’s batch execution mode, which enables you to run batch processing jobs alongside your existing stream processing workloads, within the same Confluent Cloud environment.
For snapshot queries, Confluent Cloud for Apache Flink also bounds all sources, which means that Flink processes only a finite set of records up to a specific point in time, rather than continuously processing an infinite stream of incoming data.
How snapshot queries work
When you execute a snapshot query, Flink performs the following steps:
Determines the Kafka offsets corresponding to the current timestamp across all partitions
Reads data from the source topics up to these offsets
Processes the records to build the state of your tables at this point in time
Returns the query results based on this state
The query execution is optimized to use Kafka’s time index for efficient offset lookup, to leverage parallel processing across partitions, and to minimize the amount of data that needs to be processed.
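The steps above can be pictured with a small in-memory model. The Python sketch below is a conceptual toy, not Flink or Kafka code: it represents each partition as a list of (timestamp, value) records, assumes timestamps within a partition are non-decreasing, finds the end offset for a snapshot timestamp with a binary search (standing in for a time-index lookup), and reads only the records before that offset. The names end_offset_for and snapshot_read are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Record = Tuple[int, str]  # (timestamp in milliseconds, value)


def end_offset_for(partition: List[Record], snapshot_ts_ms: int) -> int:
    """Return the first offset whose timestamp is later than snapshot_ts_ms.

    Assumes timestamps in the partition are sorted in non-decreasing order.
    """
    timestamps = [ts for ts, _ in partition]
    return bisect_right(timestamps, snapshot_ts_ms)


def snapshot_read(topic: Dict[int, List[Record]], snapshot_ts_ms: int) -> List[str]:
    """Read every partition up to its end offset and collect the values."""
    values: List[str] = []
    for partition_id in sorted(topic):
        partition = topic[partition_id]
        end = end_offset_for(partition, snapshot_ts_ms)
        # The source is bounded: records at or after `end` are never read.
        values.extend(value for _, value in partition[:end])
    return values


topic = {
    0: [(100, "a"), (200, "b"), (300, "c")],
    1: [(150, "d"), (250, "e")],
}
visible = snapshot_read(topic, snapshot_ts_ms=200)
```

Because each partition gets its own end offset, records appended after the snapshot timestamp do not change the result for that timestamp in this model.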
Snapshot queries and Tableflow
If Tableflow is enabled on a topic, snapshot queries on the topic run in a hybrid mode.
If Tableflow is not enabled on a topic, the query reads from Kafka.
If Tableflow is enabled on a topic, the query reads from both Kafka and Parquet, for Confluent Managed Storage and custom storage (BYOS).
Run a snapshot query
To run a snapshot query, in a Flink workspace or the Flink SQL shell, prepend your query with the following SET statement:
SET 'sql.snapshot.mode' = 'now';
Also, in a Flink workspace, you can change the Mode dropdown setting to Snapshot.
For more information, see Run a Snapshot Query.
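If you generate statements programmatically, prepending the SET statement can be wrapped in a small helper. The following Python sketch only assembles the statement text; it does not submit anything to Flink. The helper name as_snapshot_query and the orders table are assumptions made for this example, which reuses the order-count scenario mentioned earlier.

```python
SNAPSHOT_SET = "SET 'sql.snapshot.mode' = 'now';"


def as_snapshot_query(query: str) -> str:
    """Return the query text with the snapshot-mode SET statement prepended."""
    return f"{SNAPSHOT_SET}\n{query.strip()}"


# "orders" is a hypothetical table name.
statement = as_snapshot_query("SELECT COUNT(*) AS total_orders FROM orders;")
```

The resulting text can be pasted into a Flink workspace or the Flink SQL shell as-is.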
Technical details
Timestamp Resolution: Timestamps are processed with millisecond precision
State Handling: For tables with state (like aggregations), Flink reconstructs the state by processing all relevant records up to the specified timestamp
Parallelism: Queries are automatically parallelized across available compute resources
Resource Optimization: Flink uses Kafka’s time index to quickly locate the relevant offsets, minimizing unnecessary data scanning
Relationship to batch mode
Snapshot queries are closely related to Flink’s batch processing mode. When you execute a snapshot query:
Flink automatically switches to batch mode processing
The query processes a finite, bounded dataset up to the current timestamp
The computation benefits from batch optimizations like sort-merge joins
Resources are released once the query completes
Results are deterministic and reproducible
This behavior contrasts with streaming queries, which:
Process continuous, unbounded data streams
Maintain persistent state and resources
Produce incremental, real-time results
May give different results when rerun due to new data
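To see why an optimization like a sort-merge join fits bounded input, consider the generic textbook algorithm sketched below in Python. This is not Flink's implementation; the function name sort_merge_join and the sample rows are made up for illustration. The algorithm sorts both inputs by key and then merges them in a single pass, which is only possible when both inputs are finite and fully available.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (join key, value)


def sort_merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Inner equi-join of two bounded inputs on their first field."""
    # Sorting requires the complete input, which a bounded source provides.
    left_sorted = sorted(left, key=lambda row: row[0])
    right_sorted = sorted(right, key=lambda row: row[0])

    joined: List[Tuple[int, str, str]] = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        left_key = left_sorted[i][0]
        right_key = right_sorted[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left_sorted) and left_sorted[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right_sorted) and right_sorted[j_end][0] == left_key:
                j_end += 1
            # Emit every pairing within the two runs.
            for _, left_value in left_sorted[i:i_end]:
                for _, right_value in right_sorted[j:j_end]:
                    joined.append((left_key, left_value, right_value))
            i, j = i_end, j_end
    return joined


# Hypothetical sample data: (customer_id, order) and (customer_id, name).
orders = [(2, "order-b"), (1, "order-a"), (2, "order-c")]
customers = [(1, "alice"), (2, "bob"), (3, "carol")]
result = sort_merge_join(orders, customers)
```

An unbounded stream never finishes arriving, so it cannot be sorted up front; a join over a stream has to keep state and emit results incrementally instead.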
Billing
Snapshot queries are billed in CFUs, in the same way that streaming queries are. For more information, see Flink Billing.