Processing Guarantees in ksqlDB for Confluent Platform

ksqlDB supports at-least-once and exactly-once processing guarantees.

At-least-once semantics

Records are never lost but may be redelivered. If your stream processing application fails, no data records are lost and fail to be processed, but some data records may be re-read and therefore re-processed. At-least-once semantics is enabled by default in your ksqlDB configuration, with processing.guarantee="at_least_once".

Exactly-once semantics

Records are processed once. If a producer within a ksqlDB application sends a duplicate record, it’s written to the broker exactly once.

Exactly-once stream processing is the ability to execute a read-process-write operation exactly one time. All of the processing happens exactly once, including the processing and the materialized state created by the processing job that is written back to Kafka.

To enable exactly-once semantics, set processing.guarantee="exactly_once_v2" in your ksqlDB configuration. Your Kafka broker version must be 2.5 or later. If you’re using the Confluent Platform distribution of ksqlDB, you need Confluent Platform version 5.5 or later.

Important

Use the exactly_once_v2 setting with care. To achieve a true exactly-once system, end consumers and producers must also implement exactly-once semantics.

For more information, see Processing Guarantees.

Enable exactly-once semantics

Exactly-once isn’t enabled by default in ksqlDB, but you can enable it on a query-by-query basis by passing the processing.guarantee configuration setting to ksqlDB.

How you pass the configuration setting to ksqlDB depends on how you run ksqlDB Server and how you send requests to start queries.

ksqlDB CLI

Use the SET command to enable exactly-once for the subsequent query:

SET 'processing.guarantee' = 'exactly_once_v2';

For more information, see Configure ksqlDB CLI.

REST API

Pass the config as a property along with the request:

POST /query HTTP/1.1
Accept: application/vnd.ksql.v1+json
Content-Type: application/vnd.ksql.v1+json

{
"ksql": "SELECT * FROM pageviews EMIT CHANGES;",
"streamsProperties": {
    "processing.guarantee": "exactly_once_v2"
}
}

For more information, see Run a query and stream back the output.

Default for all queries and non-interactive (headless) mode

To enable exactly-once by default for all queries, and for non-interactive (headless) mode, set the configuration in the ksqlDB Server properties file, which by default is located at ${CONFLUENT_HOME}/etc/ksqldb/ksql-server.properties in a Confluent Platform deployment.

For more information, see ksqlDB Configuration Parameter Reference.

If your ksqlDB Server is deployed in a Docker container, you can enable exactly-once by passing in the corresponding environment variable, for example:

docker run -d \
    -e KSQL_KSQL_STREAMS_PROCESSING_GUARANTEE=exactly_once \
  -e KSQL_BOOTSTRAP_SERVERS=localhost:9092 \
    confluentinc/cp-ksqldb-server:7.8.0

For more information, see Configure ksqlDB with Docker.

Tip

If you use the SET command at the start of a SQL script, the setting is applied to all persistent queries in the script, assuming there is no corresponding UNSET command in the script.