Upgrading ksqlDB¶
Upgrading to ksqlDB 5.5 from KSQL 5.4¶
Warning
The upgrade from KSQL 5.4 to ksqlDB 5.5 is not a rolling restart. You must shut down all KSQL instances and then start up all ksqlDB instances, so there will be downtime.
Complete the following steps to perform the upgrade from KSQL 5.4 to ksqlDB 5.5:
- Capture existing SQL statements
- Stop clients from writing to KSQL
- Stop the existing KSQL deployment
- Deploy a new ksqlDB cluster with a new service ID
- Set up security (optional)
- Recompile user-defined functions (optional)
- Replay SQL statements that you captured in the first step
Capture existing SQL statements¶
To capture existing SQL statements, we recommend using the kafka-console-consumer tool to read the existing KSQL command topic. The following example command shows how to pipe the output to jq and save the SQL commands to a statements.sql file.
Note
You must provide credentials for the kafka-console-consumer command by using the --consumer.config option. For more information, see Encryption and Authentication with SSL.
# export KSQL_SERVICE_ID=<ksql.service.id>
# export BROKER=localhost
# export PORT=9092
./bin/kafka-console-consumer --bootstrap-server ${BROKER}:${PORT} --topic _confluent-ksql-${KSQL_SERVICE_ID}_command_topic --from-beginning | jq -r ".statement" > statements.sql
To get the kafka-console-consumer tool, install Confluent Platform.
Look through the statements to make sure that the command worked as expected. You may also want to remove CREATE/DROP pairs, because you will execute all of these statements in the new cluster.
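As a sanity check on the captured file, a quick tally of statement types can help you spot CREATE/DROP pairs worth pruning. This is a hypothetical helper, not part of the official procedure; the sample statements.sql below stands in for your real capture:

```shell
# Stand-in for the statements.sql produced by the capture step above.
cat > statements.sql <<'EOF'
CREATE STREAM clicks (userid VARCHAR) WITH (kafka_topic='clicks', value_format='JSON');
DROP STREAM clicks;
CREATE STREAM clicks (userid VARCHAR) WITH (kafka_topic='clicks', value_format='JSON');
EOF

# Tally statement types; a DROP usually cancels out an earlier CREATE.
grep -oE '^(CREATE|DROP|INSERT|TERMINATE)' statements.sql | sort | uniq -c
```

A mismatch between CREATE and DROP counts for the same object is a hint that a pair can be removed before replay.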
Stop clients that write to KSQL¶
To prevent data loss, stop all client applications and producers that write to the KSQL cluster.
Stop the existing KSQL deployment¶
Stop the KSQL cluster. The procedure for stopping the cluster varies depending on your deployment. For example, we recommend using systemctl for RPM deployments. If you’re using a docker-compose stack, you might use the docker-compose down command.
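For example, the shutdown might look like one of the following. The confluent-ksql systemd unit name is an assumption based on typical Confluent Platform package installs; check the unit names on your hosts:

```shell
# RPM/DEB deployment (assumed systemd unit name):
sudo systemctl stop confluent-ksql

# docker-compose deployment, run from the directory with your compose file:
docker-compose down
```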
Install ksqlDB packages¶
If your deployment uses DEB or RPM release artifacts, you must uninstall the old packages and install the new ones. Because the configuration directory has changed from /etc/ksql to /etc/ksqldb, you must also copy any configuration files to the new location:
cp ${CONFLUENT_HOME}/etc/ksql/* ${CONFLUENT_HOME}/etc/ksqldb/
Change the ksqlDB service ID¶
Different deployment strategies configure ksqlDB differently, but you must use a different value for ksql.service.id before you start the new ksqlDB server. If you use the old value, the server won’t start. Here are some common deployment mechanisms and how to change this configuration:
- Debian/RPM: change the property in ${CONFLUENT_HOME}/etc/ksqldb/ksql-server.properties.
- Docker: change the environment variable KSQL_KSQL_SERVICE_ID.
- Operator: see Upgrading Operator.
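As an illustration of the Debian/RPM case, the following sketch edits the property with sed. The new ID ksqldb_upgraded_ is made up for the example, and a throwaway copy of the properties file stands in for the real one under ${CONFLUENT_HOME}/etc/ksqldb:

```shell
# Throwaway stand-in for the real ksql-server.properties file.
printf 'ksql.service.id=default_\n' > ksql-server.properties

# Replace the old service ID with a new one (hypothetical value).
sed -i 's/^ksql\.service\.id=.*/ksql.service.id=ksqldb_upgraded_/' ksql-server.properties

grep '^ksql.service.id' ksql-server.properties
```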
Set up security (optional)¶
If you have security enabled, set up security for your ksqlDB app. ksqlDB supports role-based access control (RBAC), ACLs, and no authorization.
Create new role bindings or assign ACLs for the ksql service principal on the following resources:
- Topic: __consumer_offsets
- Topic: __transaction_state
- TransactionalId: the value that you set in the configuration file, for example, ksqldb_.
If you’re using ACLs for security, these ACLs are required:
- DESCRIBE operation on the TOPIC with LITERAL name __consumer_offsets.
- DESCRIBE operation on the TOPIC with LITERAL name __transaction_state.
- DESCRIBE and WRITE operations on the TRANSACTIONAL_ID with LITERAL name <ksql.service.id>.
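With ACL-based security, the required grants might be assigned with the kafka-acls tool along these lines. The principal User:ksql, the broker address, the client.properties file, and the ksqldb_ transactional ID are assumptions; substitute your own values:

```shell
kafka-acls --bootstrap-server localhost:9092 --command-config client.properties \
  --add --allow-principal User:ksql \
  --operation DESCRIBE --topic __consumer_offsets

kafka-acls --bootstrap-server localhost:9092 --command-config client.properties \
  --add --allow-principal User:ksql \
  --operation DESCRIBE --topic __transaction_state

kafka-acls --bootstrap-server localhost:9092 --command-config client.properties \
  --add --allow-principal User:ksql \
  --operation DESCRIBE --operation WRITE --transactional-id ksqldb_
```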
If you’re using RBAC for security, these role assignments are required:
- DeveloperRead role on the __consumer_offsets topic.
- DeveloperRead role on the __transaction_state topic.
- DeveloperWrite role on the <ksql.service.id> TransactionalId.
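With RBAC, the role bindings might be created with the confluent CLI of that era, roughly as follows. The iam rolebinding create subcommand syntax, the User:ksql principal, the ksqldb_ ID, and the KAFKA_CLUSTER_ID variable are all assumptions; consult the RBAC documentation for your CLI version:

```shell
confluent iam rolebinding create --principal User:ksql --role DeveloperRead \
  --resource Topic:__consumer_offsets --kafka-cluster-id $KAFKA_CLUSTER_ID

confluent iam rolebinding create --principal User:ksql --role DeveloperRead \
  --resource Topic:__transaction_state --kafka-cluster-id $KAFKA_CLUSTER_ID

confluent iam rolebinding create --principal User:ksql --role DeveloperWrite \
  --resource TransactionalId:ksqldb_ --kafka-cluster-id $KAFKA_CLUSTER_ID
```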
Recompile user-defined functions (optional)¶
If your KSQL application uses user-defined functions, you must recompile them with the upgraded dependencies. For more information, see ksqlDB Custom Function Reference (UDF, UDAF, and UDTF).
Start ksqlDB¶
Start the ksqldb service. The procedure for starting the cluster varies depending on your deployment. For example, we recommend using systemctl for RPM deployments. If you’re using a docker-compose stack, you might use the docker-compose up command.
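For example, the startup might look like one of the following, mirroring the earlier shutdown step. The confluent-ksql systemd unit name is an assumption; check the unit names on your hosts:

```shell
# RPM/DEB deployment (assumed systemd unit name):
sudo systemctl start confluent-ksql

# docker-compose deployment, detached:
docker-compose up -d
```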
Replay SQL statements¶
To replay SQL statements, start the ksqlDB CLI and issue RUN SCRIPT <path-to-statements.sql>;.
Important
There have been backward-incompatible syntax changes between KSQL and ksqlDB, and some of the statements may fail. If this happens, run the statements in statements.sql one by one, fixing any statements that fail. In particular, continuous and persistent queries now require the EMIT CHANGES syntax. For more information, see Breaking Changes.
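For example, a persistent query that was valid in KSQL 5.4 needs EMIT CHANGES appended in ksqlDB 5.5. The stream and column names here are made up for illustration:

```sql
-- KSQL 5.4 syntax (fails in ksqlDB 5.5):
CREATE STREAM high_readings AS
  SELECT sensor_id, reading FROM readings WHERE reading > 100;

-- ksqlDB 5.5 syntax:
CREATE STREAM high_readings AS
  SELECT sensor_id, reading FROM readings WHERE reading > 100
  EMIT CHANGES;
```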
Upgrading to KSQL 5.4¶
Upgrade one server at a time in a “rolling restart”. The remaining servers should have sufficient spare capacity to take over temporarily for unavailable, restarting servers.
Notable changes in 5.4:
KSQL Server
Query Id generation
This version of KSQL includes a change to how query IDs are generated for persistent queries (INSERT INTO, CREATE STREAM AS SELECT, and CREATE TABLE AS SELECT). Previously, query IDs were incremented on every successfully created persistent query. New query IDs use the Kafka record offset of the query-creating command in the KSQL command topic.
To prevent inconsistent query IDs, don’t create new persistent queries while upgrading your KSQL servers (5.3 or lower). Old running queries retain their original IDs on restart, while new queries use the new ID convention.
See Github PR #3354 for more info.
Upgrading from KSQL 5.2 to KSQL 5.3¶
Notable changes in 5.3:
KSQL Server
Avro schema compatibility
This version of KSQL fixes a bug where the schemas returned by UDFs and UDAFs might not be marked as nullable. This can cause serialization issues in the presence of null values, as might be encountered if the UDF fails. With the bug fix, all fields are now optional.
This is a forward-compatible change in Avro, i.e., after upgrading, KSQL will be able to read old values using the new schema. However, it is important to ensure that downstream consumers of the data are using the updated schema before upgrading KSQL, as otherwise deserialization may fail. The updated schema is best obtained by running the query in another KSQL cluster running version 5.3.
See Github issue #2769 for more info.
Configuration:
- ksql.sink.partitions and ksql.sink.replicas are deprecated. All new queries will use the source topic’s partition count and replica count for the sink topic instead, unless partitions and replicas are set in the WITH clause.
- A new config variable, ksql.internal.topic.replicas, was introduced to set the replica count for the internal topics created by KSQL Server. The internal topics include the command topic and the config topic.
Upgrading from KSQL 5.1 to KSQL 5.2¶
Notable changes in 5.2:
- KSQL Server
  - Interactive mode:
    - The use of the RUN SCRIPT statement via the REST API is now deprecated and will be removed in the next major release (Github issue 2179). The feature circumnavigates certain correctness checks and is unnecessary, given that the script content can be supplied in the main body of the request. If you are using the RUN SCRIPT functionality from the KSQL CLI, your scripts will not be affected, as this will continue to be supported. If you are using the RUN SCRIPT functionality directly against the REST API, your requests will work with the 5.2 server, but will be rejected after the next major version release. Instead, include the contents of the script in the main body of your request.
- Configuration:
- When upgrading your headless (non-interactive) mode application from version 5.0.0 and below, you must include the configs specified in the 5.1 upgrade instructions.
- When upgrading your headless (non-interactive) mode application, you must include the following properties in your properties file:
ksql.windowed.session.key.legacy=true
ksql.named.internal.topics=off
ksql.streams.topology.optimization=none
Upgrading from KSQL 5.0.0 and below to KSQL 5.1¶
- KSQL server:
  - The KSQL engine metrics are now prefixed with the ksql.service.id. If you have been using any metric monitoring tool, you need to update your metric names. For instance, assuming ksql.service.id is set to default_, messages-consumed-per-sec will be changed to _confluent-ksql-default_messages-consumed-per-sec.
- Configuration:
  - When upgrading your headless (non-interactive) mode application, you must either update your queries to use the new SUBSTRING indexing semantics, or set ksql.functions.substring.legacy.args to true. If possible, we recommend that you update your queries accordingly, instead of enabling this configuration setting. Refer to the SUBSTRING documentation in the function guide for details on how to do so. Note that this is NOT required for interactive mode KSQL.
Upgrading from KSQL 0.x (Developer Preview) to KSQL 4.1¶
KSQL 4.1 is not backward-compatible with the previous KSQL 0.x developer preview releases. In particular, you must manually migrate queries running in the older preview releases of KSQL to the 4.1 version by issuing statements like CREATE STREAM and CREATE TABLE again.
Notable changes in 4.1:
- KSQL CLI:
  - The ksql-cli command was renamed to ksql.
  - The CLI no longer supports what was formerly called “standalone” or “local” mode, where ksql-cli would run both the CLI and also a KSQL server process inside the same JVM. In 4.1, ksql will only run the CLI. For local development and testing, you can now run confluent start (which will also launch a KSQL server), followed by ksql to start the CLI. This setup is used for the Confluent Platform quickstart. Alternatively, you can start the KSQL server directly as described in Starting the ksqlDB Server, followed by ksql to start the CLI.
- KSQL server:
  - The default listeners address was changed to http://localhost:8088 (KSQL 0.x used http://localhost:8080).
  - Assigning KSQL servers to a specific KSQL cluster has been simplified and is now done with the ksql.service.id setting. See ksql.service.id for details.
- Executing .sql files: To run pre-defined KSQL queries stored in a .sql file, see Non-interactive (Headless) ksqlDB Usage.
- Configuration: Advanced KSQL users can configure the Kafka Streams and Kafka producer/consumer client settings used by KSQL. This is achieved by using prefixes for the respective configuration settings. See Configure ksqlDB Server as well as Configuration Parameter Reference and Configure ksqlDB CLI for details.