ksqlDB Operations¶

Watch the screencast of Taking KSQL to Production on YouTube.

Local Development and Testing with Confluent CLI¶

For development and testing purposes, you can use Confluent CLI to spin up services on a single host. For more information, see the Quick Start for Confluent Platform.

Important

The confluent local commands are intended for a single-node development environment and are not suitable for production. Data produced in this environment is transient and intended to be temporary. For production-ready workflows, see Install and Upgrade Confluent Platform.

Installing and Configuring ksqlDB¶

You have a number of options when you set up ksqlDB Server. For more information on installing and configuring ksqlDB, see the following topics.

Starting and Stopping ksqlDB Clusters¶

ksqlDB provides the following start and stop scripts:

ksql-server-start: This script starts the ksqlDB server. It requires a server configuration file as an argument and is located in the /bin directory of your Confluent Platform installation. For more information, see Starting the ksqlDB Server.
ksql-server-stop: This script stops the ksqlDB server. It is located in the /bin directory of your Confluent Platform installation.

Health Checks¶

The ksqlDB REST API supports a “server info” request at http://<server>:8088/info and a basic server health check endpoint at http://<server>:8088/healthcheck.
Check runtime stats for the ksqlDB server that you are connected to using DESCRIBE <stream or table> EXTENDED and EXPLAIN <name of query>.
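
As a minimal sketch, both endpoints can be polled from a shell with curl. The localhost address and the function names ksql_endpoint and ksql_health are assumptions for illustration; only the /info and /healthcheck paths come from the text above.

```shell
# Base URL of the ksqlDB server; localhost is an assumed placeholder.
KSQL_URL="${KSQL_URL:-http://localhost:8088}"

# Build the full URL for a REST path such as "info" or "healthcheck".
ksql_endpoint() {
  printf '%s/%s\n' "${KSQL_URL%/}" "$1"
}

# Query both endpoints; curl -f turns an HTTP error status into a non-zero exit code.
ksql_health() {
  curl -fsS "$(ksql_endpoint info)" && echo &&
    curl -fsS "$(ksql_endpoint healthcheck)" && echo
}

# Usage (requires a running server):
#   KSQL_URL=http://<server>:8088 ksql_health
```

Because ksql_health exits non-zero when either request fails, it can be dropped into a readiness probe or a cron check.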

Monitoring and Metrics¶

ksqlDB includes JMX (Java Management Extensions) metrics, which give insights into what is happening inside your ksqlDB servers. These metrics include the number of messages, the total throughput, throughput distribution, error rate, and more.

To enable JMX metrics, set JMX_PORT before starting the ksqlDB server:

export JMX_PORT=1099 && \
$CONFLUENT_HOME/bin/ksql-server-start $CONFLUENT_HOME/etc/ksqldb/ksql-server.properties

For more information about Kafka Streams metrics, see Monitoring Kafka Streams Applications.

Capacity Planning¶

The Capacity Planning guide describes how to size your ksqlDB clusters.

Troubleshooting¶

SELECT query hangs and doesn’t stop?¶

Queries in ksqlDB, including non-persistent queries such as SELECT * FROM myTable EMIT CHANGES, are continuous streaming queries. Streaming queries don't stop unless they are explicitly terminated. To terminate a non-persistent query in the ksqlDB CLI, press Ctrl+C.

No results from `SELECT * FROM` table or stream?¶

This typically happens when the query is configured to process only newly arriving data and no new input records are being received. To fix, do one of the following:

Run SET 'auto.offset.reset' = 'earliest';. For more information, see Configure ksqlDB CLI and Configure ksqlDB Server.
Write new records to the input topics.
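
Outside the CLI, the same offset setting can be passed per request through the ksqlDB REST API. The following is a sketch rather than a prescribed method: it assumes a server at localhost:8088, uses the hypothetical helper names build_query_payload and run_query_from_beginning, and adds a LIMIT clause so the request returns instead of streaming indefinitely.

```shell
# Base URL of the ksqlDB server; localhost is an assumed placeholder.
KSQL_URL="${KSQL_URL:-http://localhost:8088}"

# Build the JSON body for the /query endpoint.
# $1 = ksqlDB statement; this simple sketch does not escape double quotes.
build_query_payload() {
  printf '{"ksql": "%s", "streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}\n' "$1"
}

# Send the statement so that it reads the input topic from the earliest offset.
run_query_from_beginning() {
  curl -fsS -X POST "${KSQL_URL%/}/query" \
    -H "Content-Type: application/vnd.ksql.v1+json" \
    -d "$(build_query_payload "$1")"
}

# Usage (requires a running server):
#   run_query_from_beginning 'SELECT * FROM myTable EMIT CHANGES LIMIT 3;'
```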

Can’t create a stream from the output of windowed aggregate?¶

ksqlDB doesn’t support structured keys, so you can’t create a stream from a windowed aggregate.

ksqlDB doesn’t clean up its internal topics?¶

Make sure that your Apache Kafka® cluster is configured with delete.topic.enable=true. For more information, see deleteTopics.
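
One way to confirm the setting is to read it from the broker's properties file. This sketch assumes you have access to that file; the function name delete_topic_enabled and the example path are hypothetical. It prints "unset" when the key is absent, in which case the broker's built-in default applies.

```shell
# Print the delete.topic.enable value from a broker properties file.
# $1 = path to the broker's server.properties (the path is an assumption).
delete_topic_enabled() {
  # Keep the last matching line, since later entries override earlier ones.
  value=$(sed -n 's/^[[:space:]]*delete\.topic\.enable[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1)
  printf '%s\n' "${value:-unset}"
}

# Usage:
#   delete_topic_enabled "$CONFLUENT_HOME/etc/kafka/server.properties"
```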

ksqlDB CLI doesn’t connect to ksqlDB server?¶

The following warning may occur when you start the ksqlDB CLI.

**************** WARNING ******************
Remote server address may not be valid:
Error issuing GET to KSQL server

Caused by: java.net.SocketException: Connection reset
Caused by: Connection reset
*******************************************

Also, you may see a similar error when you create a ksqlDB query by using the CLI.

Error issuing POST to KSQL server
Caused by: java.net.SocketException: Connection reset
Caused by: Connection reset

In both cases, the CLI can’t connect to the ksqlDB server, which may be caused by one of the following conditions:

ksqlDB CLI isn’t connected to the correct ksqlDB server port.
ksqlDB server isn’t running.
ksqlDB server is running but listening on a different port.

Check the port that ksqlDB CLI is using¶

Ensure that the ksqlDB CLI is configured with the correct ksqlDB server port. By default, the server listens on port 8088. For more info, see Starting the ksqlDB CLI.

Check the ksqlDB server configuration¶

In the ksqlDB server configuration file, check that the list of listeners has the host address and port configured correctly. Look for the listeners setting:

listeners=http://0.0.0.0:8088

Or if you are running over IPv6:

listeners=http://[::]:8088

For more info, see Starting ksqlDB Server.
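
To compare the configured port with the one the CLI uses, you can extract it from the properties file. This is a rough sketch: the function name listener_port is hypothetical, and it assumes the listeners setting sits on a single uncommented line in the form shown above. If several listeners are listed, it prints the port of the last one.

```shell
# Print the port from the listeners setting of a ksqlDB server properties file.
# $1 = path to the properties file.
listener_port() {
  # Capture the digits after the last colon, which also works for IPv6 hosts like [::].
  sed -n 's/^listeners=.*:\([0-9][0-9]*\).*$/\1/p' "$1"
}

# Usage (requires a running server for the curl probe):
#   port=$(listener_port "$CONFLUENT_HOME/etc/ksqldb/ksql-server.properties")
#   curl -fsS "http://localhost:${port}/info"
```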

Check for a port conflict¶

There may be another process running on the port that the ksqlDB server listens on. Use the following command to check the process that’s running on the port assigned to the ksqlDB server. The following example checks the default port, which is 8088.

netstat -anv | egrep -w '.*8088.*LISTEN'

Your output should resemble:

tcp4  0 0  *.8088       *.*    LISTEN      131072 131072    46314      0

In this example, 46314 is the PID of the process that’s listening on port 8088. Run the following command to get info on the process:

ps -wwwp <pid>

Your output should resemble:

io.confluent.ksql.rest.server.KsqlServerMain ./config/ksql-server.properties

If the KsqlServerMain process isn’t shown, a different process has taken the port that KsqlServerMain would normally use. Check the assigned listeners in the ksqlDB server configuration, and restart the ksqlDB CLI with the correct port.
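
The netstat and ps steps above can be chained. The sketch below assumes the column layout of the sample output shown earlier, where the PID is the ninth field; other operating systems and netstat versions order columns differently, so check your own output first. The function name pid_from_netstat is hypothetical.

```shell
# Read netstat output on stdin and print the PID column for listening sockets.
# Assumes the layout of the sample output above: state in field 6, PID in field 9.
pid_from_netstat() {
  awk '$6 == "LISTEN" { print $9 }'
}

# Usage: look up each process that is listening on port 8088.
#   netstat -anv | grep -Ew '.*8088.*LISTEN' | pid_from_netstat | xargs -n 1 ps -wwwp
```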

Replicated topic with Avro schema causes errors?¶

Confluent Replicator renames topics during replication, and if there are associated Avro schemas, they aren’t automatically matched with the renamed topics.

In the ksqlDB CLI, the PRINT statement for a replicated topic works, which shows that the Avro schema ID exists in Schema Registry, and ksqlDB can deserialize the Avro message. But CREATE STREAM fails with a deserialization error:

CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews.replica', value_format='AVRO');

[2018-06-21 19:12:08,135] WARN task [1_6] Skipping record due to deserialization error. topic=[pageviews.replica] partition=[6] offset=[1663] (org.apache.kafka.streams.processor.internals.RecordDeserializer:86)
org.apache.kafka.connect.errors.DataException: pageviews.replica
        at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:97)
        at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:48)
        at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:27)

The solution is to register schemas manually against the replicated subject name for the topic:

# Original topic name = pageviews
# Replicated topic name = pageviews.replica
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data "{\"schema\": $(curl -s http://localhost:8081/subjects/pageviews-value/versions/latest | jq '.schema')}" http://localhost:8081/subjects/pageviews.replica-value/versions
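
After registering the schema, you may want to confirm that the replicated subject now has a version. This sketch assumes Schema Registry at localhost:8081, as in the command above, and the same <topic>-value subject naming. The function names value_subject and list_versions are hypothetical.

```shell
# Base URL of Schema Registry; localhost:8081 matches the command above.
SR_URL="${SR_URL:-http://localhost:8081}"

# Subject name for a topic's value schema, following the <topic>-value pattern used above.
value_subject() {
  printf '%s-value\n' "$1"
}

# List the registered schema versions for a topic's value subject.
list_versions() {
  curl -fsS "${SR_URL%/}/subjects/$(value_subject "$1")/versions"
}

# Usage (requires a running Schema Registry):
#   list_versions pageviews.replica
```

If the subject is missing, Schema Registry returns an error status and curl -f makes list_versions exit non-zero.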

Check ksqlDB server logs¶

If you’re still having trouble, check the ksqlDB server logs for errors:

confluent log ksql-server

Look for logs in the default directory at /usr/local/logs or in the LOG_DIR that you assign when you start the ksqlDB CLI. For more info, see Starting the ksqlDB CLI.