Flink SQL Statements in Confluent Cloud for Apache Flink¶
Learn how to use statements for your SQL queries and data processing needs in Confluent Cloud for Apache Flink®.
A statement represents a high-level resource that’s created by Confluent Cloud when you enter a SQL query.
Each statement has a property that holds the SQL query that you entered. Based on the SQL query, the statement may be one of these kinds:
- A metadata operation, or DDL statement
- A background statement, which writes data back to a table/topic while running in the background
- A foreground statement, which returns results directly to the UI or a client
Together, these kinds cover all SQL statements: Data Definition Language (DDL), Data Manipulation Language (DML), and Data Query Language (DQL).
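For example, the following statements sketch each kind. The `orders` and `large_orders` table names are hypothetical placeholders, not part of any predefined schema.

```sql
-- DDL statement: a metadata operation that defines a table (backed by a Kafka topic)
CREATE TABLE large_orders (order_id INT, amount DOUBLE);

-- DML statement: runs as a background statement, continuously writing results to a table/topic
INSERT INTO large_orders SELECT order_id, amount FROM orders WHERE amount > 100;

-- DQL statement: a foreground statement that returns results to the UI or a client
SELECT * FROM orders;
```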
When you submit a SQL query, Confluent Cloud creates a statement resource. You can create a statement resource from any Confluent-supported interface, including the SQL shell, Confluent CLI, Cloud Console, the REST API, and Terraform.
The SQL query within a statement is immutable, which means that you can’t make changes to the SQL query once it’s been submitted. If you need to edit a statement, stop the running statement and create a new statement.
You can change the security principal for a statement. If a statement is running under a user account, you can change it to run under a service account by using the Confluent Cloud Console, Confluent CLI, the REST API, or the Terraform provider. Running a statement under a service account provides better security and stability, ensuring that your statements aren’t affected by changes in user status or authorization.
Also, you can change the compute pool that runs a statement. This can be useful if you’re close to maxing out the resources in one pool.
You must stop the statement before changing the principal or compute pool, then restart the statement after the change.
Confluent Cloud for Apache Flink enforces a 30-day retention for statements in terminal states. For example, once a statement transitions to the STOPPED state, it no longer consumes compute and is deleted after 30 days.
If there is no consumer for the results of a foreground statement for five minutes or longer, Confluent Cloud moves the statement to the STOPPED state.
Statement lifecycle operations¶
These are the supported lifecycle operations for a statement.
Submit a statement¶
List running statements¶
Describe a statement¶
Delete a statement¶
List statement exceptions¶
Stop and resume a statement¶
Queries in Flink¶
Flink enables issuing ANSI-standard SQL queries on data at rest (batch) and data in motion (streams).
These are the kinds of queries that are possible with Flink SQL, illustrated in the sketch after the list.
- Metadata queries
- CRUD operations on catalogs, databases, tables, and other metadata. Because Flink implements ANSI-standard SQL, it uses the database concepts of catalogs, databases, and tables. In Apache Kafka®, these concepts map to environments, Kafka clusters, and topics, respectively.
- Ad-hoc / exploratory queries
- You can issue queries on a topic and see the results immediately. A query can be a batch query (“show me what happened up to now”) or a transient streaming query (“show me what happened up to now and give me updates for the near future”). In either case, when the query or session ends, no more compute is needed.
- Streaming queries
- These queries run continuously, reading data from one or more tables/topics and writing their results to a table/topic.
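The following is a minimal sketch of each kind; the `clicks` and `clicks_per_user` tables and the catalog name are placeholders for illustration:

```sql
-- Metadata query: navigate catalogs (environments), databases (clusters), and tables (topics)
SHOW CATALOGS;
USE CATALOG my_environment;
SHOW TABLES;

-- Ad-hoc/exploratory query: inspect results interactively; compute stops when the session ends
SELECT * FROM clicks LIMIT 10;

-- Streaming query: runs continuously, reading from one table and writing results to another
INSERT INTO clicks_per_user
SELECT user_id, COUNT(*) AS click_count
FROM clicks
GROUP BY user_id;
```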
In general, Flink supports both batch and stream processing, but the exact subset of allowed operations differs slightly depending on the type of query. For more information, see Flink SQL Queries.
All queries are executed in streaming execution mode, whether the sources are bounded or unbounded.
Data lifecycle¶
Broadly speaking, the Flink SQL lifecycle is:
1. Data is read into a Flink table from Kafka via the Flink connector for Kafka.
2. Data is processed using SQL statements.
3. Data is processed by Flink task managers (managed by Confluent and not exposed to users), which are part of the Flink runtime. Some data may be stored temporarily as state in Flink while it’s being processed.
4. Data is returned to the user as a result set:
   - The result set may be bounded, in which case the query terminates.
   - The result set may be unbounded, in which case the query runs until canceled manually.
   OR
5. Data is written back out to one or more tables:
   - Data is stored in Kafka topics.
   - The schema for the table is stored in Flink Metastore and synchronized out to Schema Registry.
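As a minimal sketch of the write-back path, assuming a hypothetical `pageviews` table already exists:

```sql
-- The sink table is backed by a Kafka topic; its schema is synchronized to Schema Registry
CREATE TABLE pageviews_per_region (
  region STRING,
  view_count BIGINT
);

-- A background statement that continuously writes aggregated results to the sink table/topic
INSERT INTO pageviews_per_region
SELECT region, COUNT(*) AS view_count
FROM pageviews
GROUP BY region;
```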
Flink SQL Data Definition Language (DDL) statements¶
Data Definition Language (DDL) statements are imperative statements that define metadata in Flink SQL by adding, changing, or deleting tables. DDL statements modify metadata only and don’t operate on data. Use them with declarative Flink SQL Queries to create your Flink SQL applications.
Flink SQL makes it simple to develop streaming applications using standard SQL. It’s easy to learn Flink SQL if you’ve ever worked with a database or SQL-like system that’s ANSI-SQL 2011 compliant.
Available DDL statements¶
These are the available DDL statements in Confluent Cloud for Apache Flink.
- ALTER
- CREATE
- DESCRIBE
- RESET
- SET
- SHOW
- USE
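The following hypothetical session sketches these statements in use; the table name, catalog name, and option values are assumptions for illustration, not prescriptions:

```sql
USE CATALOG my_environment;                            -- USE: switch the current catalog (environment)
CREATE TABLE orders (order_id INT, amount DOUBLE);     -- CREATE: define a new table (topic)
DESCRIBE orders;                                       -- DESCRIBE: inspect the table's columns
SHOW TABLES;                                           -- SHOW: list tables in the current database
ALTER TABLE orders SET ('changelog.mode' = 'append');  -- ALTER: change a table option
SET 'sql.local-time-zone' = 'UTC';                     -- SET: assign a session property
RESET 'sql.local-time-zone';                           -- RESET: revert a property to its default
```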