SELECT Statement¶
Important
Confluent Cloud for Apache Flink®️ is currently available for Preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing Preview releases of the Preview features at any time in Confluent’s sole discretion. Check out Getting Help for questions, feedback and requests.
For Flink SQL features and limitations in the preview program, see Notable Limitations in Public Preview.
Confluent Cloud for Apache Flink®️ enables querying the content of your tables by using familiar SELECT syntax.
Syntax¶
SELECT [DISTINCT] select_list FROM table_expression [ WHERE boolean_expression ]
Description¶
Select list¶
The select_list
specification *
means the query resolves all
columns. But in production, using *
is not recommended, because it
makes queries less robust to catalog changes. Instead, use a select_list
to specify a subset of available columns or make calculations using
the columns. For example, if an orders
table has columns named
order_id
, price
, and tax
you could write the following query:
SELECT order_id, price + tax FROM orders
Table expression¶
The table_expression
can be any source of data, including a table, view,
or VALUES
clause, the joined results of multiple existing tables, or a
subquery.
Assuming that an orders
table is available in the catalog, the
following would read all rows from .
SELECT * FROM orders;
VALUES
clause¶
Queries can consume from inline data by using the VALUES
clause. Each tuple
corresponds to one row. You can provide an alias to assign a name to each column.
SELECT order_id, price
FROM (VALUES (1, 2.0), (2, 3.1))
AS t (order_id, price);
Your output should resemble:
order_id price
1 2.0
2 3.1
Functions¶
You can invoke built-in scalar functions on the columns of a single row.
SELECT PRETTY_PRINT(order_id) FROM orders;
DISTINCT¶
If SELECT DISTINCT
is specified, all duplicate rows are removed from
the result set, which means that one row is kept from each group of duplicates.
For streaming queries, the required state for computing the query result might grow infinitely. State size depends on the number of distinct rows.
SELECT DISTINCT id FROM orders;
Examples¶
In the Flink SQL shell or in a Cloud Console workspace, run the following commands to see examples of the SELECT statement.
Create a table for web page click events.
-- Create a table for web page click events. CREATE TABLE clicks ( ip_address VARCHAR, url VARCHAR, click_ts_raw BIGINT );
Populate the table with mock clickstream data.
-- Populate the table with mock clickstream data. INSERT INTO clicks VALUES( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692826575), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575), ( '10.0.0.1', 'https://acme.com/index.html', 1692812175), ( '10.0.0.12', 'https://apache.org/index.html', 1692819375), ( '10.0.0.13', 'https://confluent.io/index.html', 1692826575);
Press ENTER to return to the SQL shell. Because INSERT INTO VALUES is a point-in-time statement, it exits after it completes inserting records.
View all rows in the
clicks
table by using a SELECT statement.SELECT * FROM clicks;
Your output should resemble:
ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692826575 10.0.0.13 https://confluent.io/index.html 1692826575 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692819375 10.0.0.13 https://confluent.io/index.html 1692826575
View only unique rows in the
clicks
table by using a SELECT DISTINCT statement.SELECT DISTINCT * FROM clicks;
Your output should resemble:
ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.12 https://apache.org/index.html 1692826575 10.0.0.13 https://confluent.io/index.html 1692826575 10.0.0.12 https://apache.org/index.html 1692819375
View only records that have the ip_address of
10.0.0.1
by using a SELECT WHERE statement.SELECT * FROM clicks WHERE ip_address='10.0.0.1';
Your output should resemble:
ip_address url click_ts_raw 10.0.0.1 https://acme.com/index.html 1692812175 10.0.0.1 https://acme.com/index.html 1692812175