EXPLAIN Statement in Confluent Cloud for Apache Flink

Confluent Cloud for Apache Flink®️ enables viewing and analyzing the query plans of Flink SQL statements.

Syntax

EXPLAIN { <query_statement> | <insert_statement> | <statement_set> }

<statement_set>:
STATEMENT SET
BEGIN
  -- one or more INSERT INTO statements
  { INSERT INTO <table_name> <select_statement>; }+
END;
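
The statement-set form of the syntax can also be assembled programmatically. The following is a minimal sketch in Python, assuming you already have the individual INSERT INTO statements as strings; the helper name `explain_statement_set` is hypothetical and not part of any Confluent or Flink API.

```python
def explain_statement_set(insert_statements):
    """Wrap one or more INSERT INTO statements in EXPLAIN STATEMENT SET."""
    if not insert_statements:
        raise ValueError("a statement set needs at least one INSERT INTO statement")
    lines = ["EXPLAIN STATEMENT SET", "BEGIN"]
    for stmt in insert_statements:
        # Normalize so that each statement ends with exactly one semicolon.
        lines.append("  " + stmt.strip().rstrip(";").rstrip() + ";")
    lines.append("END;")
    return "\n".join(lines)


sql = explain_statement_set([
    "INSERT INTO low_orders SELECT * FROM `orders` WHERE price < 100",
    "INSERT INTO high_orders SELECT * FROM `orders` WHERE price > 100;",
])
print(sql)
```

The helper only builds the statement text. Submitting it to Confluent Cloud is left to whichever client you already use.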

Description

The EXPLAIN statement provides detailed information about how Flink executes a specified query, INSERT statement, or statement set. EXPLAIN shows:

  • The optimized physical execution plan
  • If the changelog mode is not append-only, details about the changelog mode per operator

This information is valuable for understanding query performance, optimizing complex queries, and debugging unexpected results.

The output of an EXPLAIN statement typically includes:

  • A high-level overview of the physical plan
  • Detailed information about each operation in the plan
  • Additional metadata, like changelog modes for streaming queries

Example

The following EXPLAIN statements analyze a query, an INSERT statement, and a statement set. The first statement analyzes an example query that finds users who have clicked but never placed an order.

EXPLAIN
SELECT c.*
FROM clicks c
LEFT JOIN (
  SELECT DISTINCT customer_id
  FROM orders
) o ON c.user_id = o.customer_id
WHERE o.customer_id IS NULL;

EXPLAIN
INSERT INTO orders VALUES
  (1, 1001, '2023-02-24', 50.0),
  (2, 1002, '2023-02-25', 60.0),
  (3, 1003, '2023-02-26', 70.0);

EXPLAIN STATEMENT SET
BEGIN
  INSERT INTO low_orders SELECT * FROM `orders` WHERE price < 100;
  INSERT INTO high_orders SELECT * FROM `orders` WHERE price > 100;
END;

The output of the first statement, the query that joins clicks with orders, resembles:

== Physical Plan ==

StreamPhysicalSink [11]
  +- StreamPhysicalCalc [10]
    +- StreamPhysicalJoin [9]
      +- StreamPhysicalExchange [3]
      :  +- StreamPhysicalCalc [2]
      :    +- StreamPhysicalTableSourceScan [1]
      +- StreamPhysicalExchange [8]
        +- StreamPhysicalGroupAggregate [7]
          +- StreamPhysicalExchange [6]
            +- StreamPhysicalCalc [5]
              +- StreamPhysicalTableSourceScan [4]

== Physical Details ==

[1] StreamPhysicalTableSourceScan
Table: `examples`.`marketplace`.`clicks`

[4] StreamPhysicalTableSourceScan
Table: `examples`.`marketplace`.`orders`

[7] StreamPhysicalGroupAggregate
Changelog mode: retract

[8] StreamPhysicalExchange
Changelog mode: retract

[9] StreamPhysicalJoin
Changelog mode: retract

[10] StreamPhysicalCalc
Changelog mode: retract

[11] StreamPhysicalSink
Changelog mode: retract
Table: Foreground
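
The indentation of the Physical Plan section encodes the operator tree. As a minimal sketch, assuming the layout matches the sample output above (each child is introduced by a `+-` marker and the bracketed number is the operator ID), the following Python helper recovers the parent of each operator. The name `parse_plan` is hypothetical and not part of any Confluent or Flink API.

```python
import re

# The Physical Plan section from the sample output, one string per line.
PLAN = "\n".join([
    "StreamPhysicalSink [11]",
    "  +- StreamPhysicalCalc [10]",
    "    +- StreamPhysicalJoin [9]",
    "      +- StreamPhysicalExchange [3]",
    "      :  +- StreamPhysicalCalc [2]",
    "      :    +- StreamPhysicalTableSourceScan [1]",
    "      +- StreamPhysicalExchange [8]",
    "        +- StreamPhysicalGroupAggregate [7]",
    "          +- StreamPhysicalExchange [6]",
    "            +- StreamPhysicalCalc [5]",
    "              +- StreamPhysicalTableSourceScan [4]",
])

# Optional tree-drawing prefix, then "<OperatorName> [<id>]".
NODE = re.compile(r"^[ :]*(?:\+- )?(\w+) \[(\d+)\]\s*$")


def parse_plan(text):
    """Return {operator_id: parent_id}; the root operator maps to None."""
    parents = {}
    stack = []  # (column of the "+-" marker, operator id), root first
    for line in text.splitlines():
        match = NODE.match(line)
        if not match:
            continue
        column = line.find("+-")  # -1 for the root, which has no marker
        op_id = int(match.group(2))
        # Drop operators that are not ancestors of the current line.
        while stack and stack[-1][0] >= column:
            stack.pop()
        parents[op_id] = stack[-1][1] if stack else None
        stack.append((column, op_id))
    return parents


parents = parse_plan(PLAN)
```

For the sample plan, both exchanges [3] and [8] report the join [9] as their parent, which corresponds to the two inputs of the join.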

Understanding the output

The EXPLAIN output helps you understand query execution, identify potential bottlenecks, and optimize your Flink SQL queries in Confluent Cloud. It has the following parts:

  • Physical Plan: This section shows the overall structure of the query execution plan. Each operation is numbered and indented to show its position in the plan hierarchy.
  • Physical Details: This section provides more information about specific operations in the plan:
    • Table scans show the source tables.
    • Aggregations, exchanges, and joins indicate their changelog modes.
    • The final sink operation shows where the results are written.
  • Changelog Modes: For streaming queries, the changelog mode of an operation, for example “retract”, indicates how Flink handles updates to that operation's results. Operations without a listed changelog mode are append-only.
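
The Physical Details section can be read mechanically in the same way. The following Python sketch assumes the layout shown in the sample output: a `[id] OperatorName` header line followed by `Key: value` lines. It uses an abbreviated copy of that output, and the name `parse_details` is hypothetical. Because EXPLAIN lists a changelog mode only when it is not append-only, the operators that carry a "Changelog mode" entry are the ones that emit updates.

```python
import re

# Abbreviated Physical Details section from the sample output.
DETAILS = "\n".join([
    "[1] StreamPhysicalTableSourceScan",
    "Table: `examples`.`marketplace`.`clicks`",
    "",
    "[7] StreamPhysicalGroupAggregate",
    "Changelog mode: retract",
    "",
    "[11] StreamPhysicalSink",
    "Changelog mode: retract",
    "Table: Foreground",
])

HEADER = re.compile(r"^\[(\d+)\] (\w+)$")


def parse_details(text):
    """Return {operator_id: {"operator": name, <key>: <value>, ...}}."""
    operators = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = HEADER.match(line)
        if match:
            # A header line starts the details of a new operator.
            current = {"operator": match.group(2)}
            operators[int(match.group(1))] = current
        elif current is not None and ": " in line:
            # "Key: value" lines belong to the most recent header.
            key, value = line.split(": ", 1)
            current[key] = value
    return operators


details = parse_details(DETAILS)
# Operators without a "Changelog mode" entry are append-only.
non_append_only = [op for op, info in details.items() if "Changelog mode" in info]
```

In this abbreviated sample, `non_append_only` contains the aggregate [7] and the sink [11], while the source scan [1] is append-only.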