Comparing Open Source Flink with Confluent Cloud for Apache Flink¶
Confluent Cloud for Apache Flink® supports many of the capabilities of Open Source Apache Flink® and provides additional features, but it also has some behavioral differences and limitations relative to Open Source (OSS) Flink. This topic describes the key differences between Confluent Cloud for Apache Flink and OSS Flink.
Additional features¶
The following list shows features provided by Confluent Cloud for Apache Flink that go beyond what OSS Flink offers.
Auto-inference of environments, clusters, topics, and schemas¶
In OSS Flink, you must define and configure your tables and their schemas, including authentication and authorization to Apache Kafka®. Confluent Cloud for Apache Flink maps environments, clusters, topics, and schemas automatically from Confluent Cloud to the corresponding Flink concepts of catalogs, databases, tables, and table schemas.
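For example, with auto-inference in place, an existing topic can be queried directly, without any CREATE TABLE or connector configuration. The following is a minimal sketch that assumes an environment named my_env, a cluster named my_cluster, and a topic named orders:

```sql
-- Environments map to catalogs and clusters map to databases,
-- so an existing topic is queryable as a table directly.
USE CATALOG my_env;
USE my_cluster;
SELECT * FROM orders;
```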
Autoscaling¶
Autopilot scales up and scales down the compute resources that SQL statements use in Confluent Cloud. The autoscaling process is based on parallelism, which is the number of parallel operations that occur when the SQL statement is running. A SQL statement performs at its best when it has the optimal resources for its required parallelism.
Default system column implementation¶
Confluent Cloud for Apache Flink has a default implementation for a system column named $rowtime. This column is mapped to the Kafka record timestamp, which can be either LogAppendTime or CreateTime.
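The $rowtime column can be selected like any other column. A minimal sketch, assuming a table named orders:

```sql
-- $rowtime exposes the Kafka record timestamp of each row.
SELECT $rowtime, * FROM orders;
```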
Default watermark strategy¶
Flink requires a watermark strategy for a variety of features, such as windowing and temporal joins. Confluent Cloud for Apache Flink has a default watermark strategy applied on all tables/topics, which is based on the $rowtime system column. OSS Flink requires you to define a watermark strategy manually. For more information, see Event Time and Watermarks.
Because the default strategy is defined for general usage, there are cases that require a custom strategy, for example, when delays in record arrival of longer than 7 days occur in your streams. You can override the default strategy with a custom strategy by using the ALTER TABLE statement.
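For example, the following sketch replaces the default strategy with a custom bounded-out-of-orderness watermark, assuming a table named orders and an illustrative 10-minute delay tolerance:

```sql
-- Override the default watermark strategy with a custom one
-- that tolerates up to 10 minutes of out-of-order records.
ALTER TABLE orders
  MODIFY WATERMARK FOR $rowtime AS $rowtime - INTERVAL '10' MINUTE;
```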
Schema Registry support for JSON_SR and Protobuf¶
Confluent Cloud for Apache Flink supports the Schema Registry formats AVRO, JSON_SR, and Protobuf, while OSS Flink currently supports only the Schema Registry AVRO format.
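As a sketch, the value format can be chosen when a table is created. The table definition and the format identifier 'json-registry' below are assumptions for illustration; check the supported format names for your environment:

```sql
-- Sketch: create a table whose values use a Schema Registry JSON schema.
-- The format name 'json-registry' is an assumption, not a confirmed value.
CREATE TABLE clicks (
  user_id STRING,
  url STRING
) WITH (
  'value.format' = 'json-registry'
);
```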
INFORMATION_SCHEMA support¶
Confluent Cloud for Apache Flink has an implementation of INFORMATION_SCHEMA, a set of system views that provide insights on catalogs, databases, tables, and schemas. INFORMATION_SCHEMA doesn't exist in OSS Flink.
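For example, the tables visible in the current catalog can be listed through an INFORMATION_SCHEMA view. The following is a minimal sketch; the view and column names follow the SQL standard, but check the INFORMATION_SCHEMA reference for the exact set that's available:

```sql
-- List the tables that are visible in the current catalog.
SELECT `TABLE_NAME`
FROM `INFORMATION_SCHEMA`.`TABLES`;
```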
Behavioral differences¶
The following list shows differences in behavior between Confluent Cloud for Apache Flink and OSS Flink.
Configuration options¶
OSS Flink supports various optimization configuration options on different levels, like Execution Options, Optimizer Options, Table Options, and SQL Client Options. Confluent Cloud for Apache Flink supports only the necessary subset of these options.
Some of these options have different names in Confluent Cloud for Apache Flink, as shown in the following table.
| Confluent Cloud for Apache Flink | OSS Flink |
|---|---|
| client.results-timeout | table.exec.async-lookup.timeout |
| client.statement-name | – |
| sql.current-catalog | table.builtin-catalog-name |
| sql.current-database | table.builtin-database-name |
| sql.local-time-zone | table.local-time-zone |
| sql.state-ttl | table.exec.state.ttl |
| sql.tables.scan.bounded.mode | scan.bounded.mode |
| sql.tables.scan.bounded.timestamp-millis | scan.bounded.timestamp-millis |
| sql.tables.scan.idle-timeout | table.exec.source.idle-timeout |
| sql.tables.scan.startup.mode | scan.startup.mode |
| sql.tables.scan.startup.timestamp-millis | scan.startup.timestamp-millis |
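These options are set with standard SET statements. A minimal sketch using sql.state-ttl, with an illustrative value:

```sql
-- Expire state for idle keys after one hour.
-- In OSS Flink, the equivalent option is 'table.exec.state.ttl'.
SET 'sql.state-ttl' = '1 h';
```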
CREATE statements provision underlying resources¶
When you run a CREATE TABLE statement in Confluent Cloud for Apache Flink, it creates the underlying Kafka topic and a Schema Registry schema in Confluent Cloud. In OSS Flink, a CREATE TABLE statement only registers the object in the Flink catalog and doesn’t create an underlying resource.
This also means that temporary tables are not supported in Confluent Cloud for Apache Flink, while they are in OSS Flink.
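For example, the following statement registers the table in the catalog and, in Confluent Cloud, also provisions the backing resources. The table name and columns are illustrative:

```sql
-- In Confluent Cloud, this creates the backing Kafka topic and
-- registers a value schema in Schema Registry; in OSS Flink, it
-- would only register the table in the catalog.
CREATE TABLE clicks (
  user_id STRING,
  url STRING,
  view_time TIMESTAMP_LTZ(3)
);
```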
One Kafka connector and only Confluent Cloud support¶
OSS Flink provides both a Kafka connector and an Upsert Kafka connector; the choice of connector, combined with the format, determines whether a source/sink is treated as an append stream or an update stream. Confluent Cloud for Apache Flink has only one Kafka connector and determines whether a source/sink is an append stream or an update stream from the changelog.mode connector option.
Confluent Cloud for Apache Flink supports reading from and writing to only Kafka topics that are located in Confluent Cloud. OSS Flink supports other connectors, like Kinesis, Pulsar, and JDBC, as well as other Kafka environments, like on-premises deployments and other cloud service providers.
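As a sketch, the changelog mode can be set per table. The following assumes a table named clicks that defines a primary key, which upsert mode requires; other modes include 'append' and 'retract':

```sql
-- Treat the table as an update stream instead of an append stream.
ALTER TABLE clicks SET ('changelog.mode' = 'upsert');
```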
Limitations¶
The following list shows limitations of Confluent Cloud for Apache Flink compared with OSS Flink.
Windowing functions syntax¶
Confluent Cloud for Apache Flink supports the TUMBLE, HOP, SESSION, and CUMULATE windowing functions only through the table-valued function (TVF) syntax. OSS Flink also supports these windowing functions through the legacy Group Window Aggregation syntax.
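The following sketch shows the supported TVF syntax for a tumbling window, assuming a table named orders and using the default $rowtime column as the time attribute:

```sql
-- Count orders per 5-minute tumbling window.
SELECT window_start, window_end, COUNT(*) AS order_count
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR($rowtime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;
```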
Currently unsupported statements and features¶
Confluent Cloud for Apache Flink does not yet support the following:
- ANALYZE statements
- CALL statements
- CATALOG commands other than SHOW (No CREATE/DROP/ALTER)
- DATABASE commands other than SHOW (No CREATE/DROP/ALTER)
- DELETE statements
- DROP statements, like DROP TABLE, DROP CATALOG, DROP DATABASE, and DROP FUNCTION
- FUNCTION statements
- JAR statements
- LOAD / UNLOAD statements
- TRUNCATE statements
- UPDATE statements
- Views
Limited support for ALTER¶
Confluent Cloud for Apache Flink has limited support for ALTER TABLE compared with OSS Flink. In Confluent Cloud for Apache Flink, you can use ALTER TABLE only to change the watermark strategy, add a metadata column, or change a parameter value.
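For example, the following sketch adds a metadata column; the table name, column name, and the 'timestamp' metadata key are illustrative:

```sql
-- Add a read-only metadata column that exposes the record timestamp.
ALTER TABLE clicks ADD (
  record_time TIMESTAMP_LTZ(3) METADATA FROM 'timestamp' VIRTUAL
);
```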