Frequently Asked Questions (FAQ) on Confluent Platform for Apache Flink¶
The following sections provide answers to some of the most common questions about Confluent Platform for Apache Flink®.
What is Confluent Platform for Apache Flink?¶
Confluent Platform for Apache Flink is a component of Confluent Platform that enables you to run and manage Apache Flink® applications on-premises alongside other Confluent Platform components.
What is the relationship between Confluent Platform for Apache Flink and Flink on Confluent Cloud?¶
Confluent Platform for Apache Flink and Confluent Cloud for Apache Flink are two separate products that have different deployment models:
- Confluent Platform for Apache Flink is designed for on-premises deployments, providing a managed experience for running Flink applications alongside other Confluent Platform components.
- Confluent Cloud for Apache Flink provides a cloud-native, serverless service for Flink that enables simple and scalable stream processing that easily integrates with Kafka.
How do I install Confluent Platform for Apache Flink?¶
You can install Confluent Platform for Apache Flink using Helm. For more details, see Install Confluent Manager for Apache Flink with Helm.
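As a minimal sketch, a Helm installation typically looks like the following. The repository URL is Confluent's standard Helm repository; the chart name and namespace shown are assumptions, so confirm them against the installation guide:

    # Add the Confluent Helm repository and refresh the index.
    helm repo add confluentinc https://packages.confluent.io/helm
    helm repo update

    # Install CMF into its own namespace. The chart name and namespace
    # here are assumptions; confirm them against the installation guide.
    helm upgrade --install cmf confluentinc/confluent-manager-for-apache-flink \
      --namespace cmf --create-namespace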
What license do I need for Confluent Platform for Apache Flink?¶
Confluent Platform for Apache Flink is a Confluent Enterprise feature. You need a Confluent Platform for Apache Flink license to use it.
What is the advantage of using Confluent Platform for Apache Flink over Apache Flink?¶
Confluent provides support and security patches for Confluent Platform for Apache Flink just like other Confluent Platform components. For more information, see Confluent Platform for Apache Flink.
What is Confluent Manager for Apache Flink (CMF)?¶
CMF is the central management component that enables users to securely manage a fleet of Flink applications across multiple environments. CMF sits next to other Confluent Platform components (like Confluent Server and Kafka Connect) and exposes its functionality primarily through a REST API, which is also used by the Confluent CLI and Confluent Control Center.
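For example, assuming the CMF REST API is reachable on localhost port 8080 (the host, port, and base path below are assumptions to verify against the REST API reference), listing environments is a single request:

    # List all Flink environments known to CMF.
    # Host, port, and base path are illustrative assumptions.
    curl http://localhost:8080/cmf/api/v1/environments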
How do I get started with Confluent Platform for Apache Flink?¶
To get started with Confluent Platform for Apache Flink, you can either create and submit a Flink SQL statement or deploy a Flink application.
- To learn how to submit a Flink SQL statement with Confluent Manager for Apache Flink (CMF), see Submit Flink SQL Statement with Confluent Manager for Apache Flink.
- To learn how to deploy a Flink application with CMF, see Get Started with Applications in Confluent Manager for Apache Flink.
What is the purpose of a Flink environment in CMF?¶
Flink environments serve two main roles:
- Isolation: They enable logical isolation via access control (RBAC is scoped at the Environment level) and physical isolation by specifying the target Kubernetes namespace for deployment.
- Shared Configuration: They allow configuration options (like observability settings or checkpoint storage location) to be set at the Environment level, taking precedence over settings in individual Flink applications. This helps separate concerns between platform operators and developers.
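As an illustrative sketch of both roles, creating an environment that pins a Kubernetes namespace and sets a shared default might look like the following; the endpoint path and JSON field names are assumptions, so check the REST API reference for the exact schema:

    # Create an environment that targets the "flink-team-a" namespace
    # and sets a default checkpoint storage location for its applications.
    # Endpoint and JSON field names are illustrative assumptions.
    curl -X POST http://localhost:8080/cmf/api/v1/environments \
      -H "Content-Type: application/json" \
      -d '{
            "name": "team-a",
            "kubernetesNamespace": "flink-team-a",
            "flinkApplicationDefaults": {
              "spec": {
                "flinkConfiguration": {
                  "state.checkpoints.dir": "s3://my-bucket/checkpoints"
                }
              }
            }
          }'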
What are Flink applications in CMF?¶
CMF applications are resources consisting of a Flink job (packaged as a Java JAR file), a Flink configuration, the specification of a Flink Kubernetes cluster, and status information. Every application runs on its own cluster, providing isolation between all applications.
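The application specification closely mirrors a Flink Kubernetes deployment spec (job JAR, image, and cluster resources). A hedged sketch of creating one through the REST API follows; the endpoint path, field names, and values are assumptions:

    # Deploy a Flink application into the "team-a" environment.
    # Exact fields and values are illustrative assumptions.
    curl -X POST http://localhost:8080/cmf/api/v1/environments/team-a/applications \
      -H "Content-Type: application/json" \
      -d '{
            "metadata": { "name": "my-app" },
            "spec": {
              "image": "confluentinc/cp-flink:1.19.1-cp1",
              "flinkVersion": "v1_19",
              "jobManager": { "resource": { "memory": "1024m", "cpu": 1 } },
              "taskManager": { "resource": { "memory": "1024m", "cpu": 1 } },
              "job": {
                "jarURI": "local:///opt/flink/examples/streaming/StateMachineExample.jar",
                "parallelism": 2,
                "upgradeMode": "stateless"
              }
            }
          }'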
What are Flink application instances?¶
A Flink application instance tracks the details of a deployed Flink application. A new application instance (with a unique identifier or UID) is created every time the specification for a Flink application is changed. Instances allow users to:
- Understand the effective specification of the application after environment defaults have been applied.
- Track changes made to the application over time.
- Correlate Flink application activity with centralized logging systems, as the instance name is provided as an annotation on Kubernetes pods.
- Track the status of the underlying Flink job, which is especially useful for finite streaming workloads (batch processing).
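Because the instance name is attached as a pod annotation, you can correlate pods with instances using standard kubectl queries; the exact annotation key is not shown here, so inspect the pod metadata in your deployment:

    # Show annotations for the pods in the application's target namespace;
    # one of them carries the Flink application instance name.
    kubectl get pods -n flink-team-a \
      -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations}{"\n"}{end}'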
How can I manage CMF resources?¶
Confluent provides a number of ways to manage CMF resources.
You can use:
- The Confluent CLI. To manage Flink resources with the Confluent CLI, you must be running version 4.7.0 or later and should be logged out of Confluent Cloud to access these on-premises commands. For more information, see CLI Operations for Confluent Manager for Apache Flink.
- REST APIs. You can use the REST APIs to manage Flink environments and applications. For more information, see REST APIs for Confluent Manager for Apache Flink.
- Confluent for Kubernetes to manage Flink resources in Kubernetes. To manage Flink environments and applications in Kubernetes, see Manage Flink With Confluent for Kubernetes.
- Confluent Control Center, which provides a graphical interface for managing some Flink resources. To manage Flink environments and applications in Confluent Control Center, see Use Control Center with Confluent Manager for Apache Flink.
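For instance, once the CLI is pointed at your CMF instance, day-to-day operations are single commands. The subcommand and flag names below are a sketch; verify them with the CLI help:

    # Point the CLI at CMF and list resources. Subcommand and flag names
    # are assumptions; verify with "confluent flink --help".
    confluent flink environment list --url http://localhost:8080
    confluent flink application list --environment team-a --url http://localhost:8080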
What are Flink statements and their limitations in CMF?¶
Statements are CMF resources representing Flink SQL queries. Flink SQL support in CMF is currently available as an open preview feature.
There are three main types of statements:
- Statements reading catalog metadata, for example SHOW TABLES: Immediately executed by CMF without creating a Flink deployment, typically for interactive scenarios.
- Interactive SELECT statements: Executed on Flink clusters, collecting results retrievable via the Statement Results endpoint (ad-hoc data exploration).
- Detached INSERT INTO statements: Executed on Flink clusters to deploy data pipeline jobs in production, writing results into a table backed by a Kafka topic.
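For example, a detached INSERT INTO statement could be submitted through the Statements endpoint roughly as follows; the endpoint path, JSON field names, and compute pool name are assumptions:

    # Submit a detached INSERT INTO statement that continuously copies
    # filtered rows from one Kafka-backed table into another.
    # Endpoint and JSON schema are illustrative assumptions.
    curl -X POST http://localhost:8080/cmf/api/v1/environments/team-a/statements \
      -H "Content-Type: application/json" \
      -d '{
            "metadata": { "name": "orders-pipeline" },
            "spec": {
              "computePoolName": "pool-1",
              "statement": "INSERT INTO filtered_orders SELECT * FROM orders WHERE amount > 100;"
            }
          }'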
Key limitations of statements include the following:
- No support for CREATE TABLE, ALTER TABLE, DROP TABLE, or EXPLAIN statements.
- Compacted Kafka topics are not supported.
- User-defined functions are not supported.
- SELECT and INSERT INTO statements with updating results are not supported.
For more details on what is and is not supported, see Features & Support for Statements in Confluent Manager for Apache Flink.
How are data sources connected for Flink SQL Statements?¶
Flink SQL uses catalogs to connect to external storage systems. Confluent Manager for Apache Flink features built-in Kafka catalogs that expose Kafka topics as tables and derive their schemas from Schema Registry.
When configuring a catalog:
- A catalog references a Schema Registry instance and one or more Kafka clusters.
- Each Kafka cluster is represented as a DATABASE, and each topic of a cluster is a TABLE in that database.
- Sensitive connection properties (like credentials) must be stored separately in Flink Secrets.
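A hedged sketch of registering a Kafka catalog follows; the endpoint path and field names are assumptions, but it illustrates the three key points above: a Schema Registry reference, a list of Kafka clusters mapped to databases, and credentials referenced from separately stored secrets:

    # Register a Kafka catalog that exposes one cluster as a database.
    # Field names are illustrative assumptions; credentials are referenced
    # from Flink Secrets rather than stored inline.
    curl -X POST http://localhost:8080/cmf/api/v1/catalogs/kafka \
      -H "Content-Type: application/json" \
      -d '{
            "metadata": { "name": "my-catalog" },
            "spec": {
              "srInstance": {
                "connectionConfig": {
                  "schema.registry.url": "http://schemaregistry:8081"
                },
                "connectionSecretId": "sr-credentials"
              },
              "kafkaClusters": [
                {
                  "databaseName": "prod-cluster",
                  "connectionConfig": {
                    "bootstrap.servers": "kafka:9092"
                  },
                  "connectionSecretId": "kafka-credentials"
                }
              ]
            }
          }'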
What is a compute pool?¶
A compute pool defines the compute resources used to execute a SQL statement. Each statement must reference a compute pool, which acts as a template for the dedicated Flink cluster running the query.
- The currently supported type is DEDICATED, meaning each statement runs on its own dedicated Flink cluster in application mode.
- The compute pool configuration includes the Flink version, the image (which must be a confluentinc/cp-flink-sql image), and resource specifications for the JobManager and TaskManagers.
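Putting that together, a compute pool definition might look roughly like the following; the endpoint path, field names, and resource values are assumptions, with the cp-flink-sql image requirement being the fixed point from the docs:

    # Define a DEDICATED compute pool used as a template for SQL statements.
    # Endpoint, field names, and resource values are illustrative assumptions.
    curl -X POST http://localhost:8080/cmf/api/v1/environments/team-a/compute-pools \
      -H "Content-Type: application/json" \
      -d '{
            "metadata": { "name": "pool-1" },
            "spec": {
              "type": "DEDICATED",
              "clusterSpec": {
                "flinkVersion": "v1_19",
                "image": "confluentinc/cp-flink-sql:1.19-cp1",
                "jobManager": { "resource": { "memory": "1024m", "cpu": 1 } },
                "taskManager": { "resource": { "memory": "1024m", "cpu": 1 } }
              }
            }
          }'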
Can I configure logging and metrics for Flink applications?¶
Yes, you can configure both logging and metrics.
To configure Flink logging, you use a Log4j 2 configuration file, which is exposed through the application and environment APIs. For more information, see Logging with Confluent Manager for Apache Flink.
Flink metrics collection leverages Flink’s metrics system. For more information, see Collect Metrics for Confluent Manager for Apache Flink.
What are the risks associated with deploying Flink jobs via CMF?¶
Apache Flink is a framework for executing user code, which has some inherent risks. It is crucial to set up proper authentication/authorization (RBAC) and limit networked access. Unconfigured Flink clusters should never be deployed in an internet-facing environment.
What security features are offered with Confluent Platform for Apache Flink?¶
Confluent Manager for Apache Flink provides the following key security features:
- Authentication with Mutual TLS (mTLS) authentication and OAuth authentication. For more information, see Configure Authentication for Confluent Manager for Apache Flink.
- Authorization with Role-Based Access Control (RBAC) for fine-grained access control to CMF resources. For more information, see Configure Authorization for Confluent Manager for Apache Flink and How to Secure a Flink Job with Confluent Manager for Apache Flink.
- Encryption of secrets. For more information, see Data Encryption in Confluent Manager for Apache Flink.