Confluent Platform for Apache Flink Security

The following sections provide an overview of security risks and available security controls for Flink jobs deployed with Confluent Platform for Apache Flink®.

Risks

Apache Flink® is a framework for executing user code. As such, users who have permission to deploy Flink jobs can execute arbitrary code. It is therefore critical that you set up authentication and authorization and limit network access. Unconfigured Flink clusters should never be deployed in an Internet-facing environment.

TLS/SSL for authentication and encryption

Flink supports TLS/SSL for authentication and encryption of network traffic between Flink processes, both for internal connectivity and external connectivity.

You should follow the SSL guidelines in the Flink documentation.

Internal connectivity in Flink includes control messages between Flink processes and the data connections between TaskManagers. All internal connections can be configured to use TLS/SSL for authentication and encryption (as outlined in the Flink documentation referenced above). When configured, the connections use mutual authentication, meaning both the server and the client side of each connection must present the certificate to each other. When a dedicated CA is used exclusively to sign the internal certificate, the certificate effectively acts as a shared secret. No other party needs the internal certificate to interact with Flink, so it can simply be added to the container images.
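As an illustration of the internal TLS/SSL settings described above, the following Python sketch assembles the relevant Flink configuration options and renders them as `key: value` lines. The option names are the `security.ssl.internal.*` keys from the Flink documentation. The helper names, file paths, and password value are hypothetical placeholders, and the sketch assumes a single keystore file that also serves as the truststore.

```python
def internal_ssl_options(keystore_path, truststore_path, password):
    """Return Flink options that enable TLS/SSL for internal connections.

    Paths and password are supplied by the caller; in a real deployment the
    password should come from a secret store, not from source code.
    """
    return {
        "security.ssl.internal.enabled": "true",
        "security.ssl.internal.keystore": keystore_path,
        "security.ssl.internal.truststore": truststore_path,
        "security.ssl.internal.keystore-password": password,
        "security.ssl.internal.key-password": password,
        "security.ssl.internal.truststore-password": password,
    }


def render_flink_config(options):
    """Render options as 'key: value' lines, one option per line."""
    return "\n".join(f"{key}: {value}" for key, value in options.items())


# Hypothetical example values: the same file acts as keystore and truststore.
EXAMPLE_OPTIONS = internal_ssl_options(
    "/opt/flink/ssl/internal.keystore",
    "/opt/flink/ssl/internal.keystore",
    "example-password",
)
```

Calling `render_flink_config(EXAMPLE_OPTIONS)` produces text that could be merged into the Flink configuration file of an image that also ships the keystore.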

External connectivity in Flink happens through HTTP/REST endpoints, which are used, for example, by the web UI and the Flink CLI. These endpoints can be configured to require TLS/SSL connections. By default, however, the server accepts connections from any client, meaning the REST endpoint does not authenticate the client. If connections to the REST endpoints must be authenticated, simple mutual authentication can be enabled by configuration (as outlined in the Flink documentation mentioned previously). However, Confluent recommends that you deploy a “sidecar proxy”: bind the REST endpoint to the loopback interface (or the pod-local interface in Kubernetes) and start a REST proxy that authenticates requests and forwards them to Flink. Examples of proxies that Flink users have deployed are Envoy Proxy and NGINX with MOD_AUTH.
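To make the sidecar pattern described above concrete, here is a minimal Python sketch of an authenticating proxy. It is not a substitute for Envoy or NGINX. It assumes that Flink's REST endpoint is bound to the loopback interface (for example with `rest.bind-address: 127.0.0.1`) on the default port 8081, and that clients present a static bearer token. The token scheme, the names `AuthProxy`, `is_authorized`, and `serve`, and the listen port are all hypothetical. The sketch forwards only GET requests and listens over plain HTTP, so a real deployment would also need TLS termination and support for the other HTTP methods.

```python
import hmac
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

FLINK_REST = "http://127.0.0.1:8081"  # assumed loopback-bound Flink REST endpoint
EXPECTED_TOKEN = "replace-me"         # hypothetical shared token; load from a secret


def is_authorized(auth_header, expected_token):
    """Return True only for an 'Authorization: Bearer <token>' header that matches."""
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    presented = auth_header[len(prefix):]
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


class AuthProxy(BaseHTTPRequestHandler):
    """Authenticate each GET request, then forward it to Flink."""

    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization"), EXPECTED_TOKEN):
            self.send_error(401, "Unauthorized")
            return
        try:
            with urlopen(FLINK_REST + self.path, timeout=10) as upstream:
                status, body = upstream.status, upstream.read()
                content_type = upstream.headers.get("Content-Type", "application/json")
        except HTTPError as err:
            # Flink answered with an error status; relay it unchanged.
            status, body = err.code, err.read()
            content_type = err.headers.get("Content-Type", "application/json")
        except URLError:
            # Flink could not be reached at all.
            self.send_error(502, "Bad Gateway")
            return
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(listen_host="0.0.0.0", listen_port=8443):
    """Run the proxy until interrupted (blocks the calling thread)."""
    ThreadingHTTPServer((listen_host, listen_port), AuthProxy).serve_forever()
```

Calling `serve()` starts the proxy. Because Flink itself listens only on the loopback interface, the proxy is the only network-reachable path to the REST API, and unauthenticated requests never reach Flink.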

Kubernetes-level security controls

Confluent Platform for Apache Flink supports Flink deployments via Flink’s Kubernetes operator. You have tight control over who can deploy Flink applications via the security controls available in Kubernetes.

You must follow the best practices described in the Kubernetes documentation to help ensure that your system stays secure.
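As one example of such a Kubernetes control, the following Python sketch builds a namespaced Role that allows managing FlinkDeployment resources and a RoleBinding that grants it to a group. The RBAC object structure follows the Kubernetes `rbac.authorization.k8s.io/v1` API. The API group `flink.apache.org` and the resource name `flinkdeployments` are taken from the upstream Flink Kubernetes operator and should be checked against the custom resource definitions installed in your cluster. The role name, namespace, and group name are hypothetical.

```python
import json


def flink_deployer_role(namespace, role_name="flink-deployer"):
    """Build a Role that permits managing FlinkDeployment resources in one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                # Assumed CRD group and resource of the Flink Kubernetes operator.
                "apiGroups": ["flink.apache.org"],
                "resources": ["flinkdeployments"],
                "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
            }
        ],
    }


def bind_role_to_group(namespace, role_name, group):
    """Build a RoleBinding that grants the Role to every member of a group."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
        "subjects": [
            {"kind": "Group", "name": group, "apiGroup": "rbac.authorization.k8s.io"}
        ],
        "roleRef": {
            "kind": "Role",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


def to_manifest(obj):
    """Serialize a Kubernetes object as JSON, which kubectl accepts as a manifest."""
    return json.dumps(obj, indent=2)
```

Users outside the bound group receive no permissions on FlinkDeployment resources in that namespace unless another binding grants them, which is how the set of people able to deploy Flink jobs stays small.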

The Flink Kubernetes operator installs two custom roles: flink-operator and flink. The flink-operator role is used to manage FlinkDeployment resources, meaning it creates and manages the JobManager deployment for each Flink job (and related resources). The flink role is used by the JobManager process of each job to create and manage the TaskManager and ConfigMap resources.

More details on role-based access control (RBAC) with the Flink operator can be found in the Flink RBAC documentation.