How to Package a Flink Job for Confluent Manager for Apache Flink

This topic walks you through configuring your project for packaging with Confluent Manager for Apache Flink.

Prerequisites

Set up the project configuration

The Flink configuration overview provides instructions on how to configure your project. Follow the linked instructions to create a Flink project for Maven. For more on configuring your project with Maven, see Maven configuration.

To use the Flink dependencies that are provided as part of Confluent Platform for Apache Flink®, you need to add the Confluent Maven repository and change the Maven group ID and version for supported components in the POM file for your project.

  1. Add the Confluent Platform for Apache Flink Maven repository to the pom.xml file like shown in the following code:

    <repositories>
      <repository>
        <id>cp-flink-releases</id>
        <url>https://packages.confluent.io/maven</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>false</enabled>
        </snapshots>
      </repository>
    </repositories>
    
  2. Replace the Maven group IDs and versions.

    In the dependency section of the pom.xml file, change the group ID to io.confluent.flink and update the version for each supported component like the following code:

     <dependency>
      <groupId>io.confluent.flink</groupId>
      <artifactId>flink-streaming-java</artifactId>
      <version>1.19.1-cp1</version>
    </dependency>
    

After you have changed these settings, your project will use Confluent Platform for Apache Flink.

Deploy your Flink Application with CMF

After you have developed your Flink application locally, you should have produced a jar archive that includes your Flink applications. For more information, see Packaging your Application in the Flink documentation.

Following are some different packaging options.

Package with a custom docker image

Packaging a Flink job with a customer docker image that contains the Flink jar is recommended when you have the infrastructure in place for building docker images in a build pipeline. The custom docker image should be based on the confluentinc/cp-flink:1.19.1-cp1 image found on Docker Hub.

---
apiVersion: cmf.confluent.io/v1alpha1
kind: FlinkApplication
metadata:
name: basic-example
spec:
image: image-with-application:1.19.1-cp1
flinkVersion: v1_19
flinkConfiguration:
    taskmanager.numberOfTaskSlots: '1'
serviceAccount: flink
jobManager:
    resource:
    memory: 1048m
    cpu: 1
taskManager:
    resource:
    memory: 1048m
    cpu: 1
job:
    # Jar is already part of the image and is immediately referenceable
    jarURI: local:///opt/flink/Application.jar
    state: running
    parallelism: 3
    upgradeMode: stateless

Package with a side-car container

In this option you package with a side-car container that downloads the application jar before starting the Flink container.

Use this approach when the application jars are located in remote storage, like an Amazon Web Services S3 bucket.

For example:

---
apiVersion: cmf.confluent.io/v1alpha1
kind: FlinkApplication
metadata:
name: basic-example
spec:
image: confluentinc/cp-flink:1.19.1-cp1
flinkVersion: v1_19
flinkConfiguration:
    taskmanager.numberOfTaskSlots: '1'
serviceAccount: flink
podTemplate:
    spec:
    containers:
        - name: flink-main-container
        volumeMounts:
            - mountPath: /opt/flink/downloads
            name: downloads
    # Use shared volume to download the jar file
    volumes:
        - name: downloads
        emptyDir: { }
jobManager:
    resource:
    memory: 1048m
    cpu: 1
    podTemplate:
    spec:
        initContainers:
        # Sample init container for fetching remote artifacts
        - name: busybox
            image: busybox:1.35.0
            volumeMounts:
            - mountPath: /opt/flink/downloads
                name: downloads
            command:
            - /bin/sh
            - -c
            - "wget -O /opt/flink/downloads/flink-examples-streaming.jar \
            https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.19.1/flink-examples-streaming_2.12-1.19.1.jar"
taskManager:
    resource:
    memory: 1048m
    cpu: 1
job:
    # Referencing the just downloaded jar
    jarURI: local:///opt/flink/downloads/flink-examples-streaming.jar
    entryClass: org.apache.flink.streaming.examples.statemachine.StateMachineExample
    state: running
    parallelism: 3
    upgradeMode: stateless

Submit the application definition

To run your Flink application with CMF you need to make the jar available to the Flink clusters. After you package the application, you can use the CLI to submit your Flink application definition.

For example:

confluent flink application create --environment <env-name> <appliction-definition>