JDBC Connector (Source and Sink) for Confluent Platform

You can use the Kafka Connect JDBC Source connector to import data from any relational database with a JDBC driver into Apache Kafka® topics. You can use the JDBC Sink connector to export data from Kafka topics to any relational database with a JDBC driver. The JDBC connector supports a wide variety of databases without requiring custom code for each one.

Install the JDBC Connector

You can install this connector by using the Confluent Hub client installation instructions, or you can manually download the ZIP file.

Install the connector using Confluent Hub

Prerequisite
Confluent Hub Client must be installed. This is installed by default with Confluent Enterprise.

Navigate to your Confluent Platform installation directory and run the following command to install the latest (latest) connector version. The connector must be installed on every machine where Connect will run.

confluent-hub install confluentinc/kafka-connect-jdbc:latest

You can install a specific version by replacing latest with a version number. For example:

confluent-hub install confluentinc/kafka-connect-jdbc:10.0.0

Note

If you are running a multi-node Connect cluster, the JDBC connector and JDBC driver JARs must be installed on every Connect worker in the cluster. See below for details.

Install the connector manually

Download and extract the ZIP file for your connector and then follow the manual connector installation instructions.

License

This connector is available under the Confluent Community License.

Configuration Properties

For a complete list of configuration properties for the source connector, see JDBC Connector Source Connector Configuration Properties.

For a complete list of configuration properties for the sink connector, see JDBC Sink Connector Configuration Properties.
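As an illustration, a minimal source connector configuration might look like the following sketch. The connector name, database URL, credentials, incrementing column, and topic prefix are all placeholders to adapt to your environment:

```properties
name=jdbc-source-example
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
# Placeholder PostgreSQL database and credentials
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=postgres
connection.password=secret
# Detect new rows using a strictly increasing id column
mode=incrementing
incrementing.column.name=id
# Topics are named <prefix><table>, e.g. postgres-orders
topic.prefix=postgres-
```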

Note

For an example of how to get Kafka Connect connected to Confluent Cloud, see Distributed Cluster.

Installing JDBC Drivers

The JDBC Source and Sink connectors use the Java Database Connectivity (JDBC) API, which enables applications to connect to and use a wide range of database systems. For this to work, the connectors must have a JDBC driver for each database system you will use.

The connector comes with JDBC drivers for a few database systems, but before you use the connector with other database systems, you must install the most recent JDBC 4.0 drivers for those database systems. Although the details vary for each JDBC driver, the basic steps are:

  1. Find the JDBC 4.0 driver JAR file for each database system that will be used.
  2. Place these JAR files into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes.
  3. Restart all of the Connect worker nodes.
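As a sketch, the first two steps amount to copying the driver JAR into the connector's plugin directory on each worker. The paths and driver file name below are placeholders, and a scratch directory stands in for a real Confluent Platform installation so the commands can be run anywhere:

```shell
# Scratch directory standing in for a Confluent Platform installation;
# in practice the target is share/java/kafka-connect-jdbc under your
# real installation directory.
CONFLUENT_HOME="$(mktemp -d)"
PLUGIN_DIR="$CONFLUENT_HOME/share/java/kafka-connect-jdbc"
mkdir -p "$PLUGIN_DIR"

# 1. The JDBC driver JAR you downloaded (placeholder file for illustration).
DRIVER_JAR="$CONFLUENT_HOME/postgresql-42.7.3.jar"
touch "$DRIVER_JAR"

# 2. Copy the JAR into the plugin directory on each Connect worker node.
cp "$DRIVER_JAR" "$PLUGIN_DIR/"
ls "$PLUGIN_DIR"    # postgresql-42.7.3.jar

# 3. Restart each Connect worker so the driver is picked up (not shown;
#    the command depends on how Connect is run, e.g. systemd or the CLI).
```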

The rest of this section outlines the specific steps for more common database management systems.

General Guidelines

The following are additional guidelines to consider:

  • Use the most recent version of the JDBC 4.0 driver available. The latest version of a JDBC driver supports most versions of the database management system, and includes more bug fixes.
  • Use the correct JAR file for the Java version used to run Connect workers. Some JDBC drivers have a single JAR that works on multiple Java versions. Other drivers have one JAR for Java 8 and a different JAR for Java 10 or 11. Make sure to use the correct JAR file for the Java version in use. If you install and try to use the JDBC driver JAR file for the wrong version of Java, starting any JDBC Source connector or JDBC Sink connector will likely fail with UnsupportedClassVersionError. If this happens, remove the JDBC driver JAR file you installed and repeat the driver installation process with the correct JAR file.
  • The share/java/kafka-connect-jdbc directory mentioned above is for Confluent Platform. If you are using a different installation, find the location where the Confluent JDBC Source and Sink connector JAR files are located, and place the JDBC driver JAR file(s) for the target databases into the same directory.
  • If the JDBC driver specific to the database management system is not installed correctly, the JDBC Source or Sink connector will fail on startup. Typically, the system throws the error No suitable driver found. If this happens, install the JDBC driver again by following the instructions.

Microsoft SQL Server

The JDBC Source and Sink connectors include the open source jTDS JDBC driver and the open source Microsoft JDBC driver to read from and write to Microsoft SQL Server. Because the JDBC 4.0 driver is included, no additional steps are necessary before running a connector to Microsoft SQL Server.

If for some reason you need to install the Microsoft JDBC driver manually, do the following:

  1. Download the latest version of the JDBC driver archive (for example, sqljdbc_7.2.2.0_enu.tar.gz for English).
  2. Extract the contents of the file to a temporary directory, and find the correct JAR file for your version of Java. For example, if downloading the 7.2.2.0 version of the driver, find either of the following:
    • mssql-jdbc-7.2.2.jre8.jar if running Connect on Java 8.
    • mssql-jdbc-7.2.2.jre11.jar if running Connect on Java 11.
  3. Perform the following steps on each of the Connect worker nodes before deploying a JDBC Source or Sink connector:
    1. Remove the existing share/java/kafka-connect-jdbc/jtds-1.3.1.jar file from the Confluent Platform installation.
    2. Install the JAR file into the share/java/kafka-connect-jdbc/ directory in the Confluent Platform installation.
    3. Restart the Connect worker.

If you install the JDBC driver JAR file for the wrong version of Java and try to start a JDBC Source connector or JDBC Sink connector that uses a SQL Server database, the connector will likely fail with an UnsupportedClassVersionError. If this happens, remove the JDBC driver JAR file and repeat the driver installation process with the correct JAR file.

Kerberos Authentication

The JDBC source and sink connectors can authenticate with SQL Server using Kerberos, which must be installed and configured on each Connect worker where the JDBC source or sink connectors will run. For more information about configuring Kerberos with SQL Server, see the Microsoft documentation.

Once you’ve set up your Kerberos principal in your KDC, you can add the following properties to the new JDBC source or sink connector configuration:

connection.authenticationScheme=JavaKerberos
connection.integratedSecurity=true
connection.userName=user@REALM
connection.password=*****

The value of the connection.userName property does not require REALM if the user belongs to the default_realm set in the krb5.conf file. When these properties are used with the connector, the connection to SQL Server is established with the supplied Kerberos Principal and Password.
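For example, a sink connector configuration might combine these properties with a SQL Server JDBC URL as in the following sketch; the host, database name, principal, and topic name are placeholders:

```properties
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
# Placeholder SQL Server host and database
connection.url=jdbc:sqlserver://sqlserver.example.com:1433;databaseName=mydb
# Kerberos authentication properties from above
connection.authenticationScheme=JavaKerberos
connection.integratedSecurity=true
connection.userName=user@EXAMPLE.COM
connection.password=*****
# Placeholder topic to export
topics=orders
```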

Note

The JDBC source and sink’s connection.user configuration property is not used with Kerberos. The connection.userName property is used instead.

Kerberos Authentication Exception

If the Kafka cluster is secured with Kerberos (SASL_SSL) and the connector is accessing a Microsoft SQL Server database configured with integrated Kerberos authentication, the following exception occurs:

connect-distributed: Caused by: javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication

To resolve this exception, you must change a system property before loading the Connect properties file. Enter the following:

export KAFKA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false"

bin/connect-distributed etc/kafka/connect-distributed.properties

For more information about this system property, see the Oracle documentation.

PostgreSQL Database

The JDBC Source and Sink connectors include the open source PostgreSQL JDBC 4.0 driver to read from and write to a PostgreSQL database server. Because the JDBC 4.0 driver is included, no additional steps are necessary before running a connector to PostgreSQL databases.

Oracle Database

Oracle provides a number of JDBC drivers for Oracle Database. Find the latest version and download either ojdbc8.jar (if running Connect on Java 8) or ojdbc10.jar (if running Connect on Java 11). Then, place this one JAR file into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation and restart all of the Connect worker nodes.

If you download a tar.gz file with the JDBC driver and companion JARs, extract the contents of the tar.gz file to a temporary directory, and use the readme file to determine which JAR files are required. Copy the JDBC driver JAR file and any required companion JAR files into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes, and then restart all of the Connect worker nodes.

If you install the JDBC driver JAR file for the wrong version of Java and try to start a JDBC Source connector or JDBC sink connector that uses an Oracle database, the connector will likely fail with an UnsupportedClassVersionError. If this happens, remove the JDBC driver JAR file and repeat the driver installation process with the correct JAR file.

Kerberos Authentication

The JDBC source and sink connectors are able to authenticate with Oracle using Kerberos, which must be installed and configured on each Connect worker where the JDBC source or sink connectors will run. For more information about configuring Kerberos with Oracle, see the Oracle documentation.

Once you’ve set up your Kerberos principal in your KDC, you can run okinit to request an initial ticket on each Connect worker:

okinit <username>

Enter your credentials.

Locate the ticket cache (the location defaults to /tmp/krb5cc_<uid>). Use the command below to list the ticket(s) and show the location of the ticket cache.

oklist -f

Now you can add the following properties to the new JDBC source or sink connector configuration:

connection.oracle.net.authentication_services=(KERBEROS5)
connection.oracle.net.kerberos5_mutual_authentication=true
connection.oracle.net.kerberos5_cc_name=/tmp/krb5cc_5088

The last property value specifies the location of the ticket cache on each machine. These properties instruct the Oracle JDBC driver to authenticate with the database via Kerberos.
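For example, these properties might appear in a source connector configuration alongside an Oracle JDBC URL as in the following sketch; the host, service name, and ticket cache path are placeholders:

```properties
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
# Placeholder Oracle host and service name
connection.url=jdbc:oracle:thin:@//oracle.example.com:1521/ORCLPDB1
# Kerberos authentication properties from above
connection.oracle.net.authentication_services=(KERBEROS5)
connection.oracle.net.kerberos5_mutual_authentication=true
connection.oracle.net.kerberos5_cc_name=/tmp/krb5cc_5088
```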

Note

The following JDBC source and sink connector configuration properties are not required when using Kerberos:

  • connection.user
  • connection.password

See the Oracle documentation for troubleshooting.

IBM DB2

IBM provides a number of JDBC drivers for DB2 that depend on the version of DB2. In general, pick the most recent JDBC 4.0 driver, and choose one of the download options. Extract and find the db2jcc4.jar file within the downloaded tar.gz file, and place only the db2jcc4.jar file into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation.

For example, if you downloaded a compressed tar.gz file (e.g., v10.5fp10_jdbc_sqlj.tar.gz), perform the following steps:

  1. Extract the contents of the tar.gz file into a temporary directory.
  2. Find the ZIP file (e.g., db2_db2driver_for_jdbc_sqlj) in the extracted files.
  3. Extract the contents of the ZIP file to a different temporary directory.
  4. Find the db2jcc4.jar file and copy it into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes, and then restart all of the Connect worker nodes.
  5. Remove the two temporary directories.

Note

Do not place any other files from the IBM download into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation. UPSERT for DB2 running on AS/400 is not currently supported with the Confluent JDBC Connector.
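Once the driver is installed, a connector can reference DB2 with a standard JDBC URL; for example (the host, port, and database name below are placeholders):

```properties
connection.url=jdbc:db2://db2.example.com:50000/SAMPLE
```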

MySQL Server

MySQL provides the Connector/J JDBC driver for MySQL for a number of platforms. Choose the Platform Independent option, and download the Compressed TAR Archive. This file contains both the JAR file and the source code.

Extract the contents of this tar.gz file to a temporary directory. One of the extracted files is the driver JAR file (for example, mysql-connector-java-8.0.16.jar). Copy only this JAR file into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes, and then restart all of the Connect worker nodes.
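With the driver in place, a connector can reference MySQL with a JDBC URL such as the following (the host and database name are placeholders):

```properties
connection.url=jdbc:mysql://mysql.example.com:3306/mydb
```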

SAP HANA

SAP provides the SAP HANA JDBC Driver and makes it available on Maven Central. Download the latest version of the JAR file (for example, ngdbc-2.4.56.jar) and place it into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes. Then, restart all of the Connect worker nodes.
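Once the driver is installed, a connector can reference SAP HANA with a JDBC URL such as the following (the host and port are placeholders; the port depends on your HANA instance number):

```properties
connection.url=jdbc:sap://hana.example.com:39015/
```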

SQLite Embedded Database

The JDBC Source and Sink connectors include the open source SQLite JDBC 4.0 driver to read from and write to a local SQLite database. Using a local SQLite embedded database is primarily for development testing.
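For example, a development configuration can point at a local database file with a URL such as the following (the file name is illustrative):

```properties
connection.url=jdbc:sqlite:test.db
```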

Other Databases

Find the JDBC 4.0 driver JAR file(s) for other databases, and place only the required JAR file(s) into the share/java/kafka-connect-jdbc directory in your Confluent Platform installation on each of the Connect worker nodes, and then restart all of the Connect worker nodes.