Important
You are viewing documentation for an older version of Confluent Platform.
Kudu Connector (Source and Sink) for Confluent Platform¶
You can use the Kafka Connect Kudu source connector to import data from Kudu, a columnar relational database, into Apache Kafka® topics by way of the Impala JDBC driver. You can use the Kudu sink connector to export data from Kafka topics to Kudu, also through the Impala JDBC driver.
Install the Kudu Connector¶
You can install this connector by using the Confluent Hub client (recommended) or you can manually download the ZIP file.
If you are running a multi-node Connect cluster, the Kudu connector and Impala JDBC driver JARs must be installed on every Connect worker in the cluster. See below for details.
Install the connector using Confluent Hub¶
- Prerequisite: Confluent Hub Client must be installed. It is installed by default with Confluent Enterprise.
Navigate to your Confluent Platform installation directory and run the following command to install the most recent connector version, identified by the tag latest. The connector must be installed on every machine where Connect will run.
confluent-hub install confluentinc/kafka-connect-kudu:latest
You can install a specific version by replacing latest with a version number. For example:
confluent-hub install confluentinc/kafka-connect-kudu:1.0.0-preview
Install the connector manually¶
Download and extract the ZIP file for your connector and then follow the manual connector installation instructions.
License¶
You can use this connector for a 30-day trial period without a license key.
After 30 days, this connector is available under a Confluent enterprise license. Confluent issues enterprise license keys to subscribers, along with providing enterprise-level support for Confluent Platform and your connectors. If you are a subscriber, please contact Confluent Support at support@confluent.io for more information.
See Confluent Platform license for license properties and License topic configuration for information about the license topic. License requirements are the same for both the sink and source connector.
Configuration Properties¶
For a complete list of configuration properties for the source connector, see Kudu Source Connector Configuration Properties.
For a complete list of configuration properties for the sink connector, see Kudu Sink Connector Configuration Properties.
Note
For an example of how to get Kafka Connect connected to Confluent Cloud, see Distributed Cluster in Connect Kafka Connect to Confluent Cloud.
Installing Impala JDBC Driver¶
The Kudu source and sink connectors use the Java Database Connectivity (JDBC) API. For this to work, the connectors must query the Kudu database through Impala, so the Impala JDBC driver must be installed.
The basic steps of installation are:
- Download the Impala JDBC Connector and unzip it to get the JAR files.
- Place these JAR files into the share/confluent-hub-components/confluentinc-kafka-connect-kudu/lib directory in your Confluent Platform installation on each of the Connect worker nodes.
- Restart all of the Connect worker nodes.
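The copy step above could be scripted roughly as follows. This is a minimal sketch, not part of the connector: the function name `install_impala_driver` and the variables `DRIVER_DIR` (the directory holding the unzipped driver JARs) and `CONFLUENT_HOME` (the Confluent Platform installation directory) are hypothetical names chosen for illustration. Restarting the workers is left out because it depends on how Connect is managed in your environment.

```shell
#!/usr/bin/env bash
# Sketch: copy unzipped Impala JDBC driver JARs into the Kudu connector's lib directory.
# install_impala_driver, DRIVER_DIR and CONFLUENT_HOME are hypothetical names.
install_impala_driver() {
  local driver_dir="$1"
  local confluent_home="$2"
  local lib_dir="$confluent_home/share/confluent-hub-components/confluentinc-kafka-connect-kudu/lib"

  # The connector must already be installed, so its lib directory should exist.
  if [ ! -d "$lib_dir" ]; then
    echo "connector lib directory not found: $lib_dir" >&2
    return 1
  fi

  # Copy every JAR file; report an error if the directory holds none.
  local found=0
  local jar
  for jar in "$driver_dir"/*.jar; do
    [ -e "$jar" ] || continue
    cp "$jar" "$lib_dir/"
    found=1
  done
  if [ "$found" -eq 0 ]; then
    echo "no JAR files found in: $driver_dir" >&2
    return 1
  fi
  echo "driver JARs copied to $lib_dir"
}

# Example (run on every Connect worker node, then restart the worker):
# install_impala_driver "$DRIVER_DIR" "$CONFLUENT_HOME"
```

The function only copies files ending in .jar and stops with an error when the connector directory is missing, which usually means the connector itself has not been installed on that worker yet.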
General Guidelines¶
The following are additional guidelines to consider:
- Use the most recent version of the Impala JDBC driver available.
- Use the correct JAR file for the Java version used to run Connect workers. If you install and try to use the Impala JDBC driver JAR file for the wrong version of Java, starting any Kudu source connector or Kudu sink connector will likely fail with UnsupportedClassVersionError. If this happens, remove the Impala JDBC driver JAR file you installed and repeat the driver installation process with the correct JAR file.
- The share/confluent-hub-components/confluentinc-kafka-connect-kudu/lib directory mentioned above is for Confluent Platform. If you are using a different installation, find the location where the Confluent Kudu source and sink connector JAR files are located, and place the Impala JDBC driver JAR file(s) into the same directory.
- If the Impala JDBC driver is not installed correctly, the Kudu source or sink connector will fail on startup. Typically, the system throws the error No suitable driver found. If this happens, install the Impala JDBC driver again.
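An UnsupportedClassVersionError generally means a JAR was compiled for a newer Java release than the one running the worker. The sketch below shows one way to check this before restarting Connect. It is not a Confluent tool: the function names are hypothetical, and `jar_class_major` assumes the Info-ZIP `unzip` utility and `od` are available on the worker. The arithmetic relies on the class-file format, where the major version is 44 higher than the Java release (for example, 52 for Java 8 and 55 for Java 11).

```shell
#!/usr/bin/env bash
# Sketch: compare the Java release a driver JAR was compiled for with the
# Java release that runs the Connect workers. All function names are hypothetical.

# Convert a class-file major version to a Java release number.
# Valid for major versions 49 and above (52 is Java 8, 55 is Java 11).
java_release_for_major() {
  echo $(( $1 - 44 ))
}

# Print the class-file major version of the first .class entry in a JAR.
# Assumes the Info-ZIP unzip tool and od are installed.
jar_class_major() {
  local jar="$1"
  local entry
  entry=$(unzip -Z1 "$jar" | grep '\.class$' | head -n 1)
  if [ -z "$entry" ]; then
    echo "no class files found in: $jar" >&2
    return 1
  fi
  # Zero-based bytes 6 and 7 of a class file hold the big-endian major version.
  unzip -p "$jar" "$entry" | od -An -tu1 -j6 -N2 | awk '{ print $1 * 256 + $2 }'
}

# Succeed when classes with major version $1 can load on Java release $2.
class_loads_on_release() {
  [ "$(java_release_for_major "$1")" -le "$2" ]
}

# Example (hypothetical path; 8 stands for the Java release running Connect):
# class_loads_on_release "$(jar_class_major /path/to/driver.jar)" 8 || echo "wrong driver JAR"
```

For instance, `class_loads_on_release 55 8` fails, which mirrors a driver JAR built for Java 11 being loaded by workers that run Java 8.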
Limitations¶
- Kudu does not support DATE and TIME types. Connect Date, Time, and Timestamp types are all mapped to the Impala TIMESTAMP type, which corresponds to the Kudu unixtime_micros type.
- Impala does not support the BINARY type, so the connectors do not accept binary data either.
- Complex data types such as Array, Map, and Struct are not supported.
- For the Decimal type, both Impala and Kudu allow a precision of at most 38, and the connectors observe this limit.
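The limitations above can be summarized as a small lookup. The following is an illustrative sketch only, not the connector's implementation: `impala_type_for` is a hypothetical helper, and using Bytes as the Connect name for binary data is an assumption of this sketch. It takes a Connect type name and, for Decimal, a precision.

```shell
#!/usr/bin/env bash
# Sketch of the type rules listed above; impala_type_for is a hypothetical helper.
# Usage: impala_type_for <connect-type> [decimal-precision]
impala_type_for() {
  local connect_type="$1"
  local precision="${2:-0}"
  case "$connect_type" in
    Date|Time|Timestamp)
      # All three temporal types land in Impala TIMESTAMP (Kudu unixtime_micros).
      echo "TIMESTAMP"
      ;;
    Decimal)
      # Impala and Kudu both cap decimal precision at 38.
      if [ "$precision" -gt 38 ]; then
        echo "Decimal precision $precision exceeds 38" >&2
        return 1
      fi
      echo "DECIMAL"
      ;;
    Bytes|Array|Map|Struct)
      # Binary data and complex types are rejected.
      echo "unsupported Connect type: $connect_type" >&2
      return 1
      ;;
    *)
      echo "type not covered by this sketch: $connect_type" >&2
      return 2
      ;;
  esac
}
```

Calling `impala_type_for Date` prints TIMESTAMP, while `impala_type_for Struct` and `impala_type_for Decimal 39` both return a non-zero status.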