MapR DB Sink Connector for Confluent Platform¶
The Kafka Connect MapR DB Sink connector provides a way to export data from an Apache Kafka® topic and write data to a MapR DB cluster.
Features¶
The MapR DB Sink connector for Confluent Platform includes the following features:
At least once delivery¶
This connector guarantees that records from the Kafka topic are delivered at least once.
Dead Letter Queue¶
This connector supports the Dead Letter Queue (DLQ) functionality. For information about accessing and using the DLQ, see Confluent Platform Dead Letter Queue.
Multiple tasks¶
The MapR DB Sink connector supports running one or more tasks. You can specify the number of tasks in the tasks.max configuration parameter. This can lead to performance gains when multiple files need to be parsed.
License¶
You can use this connector for a 30-day trial period without a license key.
After 30 days, you must purchase a connector subscription, which includes Confluent enterprise license keys along with enterprise-level support for Confluent Platform and your connectors. If you are a subscriber, you can contact Confluent Support at support@confluent.io for more information.
For license properties, see Confluent Platform license. For information about the license topic, see License topic configuration.
Configuration Properties¶
For a complete list of configuration properties for this connector, see Configuration Reference for MapR DB Sink Connector for Confluent Platform.
For an example of how to get Kafka Connect connected to Confluent Cloud, see Connect Self-Managed Kafka Connect to Confluent Cloud.
Install the MapR DB Connector¶
You can install this connector by using the confluent connect plugin install command, or by manually downloading the ZIP file.
Prerequisites¶
You must install the connector on every machine where Connect will run.
Kafka Broker: Confluent Platform 3.3.0 or later, or Kafka 0.11.0 or later.
Connect: Confluent Platform 4.1.0 or later, or Kafka 1.1.0 or later (requires header support in Connect).
MapR DB 5.x or higher (installed locally in /opt/mapr).
Java 1.8.
An installation of the MapR client. The MapR client must work properly on the host running the Connect worker process. The Kafka Connect worker process must be started with the following JVM options:
-Dmapr.home.dir=/opt/mapr -Dmapr.library.flatclass
You can do this by exporting the KAFKA_OPTS environment variable before starting Kafka Connect. For example:
export KAFKA_OPTS="-Dmapr.home.dir=/opt/mapr -Dmapr.library.flatclass"
An installation of the latest (latest) connector version.
To install the latest connector version, navigate to your Confluent Platform installation directory and run the following command:
confluent connect plugin install confluentinc/kafka-connect-maprdb:latest
You can install a specific version by replacing latest with a version number, as shown in the following example:
confluent connect plugin install confluentinc/kafka-connect-maprdb:1.1.1
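The KAFKA_OPTS prerequisite above is easy to overlook when a worker is started from a new shell or service unit. The following Python sketch checks whether a KAFKA_OPTS string contains both JVM options named in this section. The function name missing_mapr_flags is hypothetical, and the check only inspects the environment variable; it does not verify that the MapR client itself works.

```python
import os
import shlex

# The two JVM options this page says the Connect worker must be started with.
REQUIRED_FLAGS = ("-Dmapr.home.dir=/opt/mapr", "-Dmapr.library.flatclass")


def missing_mapr_flags(kafka_opts):
    """Return the required JVM options that are absent from a KAFKA_OPTS string."""
    present = set(shlex.split(kafka_opts or ""))
    return [flag for flag in REQUIRED_FLAGS if flag not in present]


if __name__ == "__main__":
    missing = missing_mapr_flags(os.environ.get("KAFKA_OPTS", ""))
    print("missing flags:", missing or "none")
```

Running the script in the same shell that will start Kafka Connect gives a quick indication of whether the export shown above took effect.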
Install the connector manually¶
Download and extract the ZIP file for your connector and then follow the manual connector installation instructions.
Mapping topics to tables¶
The connector uses the topic-to-table mapping property mapr.table.map.<topic>=<table-path> to write data from a Kafka topic to a MapR DB table.
Example:
mapr.table.map.twitter=/apps/twitter
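When several topics are routed to different tables, the mapping properties can be generated rather than typed by hand. The following Python sketch builds one mapr.table.map.<topic> entry per topic. The helper names and the "orders" topic and table path are hypothetical, introduced only for illustration.

```python
def table_map_properties(topic_to_table):
    """Return connector properties mapping each Kafka topic to a MapR DB table path."""
    props = {}
    for topic, table_path in topic_to_table.items():
        if not topic or not table_path:
            raise ValueError("topic and table path must both be non-empty")
        props[f"mapr.table.map.{topic}"] = table_path
    return props


def to_properties_text(props):
    """Render properties as key=value lines, sorted for stable output."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))


if __name__ == "__main__":
    # "orders" and "/apps/orders" are made-up values for illustration.
    mapping = {"twitter": "/apps/twitter", "orders": "/apps/orders"}
    print(to_properties_text(table_map_properties(mapping)))
```

The rendered lines can be pasted into a connector properties file, or the dictionary can be merged into a JSON configuration.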
The connector also uses the key written to Kafka as the key written to MapR DB. If you need to use a different key, you can use the ExtractField Single Message Transform (SMT) to extract a different key field. The following example uses the value of the Id field as the key.
transforms=extractkey
transforms.extractkey.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.extractkey.field=Id
Note
MapR DB supports only String and Byte keys. Any other key type causes an exception.
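To make the effect of the transform concrete, the following Python sketch models what ExtractField$Key does to a record key: it replaces a structured key with the value of one named field. This is a conceptual model only, not the connector's or Kafka Connect's code. The structured key is represented as a dict, the sample field values are invented, and the exception type raised by the real connector for unsupported keys is not specified here.

```python
def extract_key_field(record_key, field):
    """Model of ExtractField$Key: replace a structured key with one of its fields."""
    if not isinstance(record_key, dict):
        raise TypeError("key must be a structured value (modeled here as a dict)")
    return record_key[field]


def check_supported_key(key):
    """Reject key types other than string or bytes, following the note above."""
    if not isinstance(key, (str, bytes)):
        raise TypeError(f"unsupported key type: {type(key).__name__}")
    return key


if __name__ == "__main__":
    # Hypothetical structured key; only "Id" is used as the table key.
    original_key = {"Id": "12345", "Lang": "en"}
    print(check_supported_key(extract_key_field(original_key, "Id")))
```

In this model a numeric Id would be rejected by check_supported_key, which mirrors the restriction that only String and Byte keys are accepted.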
Examples¶
Property-based example¶
This configuration is typically used with standalone workers.
name=MapRDBSinkConnector1
connector.class=io.confluent.connect.mapr.db.MapRDBSinkConnector
tasks.max=1
mapr.table.map.<topic>=<table-path>
REST-based example¶
This configuration is typically used with distributed workers. Write the following JSON to connector.json, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. For more information, see the Kafka Connect REST API documentation.
Connect distributed REST example¶
{
  "name" : "MapRDBSinkConnector1",
  "config" : {
    "connector.class" : "io.confluent.connect.mapr.db.MapRDBSinkConnector",
    "tasks.max" : "1",
    "mapr.table.map.<topic>" : "<table-path>"
  }
}
Use curl to post the configuration to one of the Kafka Connect workers. Change http://localhost:8083/ to the endpoint of one of your Kafka Connect workers.
Create a new connector¶
To create a new connector, run the following command:
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
Update an existing connector¶
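To update an existing connector, run the following command. The PUT /connectors/<name>/config endpoint expects the flat configuration map (the contents of config) as the request body, so the file passed with --data must contain only that map: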
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/MapRDBSinkConnector1/config
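The same two REST calls can be made from a script. The following Python sketch uses only the standard library to build the create and update requests. It assumes the Kafka Connect REST API conventions that POST /connectors takes a body with name and config fields, and that PUT /connectors/<name>/config takes the flat configuration map. The function names are hypothetical, and the twitter mapping is the example value from earlier on this page.

```python
import json
import urllib.request


def build_create_request(base_url, name, config):
    """Build POST /connectors with a body of the form {"name": ..., "config": {...}}."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connectors",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_update_request(base_url, name, config):
    """Build PUT /connectors/<name>/config with the flat config map as the body."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connectors/{name}/config",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request to a running worker and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


CONFIG = {
    "connector.class": "io.confluent.connect.mapr.db.MapRDBSinkConnector",
    "tasks.max": "1",
    "mapr.table.map.twitter": "/apps/twitter",
}

# Example (requires a reachable Connect worker):
# send(build_create_request("http://localhost:8083/", "MapRDBSinkConnector1", CONFIG))
```

Separating request construction from sending keeps the payload shapes easy to inspect before anything is posted to a worker.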