Kafka Connect FileStream Connectors
The Kafka Connect FileStream connectors are the FileStream Source and Sink connectors bundled with Kafka Connect in both Apache Kafka® and Confluent Platform. The FileStream connector examples are intended to show users who are getting started with Kafka how a simple connector runs.
Important
The FileStream Sink and Source connector artifacts have been moved out of Kafka Connect. To run the FileStream connectors, you must add the new path in the plugin.path configuration property, as shown in the following example:

plugin.path=/usr/local/share/kafka/plugins,/usr/share/filestream-connectors
Use the FileStream connectors for demonstration and learning purposes only. Confluent does not recommend the FileStream connectors for production use. If you want a production connector to read from files, use a Spool Dir connector.
Install the FileStream connector
The installation process depends on whether you use Confluent Platform or build the connectors manually from Apache Kafka.
Confluent Platform (Recommended)
If you have Confluent Platform installed, the FileStream connectors are included by default.
Locate the artifacts: The FileStream artifacts are located at /usr/share/filestream-connectors.

Update the plugin path: Include the FileStream path in the plugin.path property of your Kafka Connect worker configuration file. For example:

plugin.path=$CONFLUENT_HOME/usr/share/filestream-connectors
Example configuration files are available in the following locations:
$CONFLUENT_HOME/etc/kafka/connect-file-source.properties
$CONFLUENT_HOME/etc/kafka/connect-file-sink.properties
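To see both example files in action, you can start a standalone Kafka Connect worker with them. The following sketch assumes a default Confluent Platform archive install where the connect-standalone script and a connect-standalone.properties worker configuration are available under $CONFLUENT_HOME:

# A minimal local run of both example connectors
# (paths are assumptions for a default Confluent Platform install)
$CONFLUENT_HOME/bin/connect-standalone \
  $CONFLUENT_HOME/etc/kafka/connect-standalone.properties \
  $CONFLUENT_HOME/etc/kafka/connect-file-source.properties \
  $CONFLUENT_HOME/etc/kafka/connect-file-sink.properties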
Kafka
If you use a standalone Kafka distribution, the connectors are included in the core connect/file module.
Download: Obtain the distribution from the Apache Kafka downloads page.
Build from source: To build a specific version or inspect the code, clone the repository and build the connect/file module with Gradle:

git clone https://github.com/apache/kafka.git
cd kafka
./gradlew connect:file:jar
Install: After the build completes, move the resulting JAR file to a directory on the Kafka Connect plugin.path.
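For example, with Gradle's default layout the built JAR lands under connect/file/build/libs. The destination directory below is an assumption; use whatever directory your plugin.path lists:

# Copy the built connector JAR onto the plugin path
# (the version suffix in the JAR name varies with the Kafka version you built)
mkdir -p /usr/local/share/kafka/plugins
cp connect/file/build/libs/connect-file-*.jar /usr/local/share/kafka/plugins/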
Start the connector
Once your Kafka Connect worker is configured and running, you can create connector instances using one of the following methods:
Confluent Control Center: Add the connector using the Confluent Control Center. For more information, see Add a connector.
Kafka Connect REST API: Send a POST request to the /connectors endpoint, specifying the connector class name (see the example request after this list). For more information, see Configure self-managed connectors.

Confluent CLI: Use the Confluent CLI to start the connector. For more information, see Kafka Connect quick start guide.
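For illustration, the following request creates the file source from the quick start below through the REST API. The worker URL is an assumption for a local worker listening on the default port 8083:

# Create the connector instance over the Kafka Connect REST API
curl -X POST -H "Content-Type: application/json" \
  --data '{"name": "local-file-source", "config": {"connector.class": "FileStreamSource", "tasks.max": "1", "file": "/tmp/test.txt", "topic": "connect-test"}}' \
  http://localhost:8083/connectors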
Quick start
The following examples include both a file source and a file sink to demonstrate end-to-end data flow through Kafka Connect in a local environment. The developer guide also uses this connector to demonstrate how a custom connector can be implemented.
FileSource connector
The FileSource connector reads data from a file and sends it to Kafka. Beyond the configurations common to all connectors, it takes only an input file and an output topic as properties. Here is an example configuration:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/test.txt
topic=connect-test
The connector reads a single file and sends the data within that file to Kafka. It then watches the file for appended updates only; it does not reprocess modifications to lines already sent to Kafka.
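As a quick check, you can append a line to the watched file and read it back from the topic with the console consumer. The broker address is an assumption for a default local setup:

# Append a line; the connector picks it up and sends it to connect-test
echo "hello filestream" >> /tmp/test.txt

# Read the topic (assumes a local broker on localhost:9092)
kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic connect-test --from-beginning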
FileSink connector
The FileSink connector reads data from Kafka and writes it to a local file. Multiple topics may be specified, as with any other sink connector. The FileSink connector supports only the file property and the configurations common to all connectors. Here is an example configuration:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=/tmp/test.sink.txt
topics=connect-test
As messages are added to the topics specified in the configuration, the connector writes them to the local file named by the file property.
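Putting the two connectors together, you can verify the end-to-end flow with the commands below. The paths match the example configurations above and assume both connectors are running in the same worker:

# Write a line into the source file ...
echo "end to end test" >> /tmp/test.txt

# ... then, after a moment, it appears in the sink file
cat /tmp/test.sink.txt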