.. _cassandra-sink-connector:
Cassandra Sink Connector for |cp|
=================================
The |kconnect-long| Cassandra Sink Connector is a high-speed mechanism for writing data to `Apache Cassandra `__.
It is compatible with versions 2.1, 2.2, and 3.0 of Cassandra.
Install the Cassandra Connector
-------------------------------
.. include:: ../includes/connector-install.rst
.. codewithvars:: bash
confluent-hub install confluentinc/kafka-connect-cassandra:latest
.. include:: ../includes/connector-install-version.rst
.. codewithvars:: bash
confluent-hub install confluentinc/kafka-connect-cassandra:1.0.1
------------------------------
Install the connector manually
------------------------------
`Download and extract the ZIP file `__ for your connector and then follow the manual connector installation :ref:`instructions `.
License
-------
.. include:: ../includes/enterprise-license.rst
See :ref:`cassandra-sink-connector-license-config` for license properties and :ref:`cassandra-sink-license-topic-configuration` for information about the license topic.
Configuration Properties
------------------------
For a complete list of configuration properties for this connector, see :ref:`cassandra-sink-connector-config`.
Usage Notes
-----------
This connector uses the topic to determine the name of the table to write to. You can change this dynamically by using a
transform like `Regex Router `__ to change the topic name.
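For example, a hedged sketch of a ``RegexRouter`` transform that strips a topic prefix before the connector derives the table name. The transform alias and the topic pattern below are illustrative; note that in a Java properties file a literal backslash in the regex must be doubled:

```properties
# Illustrative only: route records from topics like "prod.users" to the "users" table.
transforms=dropPrefix
transforms.dropPrefix.type=org.apache.kafka.connect.transforms.RegexRouter
transforms.dropPrefix.regex=prod\\.(.*)
transforms.dropPrefix.replacement=$1
```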
.. include:: ../includes/connect-to-cloud-note.rst
-----------------
Schema Management
-----------------
You can configure this connector to manage the schema on the Cassandra cluster. When altering an existing table, the key
schema is ignored; this avoids the potential issues around changing the primary key of an existing table. The key schema is used to
generate a primary key for the table when it is created, and those fields must also be present in the value schema. Data
written to the table is always taken from the record value in |ak-tm|. This connector uses the topic to determine the name of
the table to write to. This can be changed on the fly by using a transform to change the topic name.
--------------------------
Time To Live (TTL) Support
--------------------------
This connector supports TTL (time to live), so written data can be automatically expired after a specified period.
The ``TTL`` value is the time to live for the data; after that amount of time elapses, the data is automatically deleted. For example, if the TTL value is set to 100 seconds, the data is deleted 100 seconds after it is written.
To use this feature, set the ``cassandra.ttl`` configuration property to the time (in seconds) for which you want to retain the data. If you do not set this property, records are inserted with the default TTL value of ``null``, meaning that written data will not expire.
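A minimal sketch of the relevant property, assuming you want written rows to expire after one hour (the value is illustrative; choose a retention period that fits your use case):

```properties
# Expire written rows 3600 seconds (one hour) after they are inserted.
cassandra.ttl=3600
```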
-----------------------
Offset Tracking Support
-----------------------
This connector supports two types of offset tracking.
**Offsets stored in a Cassandra table**
This is the default behavior of the connector: the offsets are stored in a Cassandra table.
**Offsets stored in Kafka**
If you want offsets to be managed in Kafka instead, set ``cassandra.offset.storage.table.enable=false``. By default, this property is ``true``, in which case offsets are stored in a Cassandra table.
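For example, a minimal connector configuration fragment that switches offset tracking to Kafka (only the property named in the text above is shown; the rest of the connector configuration is unchanged):

```properties
# Illustrative: track offsets in Kafka instead of the default Cassandra table.
cassandra.offset.storage.table.enable=false
```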
---------------
Troubleshooting
---------------
If you encounter error messages like this:
.. codewithvars:: bash
Batch for [test.twitter] is of size 127.661KiB, exceeding specified threshold of 50.000KiB by 77.661KiB
Or warning messages like this:
.. codewithvars:: bash
Batch for [test.twitter] is of size 25.885KiB, exceeding specified threshold of 5.000KiB by 20.885KiB
Try adjusting the ``consumer.max.poll.records`` setting in the ``worker.properties`` file for |kconnect-long|.
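Lowering the number of records fetched per poll reduces the size of each batch the connector writes. A sketch of the relevant worker setting (the value below is illustrative; tune it against your batch size threshold and workload):

```properties
# In the Connect worker.properties: fetch fewer records per poll so
# each batch written to Cassandra stays under the configured size threshold.
consumer.max.poll.records=100
```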
.. _cassandra-sink-connector-examples:
Examples
--------
-----------
Upsert mode
-----------
This example configures the connector to use upserts when writing data to Cassandra.
Select one of the following configuration methods based on how you have deployed |kconnect-long|.
Distributed Mode uses the JSON / REST examples. Standalone Mode uses the properties-based
example.
**Distributed Mode JSON**
.. literalinclude:: CassandraSinkConnector.updatemode.example.json
:language: JSON
**Standalone Mode Properties**
.. literalinclude:: CassandraSinkConnector.updatemode.example.properties
:language: properties
--------
Standard
--------
This example connects to an Apache Cassandra instance without authentication.
Select one of the following configuration methods based on how you have deployed |kconnect-long|.
Distributed Mode uses the JSON / REST examples. Standalone Mode uses the properties-based
example.
**Distributed Mode JSON**
.. literalinclude:: CassandraSinkConnector.standard.example.json
:language: JSON
**Standalone Mode Properties**
.. literalinclude:: CassandraSinkConnector.standard.example.properties
:language: properties
----------------------
SSL and Authentication
----------------------
This example connects to an Apache Cassandra instance with SSL and username / password authentication.
Select one of the following configuration methods based on how you have deployed |kconnect-long|.
Distributed Mode uses the JSON / REST examples. Standalone Mode uses the properties-based
example.
**Distributed Mode JSON**
.. literalinclude:: CassandraSinkConnector.authenicated.example.json
:language: JSON
**Standalone Mode Properties**
.. literalinclude:: CassandraSinkConnector.authenicated.example.properties
:language: properties
Additional Documentation
------------------------
.. toctree::
:maxdepth: 1
cassandra_sink_connector_config
changelog