.. _external_volumes:
Mount Docker External Volumes in |cp|
-------------------------------------
When working with Docker, you may need to persist data when a container stops or is removed, or to share data across containers.
To do so, you can use `Docker volumes <https://docs.docker.com/storage/volumes/>`_.
For |cp|, you need external volumes for three main use cases:
1. :ref:`Data storage <data-volumes>`: |ak-tm| and |zk| require externally mounted volumes to persist data in the event that a container stops running or is restarted.
2. :ref:`Security <security-data-volumes>`: When security is configured, the secrets are stored on the host and made available to the containers using mapped volumes.
3. :ref:`config_connect_ext_jars`: |ak| Connect can be configured to use third-party JARs by storing them on a volume on the host.
.. note::
If you need to support additional use cases for external volumes, see :ref:`extending the images`.
.. _data-volumes:
Data volumes for |ak| in |zk| mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|ak| uses volumes for log data, and |zk| uses volumes for transaction logs. Use separate volumes on the host for these services.
You must also ensure that the host directory has read and write permissions for the Docker container user. By default the container user has root permissions,
unless you assigned a different user with the Docker ``run`` command.
Note that when you map volumes from the host, you must use the full path (for example, ``/var/lib/kafka/data``).
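You can sanity-check the ownership and full-path requirements above before starting any containers. The following is a minimal sketch that uses a stand-in directory under ``/tmp``; in a real setup you would use your actual ``/vol*`` paths and run ``chown`` as root:

```shell
# Stand-in for a host data directory such as /vol3/kafka-data.
mkdir -p /tmp/vol-demo/kafka-data

# Docker requires the full (absolute) host path on the left side of -v;
# readlink -f resolves a relative path to its absolute form.
DATA_DIR=$(readlink -f /tmp/vol-demo/kafka-data)
echo "mount source: $DATA_DIR"

# Inspect the owner; in a real setup you would run
# chown -R 1000:1000 (as root) so UID/GID 1000 can read and write.
stat -c 'owner %u:%g' "$DATA_DIR"
```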
The following example shows how to use |ak| and |zk| with mounted volumes and
how to configure volumes if you are running Docker container as non-root user.
In this example, the containers run with the user ``appuser`` with ``UID=1000``
and ``GID=1000``. In all |cp| images, the containers run with the ``appuser``
user.
.. important::

   .. include:: ../../../includes/zk-deprecation.rst
On the Docker host (VirtualBox VM for example), create the directories:
.. codewithvars:: bash
# Create dirs for Kafka / ZK data.
mkdir -p /vol1/zk-data
mkdir -p /vol2/zk-txn-logs
mkdir -p /vol3/kafka-data
# Make sure the user has the read and write permissions.
chown -R 1000:1000 /vol1/zk-data
chown -R 1000:1000 /vol2/zk-txn-logs
chown -R 1000:1000 /vol3/kafka-data
Then, start the containers:
.. codewithvars:: bash
# Run Zookeeper with user 1000 and volumes mapped to host volumes
docker run -d \
--name=zk-vols \
--net=host \
--user=1000 \
-e ZOOKEEPER_TICK_TIME=2000 \
-e ZOOKEEPER_CLIENT_PORT=32181 \
-v /vol1/zk-data:/var/lib/zookeeper/data \
-v /vol2/zk-txn-logs:/var/lib/zookeeper/log \
confluentinc/cp-zookeeper:|release|
docker run -d \
--name=kafka-vols \
--net=host \
--user=1000 \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_ZOOKEEPER_CONNECT=localhost:32181 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:39092 \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
-v /vol3/kafka-data:/var/lib/kafka/data \
confluentinc/cp-kafka:|release|
The data volumes are mounted using the ``-v`` flag.
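The ``-v`` value has the form ``host-path:container-path[:options]``; for example, appending ``:ro`` makes a mount read-only. A small sketch of how the two sides of a mapping split:

```shell
# One of the volume mappings from the example above.
VOL="/vol1/zk-data:/var/lib/zookeeper/data"

# Everything before the first colon is the host path ...
echo "host side: ${VOL%%:*}"
# ... and everything after it is the path inside the container.
echo "container side: ${VOL#*:}"
```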
.. _security-data-volumes:
Security: Data volumes for configuring secrets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When security is enabled, the secrets are made available to the containers using volumes. For example, if the host stores the secrets (credentials, keytabs, certificates, Kerberos configuration, JAAS configuration) in ``/vol007/kafka-node-1-secrets``, you can configure |ak| as follows to use them:
.. codewithvars:: bash
docker run -d \
--name=kafka-sasl-ssl-1 \
--net=host \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_ZOOKEEPER_CONNECT=localhost:22181,localhost:32181,localhost:42181/saslssl \
-e KAFKA_ADVERTISED_LISTENERS=SASL_SSL://localhost:39094 \
-e KAFKA_SSL_KEYSTORE_FILENAME=kafka.broker3.keystore.jks \
-e KAFKA_SSL_KEYSTORE_CREDENTIALS=broker3_keystore_creds \
-e KAFKA_SSL_KEY_CREDENTIALS=broker3_sslkey_creds \
-e KAFKA_SSL_TRUSTSTORE_FILENAME=kafka.broker3.truststore.jks \
-e KAFKA_SSL_TRUSTSTORE_CREDENTIALS=broker3_truststore_creds \
-e KAFKA_SECURITY_INTER_BROKER_PROTOCOL=SASL_SSL \
-e KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL=GSSAPI \
-e KAFKA_SASL_ENABLED_MECHANISMS=GSSAPI \
-e KAFKA_SASL_KERBEROS_SERVICE_NAME=kafka \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
-e KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/secrets/host_broker3_jaas.conf -Djava.security.krb5.conf=/etc/kafka/secrets/host_krb.conf" \
-v /vol007/kafka-node-1-secrets:/etc/kafka/secrets \
confluentinc/cp-kafka:|release|
In the example above, the location of the data volumes is specified by setting ``-v /vol007/kafka-node-1-secrets:/etc/kafka/secrets``.
You then specify how they are to be used by setting:
.. codewithvars:: bash
-e KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/secrets/host_broker3_jaas.conf -Djava.security.krb5.conf=/etc/kafka/secrets/host_krb.conf"
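Before starting a broker like this, it can help to confirm that the expected secrets are actually present in the host directory. A sketch of such a check, using stand-in files under ``/tmp`` in place of real keystores and credentials:

```shell
# Stand-in for the host secrets directory /vol007/kafka-node-1-secrets.
SECRETS_DIR=/tmp/vol007/kafka-node-1-secrets
mkdir -p "$SECRETS_DIR"
# In a real deployment these are your actual keystore and credential files;
# created empty here purely to illustrate the presence check.
touch "$SECRETS_DIR/kafka.broker3.keystore.jks" \
      "$SECRETS_DIR/broker3_keystore_creds" \
      "$SECRETS_DIR/host_broker3_jaas.conf"

for f in kafka.broker3.keystore.jks broker3_keystore_creds host_broker3_jaas.conf; do
  [ -f "$SECRETS_DIR/$f" ] && echo "found: $f" || echo "MISSING: $f"
done
```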
.. _config_connect_ext_jars:
Configuring Connect with external JARs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|ak| Connect can be configured to use third-party JARs by storing them on a volume on the host and mapping the volume to ``/etc/kafka-connect/jars`` on the container.
On the host (VirtualBox VM, for example), download the MySQL driver:
.. codewithvars:: bash
# Create a directory for JARs and download the MySQL JDBC driver into it
mkdir -p /vol42/kafka-connect/jars
# Get the driver and store the JAR in the directory
curl -k -SL "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz" | tar -xzf - -C /vol42/kafka-connect/jars --strip-components=1 mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar
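If the download succeeded, the driver JAR sits directly in the mapped directory, where the container will see it under ``/etc/kafka-connect/jars``. A sketch of that check, using a stand-in file since the real download needs network access:

```shell
# Stand-in for the host directory /vol42/kafka-connect/jars.
JARS_DIR=/tmp/vol42/kafka-connect/jars
mkdir -p "$JARS_DIR"
touch "$JARS_DIR/mysql-connector-java-5.1.39-bin.jar"   # stand-in for the real JAR

# The JAR must be present at the top level of the mapped directory.
ls "$JARS_DIR"/*.jar
```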
Then start |ak| Connect mounting the download directory as ``/etc/kafka-connect/jars``:
.. codewithvars:: bash
docker run -d \
--name=connect-host-json \
--net=host \
-e CONNECT_BOOTSTRAP_SERVERS=localhost:39092 \
-e CONNECT_REST_PORT=28082 \
-e CONNECT_GROUP_ID="default" \
-e CONNECT_CONFIG_STORAGE_TOPIC="default.config" \
-e CONNECT_OFFSET_STORAGE_TOPIC="default.offsets" \
-e CONNECT_STATUS_STORAGE_TOPIC="default.status" \
-e CONNECT_KEY_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
-e CONNECT_VALUE_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
-e CONNECT_INTERNAL_KEY_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
-e CONNECT_INTERNAL_VALUE_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
-e CONNECT_REST_ADVERTISED_HOST_NAME="localhost" \
-e CONNECT_PLUGIN_PATH=/usr/share/java,/etc/kafka-connect/jars \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
-v /vol42/kafka-connect/jars:/etc/kafka-connect/jars \
confluentinc/cp-kafka-connect:|release|
Related content
~~~~~~~~~~~~~~~
- :ref:`cpdocker_intro`
- :ref:`image_reference`
- :ref:`docker_operations_logging`
- :ref:`use-jmx-monitor-docker-deployments`
- :ref:`config_reference`