.. _external_volumes:

Mount Docker External Volumes in |cp|
-------------------------------------

When working with Docker, you may sometimes need to persist data in the event of a container going down, or to share data across containers. To do so, you can use `Docker Volumes`_.

For |cp|, you need external volumes for three main use cases:

1. :ref:`Data Storage`: |ak-tm| and |zk| require externally mounted volumes to persist data in the event that a container stops running or is restarted.
2. :ref:`Security`: When security is configured, the secrets are stored on the host and made available to the containers using mapped volumes.
3. :ref:`config_connect_ext_jars`: |ak| Connect can be configured to use third-party jars by storing them on a volume on the host.

.. note::

   If you need to add support for additional use cases for external volumes, see :ref:`extending the images`.

.. _data-volumes:

Data volumes for |ak| in |zk| mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|ak| uses volumes for log data, and |zk| uses volumes for transaction logs. Use separate volumes on the host for these services. You must also ensure that the host directory has read/write permissions for the Docker container user, which has root permissions by default unless you assign a user with the Docker ``run`` command. When you map volumes from the host, you must use the full path (for example, ``/var/lib/kafka/data``).

The following example shows how to use |ak| and |zk| with mounted volumes, and how to configure volumes if you are running the Docker containers as a non-root user. In this example, the containers run as the user ``appuser`` with ``UID=1000`` and ``GID=1000``. In all |cp| images, the containers run as the ``appuser`` user.

.. important::

   .. include:: ../../../includes/zk-deprecation.rst

On the Docker host (a VirtualBox VM, for example), create the directories:

.. codewithvars:: bash

   # Create directories for Kafka and ZooKeeper data.
   mkdir -p /vol1/zk-data
   mkdir -p /vol2/zk-txn-logs
   mkdir -p /vol3/kafka-data

   # Make sure the user has read and write permissions.
   chown -R 1000:1000 /vol1/zk-data
   chown -R 1000:1000 /vol2/zk-txn-logs
   chown -R 1000:1000 /vol3/kafka-data

Then, start the containers:

.. codewithvars:: bash

   # Run ZooKeeper as user 1000 with volumes mapped to the host directories.
   docker run -d \
     --name=zk-vols \
     --net=host \
     --user=1000 \
     -e ZOOKEEPER_TICK_TIME=2000 \
     -e ZOOKEEPER_CLIENT_PORT=32181 \
     -v /vol1/zk-data:/var/lib/zookeeper/data \
     -v /vol2/zk-txn-logs:/var/lib/zookeeper/log \
     confluentinc/cp-zookeeper:|release|

   # Run Kafka as user 1000 with the data volume mapped to the host directory.
   docker run -d \
     --name=kafka-vols \
     --net=host \
     --user=1000 \
     -e KAFKA_BROKER_ID=1 \
     -e KAFKA_ZOOKEEPER_CONNECT=localhost:32181 \
     -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:39092 \
     -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
     -v /vol3/kafka-data:/var/lib/kafka/data \
     confluentinc/cp-kafka:|release|

The data volumes are mounted using the ``-v`` flag.

.. _security-data-volumes:

Security: Data volumes for configuring secrets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When security is enabled, the secrets are made available to the containers using volumes. For example, if the host has the secrets (credentials, keytabs, certificates, Kerberos configuration, and JAAS configuration) in ``/vol007/kafka-node-1-secrets``, you can configure |ak| as follows to use the secrets:
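Because the containers above run as ``UID=1000``, a mismatch in host directory ownership is a common cause of startup failures. The helper below is a hypothetical pre-flight sketch (the function name ``check_volume_owner`` is ours, not part of |cp|) that verifies a host directory's ownership before you mount it with ``-v``:

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a host directory is owned by the
# UID:GID that the container will run as, before mounting it with -v.
# Uses GNU stat's '%u:%g' format to read numeric owner and group.
check_volume_owner() {
  local dir=$1 expected=$2
  local actual
  actual=$(stat -c '%u:%g' "$dir") || return 2   # directory missing or unreadable
  if [ "$actual" = "$expected" ]; then
    echo "ok: $dir is owned by $expected"
  else
    echo "mismatch: $dir is owned by $actual, expected $expected" >&2
    return 1
  fi
}
```

You could run, for example, ``check_volume_owner /vol3/kafka-data 1000:1000`` before the ``docker run`` commands above and fix ownership with ``chown`` if it reports a mismatch.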
.. codewithvars:: bash

   docker run -d \
     --name=kafka-sasl-ssl-1 \
     --net=host \
     -e KAFKA_BROKER_ID=1 \
     -e KAFKA_ZOOKEEPER_CONNECT=localhost:22181,localhost:32181,localhost:42181/saslssl \
     -e KAFKA_ADVERTISED_LISTENERS=SASL_SSL://localhost:39094 \
     -e KAFKA_SSL_KEYSTORE_FILENAME=kafka.broker3.keystore.jks \
     -e KAFKA_SSL_KEYSTORE_CREDENTIALS=broker3_keystore_creds \
     -e KAFKA_SSL_KEY_CREDENTIALS=broker3_sslkey_creds \
     -e KAFKA_SSL_TRUSTSTORE_FILENAME=kafka.broker3.truststore.jks \
     -e KAFKA_SSL_TRUSTSTORE_CREDENTIALS=broker3_truststore_creds \
     -e KAFKA_SECURITY_INTER_BROKER_PROTOCOL=SASL_SSL \
     -e KAFKA_SASL_MECHANISM_INTER_BROKER_PROTOCOL=GSSAPI \
     -e KAFKA_SASL_ENABLED_MECHANISMS=GSSAPI \
     -e KAFKA_SASL_KERBEROS_SERVICE_NAME=kafka \
     -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
     -e KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/secrets/host_broker3_jaas.conf -Djava.security.krb5.conf=/etc/kafka/secrets/host_krb.conf" \
     -v /vol007/kafka-node-1-secrets:/etc/kafka/secrets \
     confluentinc/cp-kafka:|release|

In the example above, the location of the data volumes is specified by setting ``-v /vol007/kafka-node-1-secrets:/etc/kafka/secrets``. You then specify how the secrets are to be used by setting:

.. codewithvars:: bash

   -e KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/secrets/host_broker3_jaas.conf -Djava.security.krb5.conf=/etc/kafka/secrets/host_krb.conf"

.. _config_connect_ext_jars:

Configuring Connect with external JARs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|ak| Connect can be configured to use third-party jars by storing them on a volume on the host and mapping the volume to ``/etc/kafka-connect/jars`` on the container.

On the host (a VirtualBox VM, for example), download the MySQL driver:
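A broker configured this way fails at startup if any file referenced by the environment variables is absent from the mapped secrets directory. The sketch below is a hypothetical pre-flight check (the function name ``check_secrets`` and the usage are illustrative, not part of |cp|):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm that each expected secret file
# exists in the host directory that will be mounted at /etc/kafka/secrets.
# Returns 0 if all files are present, 1 if any are missing.
check_secrets() {
  local dir=$1; shift
  local missing=0 f
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing secret: $dir/$f" >&2
      missing=1
    fi
  done
  return $missing
}
```

For the example above you might run ``check_secrets /vol007/kafka-node-1-secrets kafka.broker3.keystore.jks broker3_keystore_creds kafka.broker3.truststore.jks host_broker3_jaas.conf host_krb.conf`` before the ``docker run`` command.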
.. codewithvars:: bash

   # Create a directory for jars and download the MySQL JDBC driver into it.
   mkdir -p /vol42/kafka-connect/jars

   # Get the driver and store the jar in the directory.
   curl -k -SL "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz" \
     | tar -xzf - -C /vol42/kafka-connect/jars --strip-components=1 mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar

Then start |ak| Connect, mounting the download directory as ``/etc/kafka-connect/jars``:

.. codewithvars:: bash

   docker run -d \
     --name=connect-host-json \
     --net=host \
     -e CONNECT_BOOTSTRAP_SERVERS=localhost:39092 \
     -e CONNECT_REST_PORT=28082 \
     -e CONNECT_GROUP_ID="default" \
     -e CONNECT_CONFIG_STORAGE_TOPIC="default.config" \
     -e CONNECT_OFFSET_STORAGE_TOPIC="default.offsets" \
     -e CONNECT_STATUS_STORAGE_TOPIC="default.status" \
     -e CONNECT_KEY_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
     -e CONNECT_VALUE_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
     -e CONNECT_INTERNAL_KEY_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
     -e CONNECT_INTERNAL_VALUE_CONVERTER="org.apache.kafka.connect.json.JsonConverter" \
     -e CONNECT_REST_ADVERTISED_HOST_NAME="localhost" \
     -e CONNECT_PLUGIN_PATH=/usr/share/java,/etc/kafka-connect/jars \
     -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
     -v /vol42/kafka-connect/jars:/etc/kafka-connect/jars \
     confluentinc/cp-kafka-connect:|release|

Related content
~~~~~~~~~~~~~~~

- :ref:`cpdocker_intro`
- :ref:`image_reference`
- :ref:`docker_operations_logging`
- :ref:`use-jmx-monitor-docker-deployments`
- :ref:`config_reference`
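To confirm that the downloaded driver jar is actually visible to Connect, you can enumerate the jars under each entry of a comma-separated plugin path, mirroring the directories that ``CONNECT_PLUGIN_PATH`` names above. This is a hypothetical sketch (the function name ``list_plugin_jars`` is ours, not a |cp| tool):

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the .jar files under each entry of a
# comma-separated plugin path, similar to the directories Connect
# scans when it reads CONNECT_PLUGIN_PATH.
list_plugin_jars() {
  local IFS=','   # split the argument on commas, like the env var format
  local p
  for p in $1; do
    [ -d "$p" ] && find "$p" -maxdepth 2 -name '*.jar'
  done
}
```

On the host you might run ``list_plugin_jars "/vol42/kafka-connect/jars"`` and check that the MySQL driver jar appears; inside the container, ``docker exec connect-host-json ls /etc/kafka-connect/jars`` shows the same mapped files.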