.. _connect-azure-sql-dw:

|az| SQL Data Warehouse Sink Connector for |cp|
===============================================

The |az| SQL Data Warehouse sink connector allows you to export data from |ak-tm| topics to an |az| SQL Data Warehouse. The connector polls data from |ak| to write to the data warehouse based on the topics subscription. Auto-creation of tables and limited auto-evolution are also supported.

This connector is compatible with `Azure Synapse SQL pool `__.

Prerequisites
-------------

The following are required to run the |kconnect-long| |az| SQL Data Warehouse Sink Connector:

* |cp| 4.0.0 or later, or |ak| 1.0.0 or later
* Java 1.8
* At minimum, ``INSERT`` permission is required for this connector. See `Permissions: GRANT, DENY, REVOKE (Azure SQL Data Warehouse, Parallel Data Warehouse) `__.
* If ``auto.create=true``, ``CREATE TABLE`` and ``CREATE SCHEMA`` permissions are required.
* If ``auto.evolve=true``, ``ALTER ANY SCHEMA`` permissions are required.

Limitations
-----------

* This connector can only insert data into an |az| SQL Data Warehouse. |az| SQL Data Warehouse does not support primary keys, and because updates, upserts, and deletes all rely on primary keys, these operations are not supported by this connector.
* When ``auto.evolve`` is enabled, if a new column with a default value is added, that default value is only used for new records. Existing records have ``null`` as the value for the new column.

Features
--------

------------
Data mapping
------------

The sink connector requires knowledge of schemas, so you should use a suitable converter, for example, the Avro converter that comes with |sr| or the JSON converter with schemas enabled. |ak| record keys, if present, can be primitive types or a Connect struct, and the record value must be a Connect struct. Fields selected from Connect structs must be primitive types. You may need to implement a custom ``Converter`` if the data in the topic is not in a compatible format.

--------------------------------
Auto-creation and Auto-evolution
--------------------------------

If ``auto.create`` is enabled, the connector can create the destination table if it is missing. The connector uses the record schema as a basis for the table definition, so the table is created online as records are consumed from the topic.

If ``auto.evolve`` is enabled, the connector can perform limited auto-evolution by issuing ``ALTER`` on the destination table when it encounters a record for which a column is missing. Because data-type changes and removal of columns can be dangerous, the connector does not attempt to perform such evolutions on the table.

.. important:: For backwards-compatible table schema evolution, new fields in record schemas must be optional or have a default value.

.. _install-connector:

Install the |az| SQL Data Warehouse Sink Connector
--------------------------------------------------

.. include:: ../includes/connector-install.rst

.. include:: ../includes/connector-install-hub.rst

.. codewithvars:: bash

   confluent-hub install confluentinc/kafka-connect-azure-sql-dw:latest

.. include:: ../includes/connector-install-version.rst

.. codewithvars:: bash

   confluent-hub install confluentinc/kafka-connect-azure-sql-dw:1.0.0-preview

------------------------------
Install the connector manually
------------------------------

`Download and extract the ZIP file `__ for your connector and then follow the manual connector installation :ref:`instructions `.
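Once the plugin is installed and the |kconnect-long| worker has been restarted, a quick way to confirm that the worker picked it up is to list the installed plugins through the Connect REST API. The following is only a sketch: it assumes a local worker with its REST interface on the default port 8083, so adjust the host and port for your environment.

.. sourcecode:: bash

   # List the installed connector plugins and look for the Azure SQL DW sink class.
   curl -s http://localhost:8083/connector-plugins | grep AzureSqlDwSinkConnector

If ``io.confluent.connect.azuresqldw.AzureSqlDwSinkConnector`` appears in the output, the worker can load the connector.

License
-------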
.. include:: ../includes/enterprise-license.rst

See :ref:`az-sql-dw-sink-connector-license-config` for license properties and :ref:`azure-sql-dw-license-topic-configuration` for information about the license topic.

Configuration Properties
------------------------

For a complete list of configuration properties for this connector, see :ref:`azure-sql-dw-configuration-options`.

.. _azure-sql-dw-sink-quickstart:

Quick Start
-----------

In this quick start, the |az| SQL Data Warehouse sink connector is used to export data produced by the Avro console producer to an |az| SQL Data Warehouse instance.

|az| Prerequisites

- `Azure Account `__
- `Azure CLI `__
- `mssql-cli `__

Confluent Prerequisites

- :ref:`Confluent Platform `
- :ref:`Confluent CLI ` (requires separate installation)
- :ref:`Azure SQL Data Warehouse Sink Connector `

.. tip:: Although this quick start uses the `Azure CLI `__ to create the resources and the `mssql-cli `__ to query the data, both of these tasks can also be done through the |az| Portal: see `Create and query an Azure SQL Data Warehouse `__.

.. include:: ../includes/connect-to-cloud-note.rst

.. _create-azure-sql-dw:

------------------------------------------
Create an |az| SQL Data Warehouse instance
------------------------------------------

For this section, use the `Azure CLI `__ to create the necessary resources for this quick start.

#. Log in with your |az| account.

   .. sourcecode:: bash

      az login

   This command opens your default browser and loads a sign-in page where you can log in to your |az| account.

#. Create a resource group.

   .. sourcecode:: bash

      az group create \
        --name quickstartResourceGroup \
        --location eastus2

#. Create a SQL Server instance. Choose a unique SQL Server name, username, and password, and supply them for the ``name``, ``admin-user``, and ``admin-password`` arguments.

   .. sourcecode:: bash

      az sql server create \
        --name <server-name> \
        --resource-group quickstartResourceGroup \
        --location eastus2 \
        --admin-user <admin-username> \
        --admin-password <admin-password>

#. Enable a server-level firewall rule. Pass your IP address for the ``start-ip-address`` and ``end-ip-address`` arguments to enable connectivity to the server.

   .. sourcecode:: bash

      az sql server firewall-rule create \
        --name quickstartFirewallRule \
        --resource-group quickstartResourceGroup \
        --server <server-name> \
        --start-ip-address <your-ip-address> \
        --end-ip-address <your-ip-address>

#. Create a SQL Data Warehouse instance.

   .. sourcecode:: bash

      az sql dw create \
        --name quickstartDataWarehouse \
        --resource-group quickstartResourceGroup \
        --server <server-name>

   This can take a couple of minutes.

.. _load-connector:

------------------
Load the Connector
------------------

#. Adding a new connector plugin requires restarting Kafka Connect. Use the Confluent CLI to restart Connect.

   .. include:: ../../includes/cli-new.rst

   .. codewithvars:: bash

      |confluent_stop| connect && |confluent_start| connect

#. Create an ``azure-sql-dw-quickstart.properties`` file and add the following properties. Make sure to substitute your SQL Server name, username, and password in the ``azure.sql.dw.url``, ``azure.sql.dw.user``, and ``azure.sql.dw.password`` properties, respectively.

   .. codewithvars:: properties

      name=AzureSqlDwSinkConnector
      topics=products
      tasks.max=1
      connector.class=io.confluent.connect.azuresqldw.AzureSqlDwSinkConnector
      azure.sql.dw.url=jdbc:sqlserver://<server-name>.database.windows.net:1433;
      azure.sql.dw.user=<admin-username>
      azure.sql.dw.password=<admin-password>
      azure.sql.dw.database.name=quickstartDataWarehouse
      auto.create=true
      auto.evolve=true
      table.name.format=kafka_${topic}

      # The following configs define the Confluent license stored in Kafka, so you need the Kafka bootstrap addresses.
      # `replication.factor` may not be larger than the number of Kafka brokers in the destination cluster,
      # so here we set this to '1' for demonstration purposes. Always use at least '3' in production configurations.
      confluent.license=
      confluent.topic.bootstrap.servers=localhost:9092
      confluent.topic.replication.factor=1

#. Start the |az| SQL Data Warehouse sink connector by loading the connector's configuration with the following command (an equivalent Connect REST API request is shown after this procedure):

   .. codewithvars:: bash

      |confluent_load| azure-sql-dw|dash| -d azure-sql-dw-quickstart.properties

#. Confirm that the connector is in a ``RUNNING`` state.

   .. codewithvars:: bash

      |confluent_status| azure-sql-dw
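The steps above use the Confluent CLI to manage the connector. If you prefer to drive the same deployment through the Kafka Connect REST API directly, a request like the following sketch creates an equivalent connector. It assumes the Connect worker's REST interface is listening on the default port 8083, reuses the ``azure-sql-dw`` connector name from the CLI commands above, and uses the same placeholder server name and credentials as the properties file.

.. sourcecode:: bash

   # Create the connector through the Connect REST API (assumes REST port 8083).
   curl -s -X POST http://localhost:8083/connectors \
     -H "Content-Type: application/json" \
     -d '{
       "name": "azure-sql-dw",
       "config": {
         "connector.class": "io.confluent.connect.azuresqldw.AzureSqlDwSinkConnector",
         "topics": "products",
         "tasks.max": "1",
         "azure.sql.dw.url": "jdbc:sqlserver://<server-name>.database.windows.net:1433;",
         "azure.sql.dw.user": "<admin-username>",
         "azure.sql.dw.password": "<admin-password>",
         "azure.sql.dw.database.name": "quickstartDataWarehouse",
         "auto.create": "true",
         "auto.evolve": "true",
         "table.name.format": "kafka_${topic}",
         "confluent.license": "",
         "confluent.topic.bootstrap.servers": "localhost:9092",
         "confluent.topic.replication.factor": "1"
       }
     }'

   # Check the connector and task state through the same API.
   curl -s http://localhost:8083/connectors/azure-sql-dw/status

The status response should show the connector and its task in the ``RUNNING`` state, matching the Confluent CLI check above.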
.. _send-data:

-----------------
Send Data to |ak|
-----------------

#. To produce some records into the ``products`` topic, first start a Kafka producer.

   .. sourcecode:: bash

      kafka-avro-console-producer \
        --broker-list localhost:9092 --topic products \
        --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"name","type":"string"}, {"name":"price","type":"float"}, {"name":"quantity","type":"int"}]}'

#. The console producer is now waiting for input, so you can insert some records into the topic.

   .. sourcecode:: bash

      {"name": "scissors", "price": 2.75, "quantity": 3}
      {"name": "tape", "price": 0.99, "quantity": 10}
      {"name": "notebooks", "price": 1.99, "quantity": 5}

.. _verify-data-sent:

--------------------------------------
Check |az| SQL Data Warehouse for Data
--------------------------------------

For this section, use the `mssql-cli `__ to query the data in the data warehouse and validate that the connector exported the data from Kafka to the data warehouse.

#. Connect to the remote data warehouse, supplying the SQL Server name, username, and password as arguments.

   .. sourcecode:: bash

      mssql-cli -S <server-name>.database.windows.net -U <admin-username> -P <admin-password> -d quickstartDataWarehouse

#. Query the ``kafka_products`` table to see its contents.

   .. sourcecode:: bash

      select * from kafka_products;

   Your output should resemble the following (the rows and columns may appear in a different order):

   .. sourcecode:: bash

      +------------+---------+-----------+
      | quantity   | price   | name      |
      |------------+---------+-----------|
      | 10         | 0.99    | tape      |
      | 3          | 2.75    | scissors  |
      | 5          | 1.99    | notebooks |
      +------------+---------+-----------+

.. _closing-resources:

------------------------------
Deleting Unnecessary Resources
------------------------------

Once you have finished the quick start, delete the resources created here to avoid incurring additional charges.

.. sourcecode:: bash

   az group delete --name quickstartResourceGroup

This command deletes the resource group and all of the resources within it.

Additional Documentation
------------------------

.. toctree::
   :maxdepth: 1

   azure_sql_dw_configuration_options
   changelog