Microsoft SQL Server CDC Source V2 (Debezium) Connector for Confluent Cloud

The fully-managed Microsoft SQL Server Change Data Capture (CDC) Source V2 (Debezium) connector for Confluent Cloud streams row-level changes from a Microsoft SQL Server database into Apache Kafka® topics, using the Debezium engine internally. The connector can also take an initial snapshot of existing data before streaming subsequent INSERT, UPDATE, and DELETE changes. Each table’s events are recorded to a separate Kafka topic, and the connector supports Avro, JSON Schema, Protobuf, or JSON (schemaless) output formats.

Note

This Quick Start is for version 2 of the fully-managed cloud connector. If you are planning to upgrade from V1 to V2, see Moving from V1 to V2.
If you are installing the connector locally for Confluent Platform, see Debezium SQL Server CDC Source Connector for Confluent Platform. For more information, see Debezium documentation.
If you require private networking for fully-managed connectors, make sure to set up the proper networking beforehand. For more information, see Manage Networking for Confluent Cloud Connectors.

V2 Improvements

Note the following improvements made to the V2 connector.

Supports capturing changes from multiple databases on a single SQL Server database engine using a single connector instance and multiple tasks. You must configure the database.names property with a comma-separated list of databases and set tasks.max to match the number of databases. Each task handles one database, so increasing tasks.max beyond the number of databases provides no additional performance benefit.
Heartbeat events are emitted even if the connector does not find any changes or the changes that did occur are not of relevance to the connector.
Can stop or pause an in-progress incremental snapshot, and can resume an incremental snapshot that was previously paused.
Supports regular expressions to specify table names for incremental snapshots.
Supports SQL-based predicates to control the subset of records to be included in the incremental snapshot.
Supports specifying a single column as a surrogate key for performing incremental snapshots.
Can perform ad-hoc blocking snapshots.
Indices that rely on hidden, auto-generated columns, or columns wrapped in database functions are no longer considered primary key alternatives for tables that do not have a primary key defined.
Provides configuration options that specify how topic and schema names are adjusted for compatibility.
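As a minimal sketch of the multi-database rule described in the list above, the following Python derives tasks.max from the list of databases. The helper name and the database names are hypothetical; only the database.names and tasks.max property names come from this page.

```python
def multi_database_settings(databases):
    """Return the two settings that the multi-database rule ties together."""
    names = [name.strip() for name in databases if name.strip()]
    if not names:
        raise ValueError("at least one database name is required")
    return {
        # Comma-separated list of databases to capture changes from.
        "database.names": ",".join(names),
        # One task per database; more tasks than databases add no benefit.
        "tasks.max": str(len(names)),
    }


# Hypothetical database names, used only for illustration.
settings = multi_database_settings(["inventory", "orders"])
print(settings)
```

Deriving tasks.max from the list, rather than setting it by hand, keeps the two values from drifting apart when a database is added or removed.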

Features

The SQL Server CDC Source V2 (Debezium) connector provides the following features:

Topics created automatically: The connector automatically creates Kafka topics using the naming convention: <topic.prefix>.<databaseName>.<schemaName>.<tableName>. The topics are created with the properties: topic.creation.default.partitions=1 and topic.creation.default.replication.factor=3. For more information, see Maximum message size.
Note
For DB history topics, the connector automatically creates Kafka topics using the naming convention: dbhistory.<topic.prefix>.<connect-id>.
Database authentication: Supports password authentication and Microsoft Entra ID-based authentication using Confluent Provider Integration. For more information about provider integration setup, see connector authentication.
SSL support: Supports SSL encryption.
Tables included and Tables excluded: Sets whether a table is or is not monitored for changes. By default, the connector monitors every non-system table.
Tombstones on delete: Sets whether a tombstone is generated after a delete event. Default is true.
Output formats: The connector supports Avro, JSON Schema, Protobuf, and JSON (schemaless) as output formats for the Kafka record value. For the Kafka record key, it supports Avro, JSON Schema, Protobuf, JSON (schemaless), and String. Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
Tasks per connector: The connector supports multiple tasks. Each task is assigned to a single database. Set tasks.max to match the number of databases you are capturing changes from. Increasing tasks.max beyond the number of databases doesn’t improve performance, as each task handles only one database.
Incremental snapshot: Supports incremental snapshotting via signaling.
Offset management capabilities: Supports offset management. For more information, see Manage custom offsets.
Client-side encryption (CSFLE and CSPE) support: The connector supports CSFLE and CSPE for sensitive data. For more information about CSFLE or CSPE setup, see Manage CSFLE or CSPE for connectors.
Secret manager integration: The connector supports secret manager integration. For password-based authentication, the connector can retrieve the following configurations from an integrated secret manager at runtime as needed.
Secret manager managed configuration	Type
`database.hostname`	STRING
`database.port`	INT
`database.user`	STRING
`database.password`	PASSWORD
For more information, see Create a secret manager integration in Confluent Cloud.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Connect Usage Examples section.
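The topic naming conventions listed under Features can be sketched as two small Python helpers. The function names and sample values are hypothetical, and the sketch does not apply any of the name adjustments the connector may perform for compatibility.

```python
def change_topic_name(topic_prefix, database, schema, table):
    """<topic.prefix>.<databaseName>.<schemaName>.<tableName>"""
    return ".".join([topic_prefix, database, schema, table])


def schema_history_topic_name(topic_prefix, connector_id):
    """dbhistory.<topic.prefix>.<connect-id>"""
    return f"dbhistory.{topic_prefix}.{connector_id}"


# Hypothetical values, used only for illustration.
print(change_topic_name("cdc", "inventory", "dbo", "customers"))
print(schema_history_topic_name("cdc", "lcc-example123"))
```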

Supported database versions

The SQL Server CDC Source V2 (Debezium) connector is compatible with the following SQL Server versions: 2017, 2019, 2022.

Limitations

Be sure to review the following information.

For connector limitations, see Microsoft SQL Server CDC Source V2 (Debezium) Connector limitations.
If you plan to use one or more Single Message Transformations (SMTs), see SMT Limitations.

If you plan to use one or more Custom SMTs, see Custom SMT limitations.

Deprecated features and configurations

The following features and configuration properties have been deprecated. Confluent recommends using the alternatives instead:

Snapshot mode: The schema_only snapshot mode is deprecated. Use no_data instead.
Signaling: The additional-condition option in the signal query is deprecated. Use additional-conditions instead.
ExtractNewRecordState SMT: The drop.tombstones and delete.handling.mode configurations are deprecated. Use delete.tombstones.handling.mode instead.
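A minimal sketch of a pre-flight check for the deprecated settings above, assuming the connector configuration is available as a flat dictionary of strings. The helper name is hypothetical, and the `transforms.unwrap.` prefix in the sample follows the usual Kafka Connect convention for SMT properties, which is an assumption here. The signaling change is not covered because it lives in the signal payload rather than in the connector configuration.

```python
def find_deprecated(config):
    """Return one message per deprecated setting found in the config."""
    findings = []
    if config.get("snapshot.mode") == "schema_only":
        findings.append("snapshot.mode=schema_only: use no_data")
    for key in sorted(config):
        # SMT properties are prefixed with the transform alias.
        if key.endswith(".drop.tombstones") or key.endswith(".delete.handling.mode"):
            findings.append(f"{key}: use delete.tombstones.handling.mode")
    return findings


# Hypothetical configuration, used only for illustration.
sample = {
    "snapshot.mode": "schema_only",
    "transforms.unwrap.drop.tombstones": "false",
}
for finding in find_deprecated(sample):
    print(finding)
```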

Maximum message size

This connector creates topics automatically. When it creates topics, the internal connector configuration property max.message.bytes is set to the following:

Basic cluster: 8 MB
Standard cluster: 8 MB
Enterprise cluster: 8 MB
Dedicated cluster: 20 MB

For more information about Confluent Cloud clusters, see Kafka Cluster Types in Confluent Cloud.

Log retention during snapshot

When launched, the CDC connector creates a snapshot of the existing data in the database to capture the nominated tables. To do this, the connector executes a “SELECT *” statement. Completing the snapshot can take a while if one or more of the nominated tables is very large.

During the snapshot process, the database server must retain the change tables so that when the snapshot is complete, the CDC connector can start processing database changes that have completed since the snapshot process began.

If one or more of the tables are very large, the snapshot process could run longer than the retention period set for the cleanup job to purge the change tables. To capture very large tables, you should retain the change tables for longer than normal by increasing the retention period of change tables.
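As a rough sketch of how to size the retention period, the following Python turns an estimated snapshot duration into a retention value and formats a call to SQL Server's sys.sp_cdc_change_job procedure, which takes the cleanup retention in minutes. The 36-hour estimate and the safety factor of 2 are hypothetical; choose values that fit your tables. On SQL Server, a changed job setting generally takes effect only after the cleanup job is restarted, and managed database services may expose a different procedure, so check your platform's documentation before running the statement.

```python
import math


def cleanup_retention_statement(snapshot_hours, safety_factor=2.0):
    """Format a T-SQL statement that retains change tables long enough."""
    # Retention is expressed in minutes; round up so the margin is never cut.
    minutes = math.ceil(snapshot_hours * 60 * safety_factor)
    return (
        "EXECUTE sys.sp_cdc_change_job "
        f"@job_type = N'cleanup', @retention = {minutes};"
    )


# Hypothetical estimate: the snapshot is expected to take about 36 hours.
statement = cleanup_retention_statement(36)
print(statement)
```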

Manage custom offsets

You can manage the offsets for this connector. Offsets provide information on the point in the system from which the connector is accessing data. For more information, see Manage Offsets for Fully-Managed Connectors in Confluent Cloud.

To manage offsets:

Manage offsets using Confluent Cloud APIs. For more information, see Connect offsets API reference.

Get the current offset

To get the current offset, make a GET request that specifies the environment, Kafka cluster, and connector name.

GET /connect/v1/environments/{environment_id}/clusters/{kafka_cluster_id}/connectors/{connector_name}/offsets
Host: https://api.confluent.cloud

Response:

Successful calls return HTTP 200 with a JSON payload that describes the offset.

{
    "id": "lcc-example123",
    "name": "{connector_name}",
    "offsets": [
      {
        "partition": {
          "database": "database_name",
          "server": "server_name"
        },
        "offset": {
          "change_lsn": "00000118:0000302b:0003",
          "commit_lsn": "00000118:0000302b:0004",
          "event_serial_no": 1,
          "transaction_id": null
        }
      }
    ],
    "metadata": {
        "observed_at": "2024-03-28T17:57:48.139635200Z"
    }
}

Responses include the following information:

The position of the latest offset.
The observed time of the offset in the metadata portion of the payload. The observed_at time indicates a snapshot in time for when the API retrieved the offset. A running connector is always updating its offsets. Use observed_at to get a sense for the gap between real time and the time at which the request was made. By default, offsets are observed every minute. Calling GET repeatedly will fetch more recently observed offsets.
Information about the connector.
In these examples, the curly braces around “{connector_name}” indicate a replaceable value.
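The GET request above can be prepared from Python with only the standard library. This is a sketch, not a definitive client: it assumes HTTP Basic authentication with a Confluent Cloud API key and secret, and the IDs, connector name, and credentials shown are placeholders. The request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request

BASE_URL = "https://api.confluent.cloud"


def offsets_url(environment_id, kafka_cluster_id, connector_name):
    """Build the offsets endpoint URL from the three path parameters."""
    env, cluster, name = (
        urllib.parse.quote(part, safe="")
        for part in (environment_id, kafka_cluster_id, connector_name)
    )
    return (
        f"{BASE_URL}/connect/v1/environments/{env}"
        f"/clusters/{cluster}/connectors/{name}/offsets"
    )


def get_offsets_request(environment_id, kafka_cluster_id, connector_name,
                        api_key, api_secret):
    """Prepare (but do not send) the GET request for the current offsets."""
    # Assumption: the API accepts Basic auth with an API key and secret.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return urllib.request.Request(
        offsets_url(environment_id, kafka_cluster_id, connector_name),
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Placeholder identifiers and credentials, used only for illustration.
request = get_offsets_request("env-example", "lkc-example", "my-connector",
                              "key", "secret")
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the response body as JSON.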

Update the offset

To update the offset, make a POST request that specifies the environment, Kafka cluster, and connector name. Include a JSON payload that specifies the new offset and the PATCH request type.

POST /connect/v1/environments/{environment_id}/clusters/{kafka_cluster_id}/connectors/{connector_name}/offsets/request
Host: https://api.confluent.cloud

 {
     "type": "PATCH",
     "offsets": [
       {
         "partition": {
           "database": "database_name",
           "server": "server_name"
         },
         "offset": {
           "change_lsn": "0000010d:0000138f:0002",
           "commit_lsn": "00000116:000042CC:0004",
           "event_serial_no": 1,
           "transaction_id": null
         }
       }
     ]
 }

Considerations:

You can only make one offset change at a time for a given connector.
This is an asynchronous request. To check the status of this request, you must use the check offset status API. For more information, see Get the status of an offset request.
For source connectors, the connector attempts to read from the position defined by the requested offsets.

Response:

Successful calls return HTTP 202 Accepted with a JSON payload that describes the offset.

{
    "id": "lcc-example123",
    "name": "{connector_name}",
    "offsets": [
      {
        "partition": {
          "database": "database_name",
          "server": "server_name"
        },
        "offset": {
          "change_lsn": "0000010d:0000138f:0002",
          "commit_lsn": "00000116:000042CC:0004",
          "event_serial_no": 1,
          "transaction_id": null
        }
      }
    ],
    "requested_at": "2024-03-28T17:58:45.606796307Z",
    "type": "PATCH"
}

Responses include the following information:

The requested position of the offsets in the source.
The time of the request to update the offset.
Information about the connector.
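A small helper can assemble the PATCH payload shown above so that the field names stay consistent across requests. This is a sketch: the function name is hypothetical, and the LSN values are the sample values from this page.

```python
import json


def patch_offsets_payload(database, server, change_lsn, commit_lsn,
                          event_serial_no=1, transaction_id=None):
    """Build the JSON body for a PATCH offsets request."""
    return {
        "type": "PATCH",
        "offsets": [
            {
                "partition": {"database": database, "server": server},
                "offset": {
                    "change_lsn": change_lsn,
                    "commit_lsn": commit_lsn,
                    "event_serial_no": event_serial_no,
                    # None is serialized as JSON null.
                    "transaction_id": transaction_id,
                },
            }
        ],
    }


payload = patch_offsets_payload(
    "database_name", "server_name",
    change_lsn="0000010d:0000138f:0002",
    commit_lsn="00000116:000042CC:0004",
)
body = json.dumps(payload, indent=2)
print(body)
```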

Delete the offset

To delete the offset, make a POST request that specifies the environment, Kafka cluster, and connector name. Include a JSON payload that specifies the delete type.

 POST /connect/v1/environments/{environment_id}/clusters/{kafka_cluster_id}/connectors/{connector_name}/offsets/request
 Host: https://api.confluent.cloud

{
  "type": "DELETE"
}

Considerations:

A delete request removes the offset for the provided partition and resets the connector to its base state, as if you had created a new connector.
This is an asynchronous request. To check the status of this request, you must use the check offset status API. For more information, see Get the status of an offset request.
Do not issue delete and patch requests at the same time.
For source connectors, the connector attempts to read from the position defined in the base state.

Response:

Successful calls return HTTP 202 Accepted with a JSON payload that describes the result.

{
  "id": "lcc-example123",
  "name": "{connector_name}",
  "offsets": [],
  "requested_at": "2024-03-28T17:59:45.606796307Z",
  "type": "DELETE"
}

Responses include the following information:

Empty offsets.
The time of the request to delete the offset.
Information about Kafka cluster and connector.
The type of request.

Get the status of an offset request

To get the status of a previous offset request, make a GET request that specifies the environment, Kafka cluster, and connector name.

GET /connect/v1/environments/{environment_id}/clusters/{kafka_cluster_id}/connectors/{connector_name}/offsets/request/status
Host: https://api.confluent.cloud

Considerations:

The status endpoint always shows the status of the most recent PATCH/DELETE operation.

Response:

Successful calls return HTTP 200 with a JSON payload that describes the result. The following is an example of an applied patch.

{
   "request": {
      "id": "lcc-example123",
      "name": "{connector_name}",
      "offsets": [
        {
          "partition": {
            "database": "database_name",
            "server": "server_name"
          },
          "offset": {
            "change_lsn": "0000010d:0000138f:0002",
            "commit_lsn": "00000116:000042CC:0004",
            "event_serial_no": 1,
            "transaction_id": null
          }
        }
      ],
      "requested_at": "2024-03-28T17:58:45.606796307Z",
      "type": "PATCH"
   },
   "status": {
      "phase": "APPLIED",
      "message": "The Connect framework-managed offsets for this connector have been altered successfully. However, if this connector manages offsets externally, they will need to be manually altered in the system that the connector uses."
   },
   "previous_offsets": [
     {
       "partition": {
         "database": "database_name",
         "server": "server_name"
       },
       "offset": {
         "change_lsn": "0000010d:0000138f:0002",
         "commit_lsn": "00000116:000042CC:0004",
         "event_serial_no": 1,
         "transaction_id": null
       }
     }
   ],
   "applied_at": "2024-03-28T17:58:48.079141883Z"
}

Responses include the following information:

The original request, including the time it was made.
The status of the request: applied, pending, or failed.
The time you issued the status request.
The previous offsets. These are the offsets that the connector last updated prior to updating the offsets. Use these to try to restore the state of your connector if a patch update causes your connector to fail or to return a connector to its previous state after rolling back.
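The following sketch shows one way to interpret a status response and keep a rollback payload ready, using the previous offsets as described above. The function name is hypothetical. The phase value APPLIED appears in the example response; PENDING and FAILED are assumed spellings of the other two states.

```python
def summarize_offset_request(status_response):
    """Summarize a status response and prepare a rollback PATCH if possible."""
    phase = status_response["status"]["phase"]
    summary = {
        "phase": phase,
        # Assumed terminal phases; PENDING means the request is still in flight.
        "finished": phase in ("APPLIED", "FAILED"),
        "rollback": None,
    }
    previous = status_response.get("previous_offsets") or []
    if phase == "APPLIED" and previous:
        # Re-submitting the previous offsets as a PATCH restores the old position.
        summary["rollback"] = {"type": "PATCH", "offsets": previous}
    return summary


# Trimmed-down version of the example response above.
example = {
    "status": {"phase": "APPLIED", "message": "..."},
    "previous_offsets": [
        {
            "partition": {"database": "database_name", "server": "server_name"},
            "offset": {
                "change_lsn": "0000010d:0000138f:0002",
                "commit_lsn": "00000116:000042CC:0004",
                "event_serial_no": 1,
                "transaction_id": None,
            },
        }
    ],
}
print(summarize_offset_request(example)["phase"])
```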

JSON payload

The table below offers a description of the unique fields in the JSON payload for managing offsets of the Microsoft SQL Server Change Data Capture (CDC) Source V2 (Debezium) connector.

Field	Definition	Required/Optional
`database`	The Microsoft SQL Server database.	Required
`server`	The name of the Microsoft SQL Server server.	Required
`change_lsn`	`change_lsn` represents the Log Sequence Number (LSN) associated with the specific change event within the transaction.	Required
`commit_lsn`	`commit_lsn` represents the Log Sequence Number (LSN) associated with the commit of the transaction that included the change event.	Required
`event_serial_no`	For update events there can be duplicate values for `change_lsn` and `commit_lsn`. In such cases, `event_serial_no` differentiates the old value (`1`) and new value (`2`) of the event.	Required
`transaction_id`	When `provide.transaction.metadata` is set to `false` (the default), `transaction_id` is `null` and has no effect on streaming. When `provide.transaction.metadata` is set to `true`, the connector sets `transaction_id` to the same value as `commit_lsn` and uses it to drive `BEGIN` and `END` markers on the dedicated transaction-metadata topic.	Optional
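Before submitting an offset, it can help to check it against the table above. The following sketch assumes that LSN strings always use the 8:8:4 hexadecimal layout seen in the examples on this page; the helper name is hypothetical.

```python
import re

# Assumed layout, inferred from the examples: xxxxxxxx:xxxxxxxx:xxxx (hex).
LSN_PATTERN = re.compile(r"[0-9a-fA-F]{8}:[0-9a-fA-F]{8}:[0-9a-fA-F]{4}")
REQUIRED_OFFSET_FIELDS = ("change_lsn", "commit_lsn", "event_serial_no")


def offset_problems(offset):
    """Return a list of problems found in an offset dictionary."""
    problems = []
    for field in REQUIRED_OFFSET_FIELDS:
        if field not in offset:
            problems.append(f"missing required field: {field}")
    for field in ("change_lsn", "commit_lsn"):
        value = offset.get(field)
        # Only string values are checked; a null (None) LSN is left alone.
        if isinstance(value, str) and not LSN_PATTERN.fullmatch(value):
            problems.append(
                f"{field} is not in xxxxxxxx:xxxxxxxx:xxxx form: {value}"
            )
    return problems


good = {
    "change_lsn": "00000118:0000302b:0003",
    "commit_lsn": "00000118:0000302b:0004",
    "event_serial_no": 1,
    "transaction_id": None,
}
bad = {"commit_lsn": "0x00000118000030B30003", "event_serial_no": 0}
print(offset_problems(good))
print(offset_problems(bad))
```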

Rewind the connector to a desired LSN

To rewind or fast-forward the connector through a custom offset, you need a SQL Server LSN. If you do not already have one, you can obtain an LSN from your source database using SQL Server’s CDC system functions. For the full list and syntax variations across SQL Server versions, see the Microsoft SQL Server CDC functions documentation.

The following walkthrough uses a target time to derive the LSN, then rewinds the connector to that LSN.

Run the following query on the source database, with a login that has permission to read CDC system functions. It uses sys.fn_cdc_map_time_to_lsn to return the LSN at or before a given timestamp, and converts it to the colon-separated form that the connector offset payload expects.

DECLARE @target_time DATETIME = '2026-03-31 09:00:00';

DECLARE @lsn binary(10) = sys.fn_cdc_map_time_to_lsn(
    'largest less than or equal',
    @target_time);

SELECT
    @target_time AS target_time,
    @lsn         AS lsn_binary,
    LOWER(
        CONVERT(VARCHAR(8), SUBSTRING(@lsn, 1, 4), 2) + ':' +
        CONVERT(VARCHAR(8), SUBSTRING(@lsn, 5, 4), 2) + ':' +
        CONVERT(VARCHAR(4), SUBSTRING(@lsn, 9, 2), 2)
    ) AS commit_lsn_payload;

Example output:

target_time            lsn_binary               commit_lsn_payload
---------------------- ------------------------ ----------------------
2026-03-31 09:00:00    0x00000118000030B30003   00000118:000030b3:0003

The final SELECT splits the 10-byte LSN into three chunks (bytes 1–4, 5–8, and 9–10), formats each chunk as hex without the 0x prefix, and joins them with colons. To use a different CDC function as the source of the LSN, substitute its call for sys.fn_cdc_map_time_to_lsn in the DECLARE line, using the appropriate syntax for that function. The rest of the query produces the LSN in the format the connector expects.
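If you already have the binary LSN as a hex string (for example, copied from the lsn_binary column), the same conversion can be done outside the database. The following Python sketch mirrors the split performed by the query; the function name is hypothetical.

```python
def lsn_to_payload(lsn_hex):
    """Convert a 10-byte LSN in hex (with or without 0x) to the payload form."""
    digits = lsn_hex.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 20 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"expected 20 hex digits, got: {lsn_hex!r}")
    # Bytes 1-4, 5-8, and 9-10 correspond to 8, 8, and 4 hex digits.
    return f"{digits[0:8]}:{digits[8:16]}:{digits[16:20]}"


print(lsn_to_payload("0x00000118000030B30003"))
```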

Submit the LSN as part of a PATCH request to the offsets endpoint, using the Update the offset flow shown above. Set each field as follows:

Field	Value
`commit_lsn`	The LSN you want the connector to resume from (the `commit_lsn_payload` value from the previous step). The connector reads `commit_lsn` as the inclusive lower bound, so events whose commit LSN equals `commit_lsn` are re-emitted.
`change_lsn`	Set to `null` to replay every event of the transaction at `commit_lsn` from the beginning. Set to a specific value to resume strictly after the row identified by `(commit_lsn, change_lsn)`. Only events whose change LSN is strictly greater than `change_lsn` are emitted at that `commit_lsn`.
`event_serial_no`	Set to `0` to include every event at the `(commit_lsn, change_lsn)` position. `0` is the recommended value because it works for any LSN, regardless of whether the LSN lands on the position of an UPDATE event. An UPDATE event is the only case where multiple rows share both `commit_lsn` and `change_lsn`. Set `event_serial_no` to `2` to skip both halves of an UPDATE at the offset position. Setting `event_serial_no` to `1` is not allowed at the offset position of an UPDATE event and fails the connector task.
`transaction_id`	Set to `null`. `transaction_id` only affects the dedicated transaction-metadata topic that the connector writes when `provide.transaction.metadata` is set to `true`. The data path on the change topics does not depend on it. With `null`, the first event after the resume point opens a fresh `BEGIN` marker for the transaction at `commit_lsn`. Setting `transaction_id` to any non-null value (including the same value as `commit_lsn`) causes the connector to treat the transaction as already in progress and skip the `BEGIN` marker. The transaction’s `END` marker on the transaction-metadata topic may therefore appear without a matching `BEGIN`, or carry event counts that do not match the events on the data topic.

The following PATCH payload uses the example LSN from the query output above to replay every event of the transaction at 00000118:000030b3:0003 and continue streaming forward:

{
    "type": "PATCH",
    "offsets": [
      {
        "partition": {
          "database": "database_name",
          "server":   "server_name"
        },
        "offset": {
          "commit_lsn":      "00000118:000030b3:0003",
          "change_lsn":      null,
          "event_serial_no": 0,
          "transaction_id":  null
        }
      }
    ]
}

Note

If the CDC cleanup job has purged your chosen LSN from cdc.lsn_time_mapping, the connector advances to the earliest available LSN instead of failing.

Migrate connectors

Considerations:

The configurations of the self-managed connector must match the configurations of the fully-managed connector.
The self-managed connector must be operating in streaming mode. If the self-managed connector is still in the process of making a snapshot, you can either create a new connector on Confluent Cloud which starts the snapshot process from the beginning or wait for the snapshot process to complete and follow the migration guidance.

Create fully-managed connectors with offsets

Considerations:

The schema history topic of the self-managed connector must be reused when creating the fully-managed connector. To do this, specify the Database schema history topic name config value in the Confluent Cloud Console, or the schema.history.internal.kafka.topic config value using the Confluent CLI.
To reuse the schema history topic, the values of Topic prefix and Database names must be the same as those of the self-managed connector, and you must provide offsets for all of the databases when you create the connector.
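The considerations above can be turned into a simple pre-migration check. This is a sketch under stated assumptions: both configurations are flat dictionaries, the property names topic.prefix and database.names are the ones used elsewhere on this page, and the helper names and sample values are hypothetical.

```python
def split_names(value):
    """Parse a comma-separated list into a set of trimmed names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def migration_problems(self_managed, fully_managed, offsets):
    """Check the schema history reuse rules before creating the connector."""
    problems = []
    for key in ("topic.prefix", "schema.history.internal.kafka.topic"):
        if self_managed.get(key) != fully_managed.get(key):
            problems.append(f"{key} must match the self-managed connector")
    databases = split_names(fully_managed.get("database.names", ""))
    if databases != split_names(self_managed.get("database.names", "")):
        problems.append("database.names must match the self-managed connector")
    # Offsets must be provided for every database.
    covered = {entry["partition"]["database"] for entry in offsets}
    for name in sorted(databases - covered):
        problems.append(f"no offset provided for database {name}")
    return problems


# Hypothetical configurations, used only for illustration.
old = {
    "topic.prefix": "cdc",
    "database.names": "inventory,orders",
    "schema.history.internal.kafka.topic": "dbhistory.cdc",
}
new = dict(old)
offsets = [{"partition": {"database": "inventory", "server": "cdc"}, "offset": {}}]
print(migration_problems(old, new, offsets))
```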

Quick Start

Use this quick start to get up and running with the Confluent Cloud Microsoft SQL Server CDC Source V2 (Debezium) connector. The quick start provides the basics of selecting the connector and configuring it to obtain a snapshot of the existing data in a Microsoft SQL Server database and then monitoring and recording all subsequent row-level changes.

Prerequisites

Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud.
The Confluent CLI installed and configured for the cluster. See Install the Confluent CLI.
Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
SQL Server configured for change data capture (CDC).
- For Debezium instructions, see Setting up SQL Server.
- For Amazon RDS instructions, see Using change data capture.
Public access may be required for your database. See Manage Networking for Confluent Cloud Connectors for details. The example below shows the AWS Management Console when setting up a Microsoft SQL Server database.
Public access enabled
For networking considerations, see Networking and DNS. To use a set of public egress IP addresses, see Public Egress IP Addresses for Confluent Cloud Connectors. The following example shows the AWS Management Console when setting up security group rules for the VPC.
Open inbound traffic
Note
See your specific cloud platform documentation for how to configure security rules for your VPC.

Kafka cluster credentials. The following lists the different ways you can provide credentials.
- Enter an existing service account resource ID.
- Create a Confluent Cloud service account for the connector. Make sure to review the ACL entries required in the service account documentation. Some connectors have specific ACL requirements.
- Create a Confluent Cloud API key and secret. To create a key and secret, you can use confluent api-key create or you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector.

Using the Confluent Cloud Console

Step 1: Launch your Confluent Cloud cluster

To create and launch a Kafka cluster in Confluent Cloud, see Create a Kafka cluster in Confluent Cloud.

Step 2: Add a connector

In the left navigation menu, click Connectors. If you already have connectors in your cluster, click + Add connector.

Step 3: Select your connector

Click the Microsoft SQL Server CDC Source V2 connector card.

Step 4: Enter the connector details

Note

Make sure you have all your prerequisites completed.

At the Microsoft SQL Server CDC Source V2 (Debezium) screen, complete the following:

Kafka access

Select the way you want to provide Kafka Cluster credentials. You can choose one of the following options:
- My account: This setting allows your connector to globally access everything that you have access to. With a user account, the connector uses an API key and secret to access the Kafka cluster. This option is not recommended for production.
- Service account: This setting limits the access for your connector by using a service account. This option is recommended for production.
- Use an existing API key: This setting allows you to specify an API key and a secret pair. You can use an existing pair or create a new one. This method is not recommended for production environments.
Note
Freight clusters support only service accounts for Kafka authentication.
Click Continue.

Authentication

Configure the authentication properties:
Authentication method
- Authentication method: Select how you want to authenticate with the database. Valid options are Microsoft Entra ID application and Password.
- Use secret manager: Enable this setting to fetch sensitive configuration values, such as the Password, from a secret manager.
- Provider Integration: Select an existing integration that has access to your resource such as the secret manager.
Secret manager configuration
- Secret manager: Select the secret manager that Confluent Cloud should use to retrieve sensitive data.
- Configurations from Secret manager: Select the configurations whose values Confluent Cloud should fetch from the secret manager.
- Provider Integration: Select an existing integration that has access to your resource such as the secret manager.
How should we connect to your database?
- Database hostname: The IP address or hostname of the Microsoft SQL Server database server.
- Database port: The port number of the Microsoft SQL Server database server.
- Database username: The name of the Microsoft SQL Server user that has the required authorization.
- Database password: The password of the Microsoft SQL Server user that has the required authorization.
- Database names: A comma-separated list of Microsoft SQL Server database names from which to stream changes.
- Database encrypt: Controls SSL encryption for connections to the SQL Server database. The default value is false, which means the connector doesn’t force the server to support TLS encryption. When set to true, the connector requests TLS encryption from the server.
Click Continue.

Configuration

Output messages

Select output record value format: Select the output record value format (data going to the Kafka topic). Valid options are AVRO, JSON_SR, PROTOBUF, and JSON. Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON Schema, or Protobuf).
Output Kafka record key format: The output Kafka record key format. Valid options are AVRO, JSON_SR, PROTOBUF, STRING, and JSON. To use a schema-based message format (such as AVRO, JSON_SR, or PROTOBUF), you must have Confluent Cloud Schema Registry configured.

Connector config

Snapshot mode: The criteria for running a snapshot when the connector starts. Valid options are initial, initial_only, schema_only, no_data, and recovery.
- initial (default): Takes a snapshot of the structure and data of captured tables. Use this option to populate topics with a complete representation of the data from the captured tables.
- initial_only: Takes a snapshot of structure and data like initial, but does not transition into streaming changes once the snapshot has completed.
- schema_only: Deprecated, use no_data instead.
- no_data: Takes a snapshot of the structure of captured tables only. This is useful if only changes happening from now onwards should be propagated to topics.
- recovery: Recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database schema history topic. You might set it periodically to clean up a database schema history topic that has been growing unexpectedly. Database schema history topics require infinite retention.
Tables included: Enter a comma-separated list of fully-qualified table identifiers for the connector to monitor. By default, the connector monitors all non-system tables. A fully-qualified table name is in the form schemaName.tableName. This property cannot be used with the property Tables excluded.
Tables excluded: Enter a comma-separated list of fully-qualified table identifiers for the connector to ignore. A fully-qualified table name is in the form schemaName.tableName. This property cannot be used with the property Tables included.
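The rules for these settings can be checked before submitting a configuration. In the following sketch, the property names snapshot.mode, table.include.list, and table.exclude.list are assumed to be the configuration keys behind the Snapshot mode, Tables included, and Tables excluded fields; the helper name and sample values are hypothetical.

```python
SNAPSHOT_MODES = {"initial", "initial_only", "schema_only", "no_data", "recovery"}


def connector_config_problems(config):
    """Return a list of rule violations for the connector config settings."""
    problems = []
    # initial is the default snapshot mode.
    mode = config.get("snapshot.mode", "initial")
    if mode not in SNAPSHOT_MODES:
        problems.append(f"unknown snapshot.mode: {mode}")
    # The include and exclude lists are mutually exclusive.
    if config.get("table.include.list") and config.get("table.exclude.list"):
        problems.append(
            "table.include.list and table.exclude.list cannot be used together"
        )
    return problems


# Hypothetical configuration, used only for illustration.
sample_config = {
    "snapshot.mode": "no_data",
    "table.include.list": "dbo.customers,dbo.orders",
    "table.exclude.list": "dbo.audit_log",
}
print(connector_config_problems(sample_config))
```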

Data encryption

Enable Client-Side Field Level Encryption for data encryption. Specify a Service Account to access the Schema Registry and associated encryption rules or keys with that schema. For more information on CSFLE or CSPE setup, see Manage encryption for connectors.

Show advanced configurations

Schema context: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the Default context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a Source connector uses only that schema context to register a schema and a Sink connector uses only that schema context to read from. For more information about setting up a schema context, see What are schema contexts and when should you use them?.

Additional Configs

Value Converter Replace Null With Default: Specifies whether to replace null fields that have a default value with that default value. When set to true, the connector uses the default value; otherwise, it uses null. Applies to the JSON converter.
Value Converter Reference Subject Name Strategy: Sets the subject reference name strategy for values. Valid entries are DefaultReferenceSubjectNameStrategy or QualifiedReferenceSubjectNameStrategy. You can use this strategy only with PROTOBUF format; the default strategy is DefaultReferenceSubjectNameStrategy.
Value Converter Schemas Enable: Includes schema within each of the serialized values. Input messages must contain schema and payload fields and must not contain additional fields. For plain JSON data, set this to false. Applies to the JSON converter.
Errors Tolerance: Use this property to configure the connector’s error handling behavior.
Warning
Use this property with caution for sink connectors, as it can lead to data loss. If you set this property to all, the connector does not fail on errant records, but logs them (and sends to DLQ for sink connectors) and continues processing. If you set this property to none, the connector task fails on errant records.
Value Converter Ignore Default For Nullables: When set to true, this property ensures that the corresponding record in Kafka is null, instead of showing the default column value. Applies to the AVRO, PROTOBUF, and JSON_SR converters.
Value Converter Decimal Format: Specifies the JSON or JSON_SR serialization format for Connect DECIMAL logical type values with two allowed literals: BASE64 to serialize DECIMAL logical types as base64 encoded binary data, and NUMERIC to serialize DECIMAL logical type values in JSON or JSON_SR as a number representing the decimal value.
Key Converter Schema ID Serializer: The class name of the schema ID serializer for keys. This is used to serialize schema IDs in the message headers.
Value Converter Connect Meta Data: Enables the Connect converter to add its metadata to the output schema. Applies to Avro converters.
Value Converter Value Subject Name Strategy: Determines how to construct the subject name under which the value schema is registered with Schema Registry.
Key Converter Key Subject Name Strategy: Determines how to construct the subject name for key schema registration.
Value Converter Schema ID Serializer: The class name of the schema ID serializer for values. This is used to serialize schema IDs in the message headers.

Auto-restart policy

Enable Connector Auto-restart: Enables the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to true, enabling the connector to automatically restart in case of user-actionable errors. Set this property to false to disable auto-restart for failed connectors. If disabled, you must manually restart the connector.

Output messages

After-state only: Controls whether the generated Kafka record should contain only the state of the row after the event occurred. Defaults to false.
Tombstones on delete: Configure whether a tombstone event should be generated after a delete event. The default is true.

Database config

Signal data collection: Fully-qualified name of the data collection that is used to send signals to the connector. Use the following format to specify the fully-qualified collection name: databaseName.schemaName.tableName. These signals can be used to perform incremental snapshotting.

How should we name your topic(s)?

Database schema history topic name: The name of the topic for the database schema history. A new topic with the provided name is created if it doesn’t already exist. If the topic already exists, ensure that it has a single partition, infinite retention period and is not in use by any other connector. If no value is provided, the name defaults to dbhistory.<topic-prefix>.<lcc-id>.

Connector config

Specify application intent: Defines the connection property applicationIntent within the connection string. Possible settings are: ReadWrite and ReadOnly. Defaults to ReadWrite.
Snapshot Isolation mode: Controls which transaction isolation level is used and how long the connector locks tables that are designated for capture. Possible settings are: read_uncommitted, read_committed, repeatable_read, snapshot, and exclusive. The snapshot, read_committed, and read_uncommitted modes do not prevent other transactions from updating table rows during the initial snapshot. The exclusive and repeatable_read modes do prevent concurrent updates. The mode also affects data consistency. Only the exclusive and snapshot modes guarantee full consistency, that is, the initial snapshot and the streaming logs constitute a linear history. With the repeatable_read and read_committed modes, a newly added record might appear twice: once in the initial snapshot and once in the streaming phase. That consistency level is nonetheless usually sufficient for data mirroring. The read_uncommitted mode provides no data consistency guarantees at all (some data might be lost or corrupted).
Columns excluded: An optional, comma-separated list of regular expressions that match the fully-qualified names of columns to exclude from change event record values. Fully-qualified names for columns are of the form schemaName.tableName.columnName.
Event processing failure handling mode: Specifies how the connector should react to exceptions during processing of events. Possible settings are: fail, skip, and warn.
- fail (default): propagates the exception, indicates the offset of the problematic event, and causes the connector to stop.
- skip: skips the problematic event and continues processing.
- warn: logs the offset of the problematic event, skips that event, and continues processing.
Schema name adjustment mode: Specifies how schema names should be adjusted for compatibility with the message converter used by the connector. Possible settings are: none, avro, and avro_unicode.
- none (default): does not apply any adjustment.
- avro: replaces the characters that cannot be used in the Avro type name with underscore.
- avro_unicode: replaces the underscore or characters that cannot be used in the Avro type name with corresponding unicode like _uxxxx. Note: _ is an escape sequence like backslash in Java.
Field name adjustment mode: Specifies how field names should be adjusted for compatibility with the message converter used by the connector. Possible settings are: none, avro, and avro_unicode.
- none (default): does not apply any adjustment.
- avro: replaces the characters that cannot be used in the Avro type name with underscore.
- avro_unicode: replaces the underscore or characters that cannot be used in the Avro type name with corresponding unicode like _uxxxx. Note: _ is an escape sequence like backslash in Java.
Heartbeat interval (ms): Controls how frequently the connector sends heartbeat messages to a Kafka topic. With the default value of 0, the connector does not send heartbeat messages. Heartbeat messages are useful for monitoring whether the connector is receiving change events from the database. Heartbeat messages might help decrease the number of change events that need to be re-sent when a connector restarts. To send heartbeat messages, set this property to a positive integer, which indicates the number of milliseconds between heartbeat messages.
Heartbeat action query: If specified, the connector executes this query on every heartbeat against the source database. The query must be a valid SQL DML statement, typically an INSERT or UPDATE, that targets a dedicated heartbeat table.
This configuration helps in scenarios where the connector is capturing changes only from low-traffic database tables. Extended periods of inactivity in these tables can prevent the connector from advancing the LSN position in its stored offsets. To address this:
- Create a heartbeat table in the database.
- Enable CDC on the heartbeat table.
- Set this property to a DML statement that periodically updates the table by either inserting a new row or repeatedly updating the same row.
- Include the heartbeat table in the connector’s capture set using the schema or table include configuration properties.
This ensures the connector receives changes from the heartbeat table and can continue advancing the LSN in its stored offsets. The heartbeat query executes at regular intervals, as specified by the heartbeat.interval.ms configuration property.
Note
- To uphold the principle of least privilege, grant the connector user write permissions exclusively to essential tables, such as those used for heartbeats. In line with data minimization, ensure the connector configuration is free of PII or sensitive data, as this information is not needed for system-level functions like heartbeat queries.
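The heartbeat setup described above might translate into a configuration fragment like the one built below. This is a minimal sketch: the table dbo.connector_heartbeat and its columns are hypothetical, and the property key heartbeat.action.query is assumed from Debezium's property naming, so confirm the exact key in Configuration Properties before using it.

```python
import json

# Hypothetical heartbeat table. It must exist, have CDC enabled, and be part
# of the connector's capture set.
HEARTBEAT_TABLE = "dbo.connector_heartbeat"


def heartbeat_settings(interval_ms: int) -> dict:
    """Return heartbeat-related settings as strings, like other connector properties."""
    if interval_ms <= 0:
        raise ValueError("the heartbeat interval must be a positive integer")
    return {
        "heartbeat.interval.ms": str(interval_ms),
        # Assumed property key. The statement repeatedly updates the same row.
        "heartbeat.action.query": (
            f"UPDATE {HEARTBEAT_TABLE} SET last_beat = SYSUTCDATETIME() WHERE id = 1"
        ),
    }


print(json.dumps(heartbeat_settings(60000), indent=2))
```

Updating a single row, rather than inserting new rows, keeps the heartbeat table from growing over time.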
Skip unparseable DDL: A Boolean value that specifies whether the connector should ignore malformed or unknown database statements (true), or stop processing so a human can fix the issue (false). Defaults to false. Consider setting this to true to ignore unparseable statements.
Store only captured tables DDL: A Boolean value that specifies whether the connector records schema structures from all tables in a schema or database, or only from tables that are designated for capture.
- false (default): During a database snapshot, the connector records the schema data for all non-system tables in the database, including tables that are not designated for capture. It’s best to retain the default setting. If you later decide to capture changes from tables that you did not originally designate for capture, the connector can easily begin to capture data from those tables, because their schema structure is already stored in the schema history topic.
- true: During a database snapshot, the connector records the table schemas only for the tables from which Debezium captures change events. If you change the default value, and you later configure the connector to capture data from other tables in the database, the connector lacks the schema information that it requires to capture change events from the tables.
Snapshot select statement overrides data map: A JSON object that maps fully-qualified table identifiers (schemaName.tableName) to custom SELECT statements. The connector uses these statements during snapshots instead of the default SELECT * statement. Use this property for large append-only tables to resume a snapshot from a specific point if a previous attempt was interrupted. These values are sensitive and are masked in the configuration.
Note
Keys may use either schemaName.tableName or the fully-qualified dbName.schemaName.tableName form. When the same table name exists across multiple databases, use the fully-qualified form to avoid ambiguity. Escape double quotes (") in table or schema names. Use a single backslash (\") in the UI and three backslashes (\\\") in the CLI.
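As an illustration of the JSON object this setting expects, the sketch below builds one with Python's standard json module. The table names and predicates are hypothetical; only the key forms (schemaName.tableName and dbName.schemaName.tableName) come from the description above.

```python
import json

# Hypothetical tables and predicates, for illustration only.
overrides = {
    "dbo.orders": "SELECT * FROM [dbo].[orders] WHERE order_id > 100000",
    # Fully-qualified key, for a table name that exists in several databases.
    "testDb.dbo.audit_log": "SELECT * FROM [dbo].[audit_log] WHERE created_at >= '2024-01-01'",
}

# json.dumps emits a single JSON object and escapes any double quote inside a
# key or statement with one backslash.
value = json.dumps(overrides)
print(value)
```

Generating the value this way avoids hand-escaping mistakes. When the value is passed through a shell or embedded in another JSON document, an additional layer of escaping may still be needed.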

Schema Config

Key converter reference subject name strategy: Set the subject reference name strategy for key. Valid entries are DefaultReferenceSubjectNameStrategy or QualifiedReferenceSubjectNameStrategy. Note that the subject reference name strategy can be selected only for PROTOBUF format with the default strategy being DefaultReferenceSubjectNameStrategy.

How should we handle data types?

Decimal handling mode: Specifies how the connector should handle values for DECIMAL and NUMERIC columns. Possible settings are: precise, double, and string.
- precise (default): represents values precisely by using java.math.BigDecimal values, which are encoded in binary form in change events.
- double: represents values by using double values, which might result in a loss of precision but which is easier to use.
- string: encodes values as formatted strings, which are easy to consume but semantic information about the real type is lost.
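With the precise mode and a JSON output format that serializes DECIMAL values as BASE64, a consumer receives the value as base64 text and has to decode it. The following sketch shows one way to do that. It assumes the standard Kafka Connect Decimal encoding (the unscaled integer as big-endian two's-complement bytes, with the scale carried in the schema) and that the consumer already knows the column's scale; the function name and sample value are illustrative.

```python
import base64
from decimal import Decimal


def decode_connect_decimal(encoded: str, scale: int) -> Decimal:
    """Decode a base64-encoded Connect Decimal into a Python Decimal."""
    raw = base64.b64decode(encoded)
    # The bytes hold the unscaled value as a signed, big-endian integer.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


print(decode_connect_decimal("MDk=", scale=2))  # 123.45
```

Consumers that cannot decode binary values may find the string or double modes simpler, at the cost described in the list above.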
Time precision mode: Time, date, and timestamps can be represented with different kinds of precisions:
- adaptive (default): captures the time and timestamp values exactly as in the database using either millisecond, microsecond, or nanosecond precision values based on the database column’s type.
- connect: always represents time and timestamp values by using Kafka Connect’s built-in representations for Time, Date, and Timestamp, which use millisecond precision regardless of the database columns’ precision.

Transforms

Single Message Transformations: To add a new SMT, see Add transforms. For more information about unsupported SMTs, see Unsupported transformations.

Processing position

Set offsets: Click Set offsets to define a specific offset for this connector to begin processing data from. For more information on managing offsets, see Manage offsets.

For additional information about the Debezium SMTs ExtractNewRecordState and EventRouter (Debezium), see Debezium transformations.

For all property values and definitions, see Configuration Properties.

Click Continue.

Sizing

Based on the number of topic partitions you select, you will be provided with a recommended number of tasks.

To change the number of tasks, use the Range Slider to select the desired number of tasks.
Click Continue.

Review and Launch

Verify the connection details by previewing the running configuration.
After you’ve validated that the properties are configured to your satisfaction, click Launch.
The status for the connector should go from Provisioning to Running.

Step 5: Check the Kafka topic

After the connector is running, verify that messages are populating your Kafka topic.

Note

A topic named dbhistory.<topic.prefix>.<connect-id> is automatically created for schema.history.internal.kafka.topic with one partition.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Connect Usage Examples section.

Using the Confluent CLI

Complete the following steps to set up and run the connector using the Confluent CLI.

Note

Make sure you have all your prerequisites completed.

Step 1: List the available connectors

Enter the following command to list available connectors:

confluent connect plugin list

Step 2: List the connector configuration properties

Enter the following command to show the connector configuration properties:

confluent connect plugin describe <connector-plugin-name>

The command output shows the required and optional configuration properties.

Step 3: Create the connector configuration file

Create a JSON file that contains the connector configuration properties. The following example shows the required connector properties.

{
  "connector.class": "SqlServerCdcSourceV2",
  "name": "SqlServerCdcSourceV2Connector_0",
  "kafka.auth.mode": "KAFKA_API_KEY",
  "kafka.api.key": "****************",
  "kafka.api.secret": "****************************************************************",
  "database.hostname": "connect-sqlserver-cdc.<host-id>.us-west-2.rds.amazonaws.com",
  "database.port": "1433",
  "database.user": "admin",
  "database.password": "************",
  "database.names": "testDb",
  "topic.prefix": "sql",
  "table.include.list":"dbo.passengers",
  "output.data.format": "JSON",
  "tasks.max": "1"
}
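If you prefer not to type credentials into the file by hand, the example above can be generated by a short script. The sketch below is one possible approach, not a required workflow: the environment variable names are hypothetical, and the property keys and fixed values simply mirror the example.

```python
import json


def build_config(env: dict) -> dict:
    """Mirror the example configuration, reading secrets from a mapping such as os.environ."""
    return {
        "connector.class": "SqlServerCdcSourceV2",
        "name": "SqlServerCdcSourceV2Connector_0",
        "kafka.auth.mode": "KAFKA_API_KEY",
        "kafka.api.key": env["KAFKA_API_KEY"],            # hypothetical variable name
        "kafka.api.secret": env["KAFKA_API_SECRET"],      # hypothetical variable name
        "database.hostname": env["SQLSERVER_HOST"],       # hypothetical variable name
        "database.port": "1433",
        "database.user": env["SQLSERVER_USER"],           # hypothetical variable name
        "database.password": env["SQLSERVER_PASSWORD"],   # hypothetical variable name
        "database.names": "testDb",
        "topic.prefix": "sql",
        "table.include.list": "dbo.passengers",
        "output.data.format": "JSON",
        "tasks.max": "1",
    }


def write_config(config: dict, path: str) -> None:
    """Write the configuration as JSON, refusing to write empty values."""
    empty = [key for key, value in config.items() if not value]
    if empty:
        raise ValueError(f"empty values for: {', '.join(empty)}")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

For example, calling write_config(build_config(os.environ), "microsoft-sql-cdc-source-v2.json") after importing os produces a file that can be passed to the CLI with --config-file.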

Note the following property definitions:

"connector.class": Identifies the connector plugin name.
"name": Sets a name for your new connector.

"kafka.auth.mode": Identifies the connector authentication mode you want to use. There are two options: SERVICE_ACCOUNT or KAFKA_API_KEY (the default). To use an API key and secret, specify the configuration properties kafka.api.key and kafka.api.secret, as shown in the example configuration (above). To use a service account, specify the Resource ID in the property kafka.service.account.id=<service-account-resource-ID>. To list the available service account resource IDs, use the following command:
```
confluent iam service-account list
```
For example:
```
confluent iam service-account list

   Id     | Resource ID |       Name        |    Description
+---------+-------------+-------------------+-------------------
   123456 | sa-l1r23m   | sa-1              | Service account 1
   789101 | sa-l4d56p   | sa-2              | Service account 2
```

"database.hostname": IP address or hostname of the Microsoft SQL Server database server.
"database.port": Port number of the Microsoft SQL Server database server.
"database.user": The name of the Microsoft SQL Server user that has the required authorization.
"database.password": Password of the Microsoft SQL Server user that has the required authorization.
"database.names": The names of the Microsoft SQL Server databases from which to stream the changes.
"topic.prefix": Provides a namespace for the particular database server/cluster that the connector is capturing changes from.
"table.include.list": An optional, comma-separated list of fully-qualified table identifiers for the connector to monitor. By default, the connector monitors all non-system tables. A fully-qualified table name is in the form schemaName.tableName. This property cannot be used with the property table.exclude.list.
"output.data.format": Sets the output Kafka record value format (data coming from the connector). Valid entries are AVRO, JSON_SR, PROTOBUF, or JSON. You must have Confluent Cloud Schema Registry configured if using a schema-based record format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
"tasks.max": Enter the number of tasks in use by the connector.

Note

To enable CSFLE or CSPE for data encryption, specify the following properties:

csfle.enabled: Flag to indicate whether the connector honors CSFLE or CSPE rules.
sr.service.account.id: A Service Account to access the Schema Registry and associated encryption rules or keys with that schema.

For more information on CSFLE or CSPE setup, see Manage encryption for connectors.

SMTs: For details about adding SMTs using the Confluent CLI, see the Single Message Transformations documentation. For additional information about the Debezium SMTs ExtractNewRecordState and EventRouter (Debezium), see Debezium transformations.

See Configuration Properties for all property values and definitions.

Step 4: Load the properties file and create the connector

Enter the following command to load the configuration and start the connector:

confluent connect cluster create --config-file <file-name>.json

For example:

confluent connect cluster create --config-file microsoft-sql-cdc-source-v2.json

Example output:

Created connector SqlServerCdcSourceV2Connector_0 lcc-ix4dl

Step 5: Check the connector status

Enter the following command to check the connector status:

confluent connect cluster list

Example output:

ID          |            Name                  | Status  |  Type
+-----------+----------------------------------+---------+-------+
lcc-ix4dl   | SqlServerCdcSourceV2Connector_0  | RUNNING | source
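For scripted checks, the listing can be turned into records. The sketch below parses text shaped like the example output above; it assumes the pipe-delimited layout shown here, which may differ between CLI versions, so treat it as illustrative rather than a stable interface.

```python
def parse_connector_table(output: str) -> list[dict]:
    """Parse pipe-delimited connector listing text into one dict per connector."""
    header = None
    rows = []
    for line in output.splitlines():
        # Skip blank lines and the +----+ separator line.
        if "|" not in line or line.lstrip().startswith("+"):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


sample = """\
ID          |            Name                  | Status  |  Type
+-----------+----------------------------------+---------+-------+
lcc-ix4dl   | SqlServerCdcSourceV2Connector_0  | RUNNING | source
"""

connectors = parse_connector_table(sample)
not_running = [c["Name"] for c in connectors if c["Status"] != "RUNNING"]
print(connectors)
print(not_running)
```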

Step 6: Check the Kafka topic

After the connector is running, verify that messages are populating your Kafka topic.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Connect Usage Examples section.

Note

A topic named dbhistory.<topic.prefix>.<connect-id> is automatically created for schema.history.internal.kafka.topic with one partition.

Moving from V1 to V2

Version 2 of this connector supports new features and has breaking changes that are not backward compatible with version 1 of the connector. To understand these changes and to plan for moving to version 2, see Backward Incompatible Changes in Debezium CDC V2 Connectors.

Given the backward-incompatible changes between version 1 and 2 of the CDC connectors, version 2 is being provided in a new set of CDC connectors on Confluent Cloud. You can provision either version 1 or version 2. However, note that eventually version 1 will be deprecated and no longer supported.

Before exploring your options for moving from version 1 to 2, be sure to make the required changes documented in Backward Incompatible Changes in Debezium CDC V2 Connectors. To get the offset in the following section, use the Confluent Cloud APIs. For more information, see Connect offsets API reference, Manage custom offsets, and Manage Offsets for Fully-Managed Connectors in Confluent Cloud.

To move from version 1 to 2 (v1 to v2)

Use the following steps to migrate to version 2. Implement and validate any connector changes in a pre-production environment before promoting to production.

Pause the v1 connector.
Get the offset for the v1 connector.

Create the v2 connector using the offset from the previous step.

confluent connect cluster create [flags]

For example:

Create a configuration file with connector configs and offsets.

{
  "name": "(connector-name)",
  "config": {
      ... // connector specific configuration
  },
  "offsets": [
      {
          "partition": {
      ... // connector specific configuration
          },
          "offset": {
      ... // connector specific configuration
          }
      }
  ]
}

Create a connector in the current or specified Kafka cluster context.

confluent connect cluster create --config-file config.json

For connectors that maintain a schema history topic, you must configure the schema history topic name in v2 to match the schema history topic name from the v1 connector.

Delete the v1 connector.

For more information, see Manage Offsets for Fully-Managed Connectors in Confluent Cloud.
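The configuration file used in the migration steps above can be assembled programmatically once you have the v1 offsets. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the offsets list is passed through exactly as returned for the v1 connector (its partition and offset contents are connector specific and are not interpreted here), and the v1 schema history topic name is supplied by the caller.

```python
import json


def build_v2_request(name: str, v2_config: dict, v1_offsets: list,
                     v1_history_topic: str) -> dict:
    """Assemble the name/config/offsets document for the v2 connector."""
    if not v1_offsets:
        raise ValueError("no offsets supplied; get the v1 connector offsets first")
    for entry in v1_offsets:
        if "partition" not in entry or "offset" not in entry:
            raise ValueError("each offsets entry needs 'partition' and 'offset'")
    config = dict(v2_config)
    # Reuse the v1 schema history topic so that the v2 topic name matches it.
    config["schema.history.internal.kafka.topic"] = v1_history_topic
    return {"name": name, "config": config, "offsets": v1_offsets}


def write_request(request: dict, path: str = "config.json") -> None:
    """Write the document to the file passed to --config-file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(request, handle, indent=2)
```

Copying the offsets verbatim, rather than rebuilding them by hand, reduces the risk of the v2 connector starting from the wrong position.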

Configuration Properties

Use the following configuration properties with the fully-managed connector. For self-managed connector property definitions and other details, see the connector docs in Self-managed connectors for Confluent Platform.

How should we connect to your data?

name

Sets a name for your connector.

Type: string
Valid Values: A string at most 64 characters long
Importance: high

Kafka Cluster credentials

kafka.auth.mode

Kafka Authentication mode. It can be one of KAFKA_API_KEY or SERVICE_ACCOUNT. It defaults to KAFKA_API_KEY mode, whenever possible.

Type: string
Valid Values: SERVICE_ACCOUNT, KAFKA_API_KEY
Importance: high

kafka.api.key

Kafka API Key. Required when kafka.auth.mode==KAFKA_API_KEY.

Type: password
Importance: high

kafka.service.account.id

The Service Account that will be used to generate the API keys to communicate with Kafka Cluster.

Type: string
Importance: high

kafka.api.secret

Secret associated with Kafka API key. Required when kafka.auth.mode==KAFKA_API_KEY.

Type: password
Importance: high

Authentication method

authentication.method

Select how you want to authenticate with the database. Valid options are Microsoft Entra ID application and Password.

Type: string
Default: Password
Valid Values: Microsoft Entra ID application, Password
Importance: high

provider.integration.id

Select an existing integration that has access to your resource.

Type: string
Importance: high

secret.manager.enabled

Fetch sensitive configuration values from a secret manager.

Type: boolean
Default: false
Importance: high

Secret manager configuration

secret.manager

Select the secret manager to use for retrieving sensitive data.

Type: string
Importance: high

secret.manager.managed.configs

Select the configurations to fetch their values from the secret manager.

Type: list
Importance: high

secret.manager.provider.integration.id

Select an existing provider integration that has access to your secret manager.

Type: string
Importance: high

How should we connect to your database?

database.hostname

IP address or hostname of the SQL Server database server.

Type: string
Valid Values: Must match the regex ^[A-Za-z0-9_.-]+$
Importance: high

database.port

Port number of the SQL Server database server.

Type: int
Valid Values: [0,…,65535]
Importance: high

database.user

The name of the SQL Server database user that has the required authorization.

Type: string
Importance: high

database.password

The password for the SQL Server database user that has the required authorization.

Type: password
Importance: high

database.names

The comma-separated list of the SQL Server database names from which to stream the changes.

Type: list
Valid Values: Every delimited value must match the regular expression pattern ^[A-Za-z0-9_.-]+$
Importance: high
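The sketch below checks a list of database names against the pattern stated for database.names and derives a matching tasks.max value, following the guidance that each task handles one database. The function name and sample database names are hypothetical.

```python
import re

# Pattern stated under Valid Values for database.names.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]+")


def database_settings(names: list[str]) -> dict:
    """Return database.names and a tasks.max equal to the number of databases."""
    if not names:
        raise ValueError("at least one database name is required")
    invalid = [name for name in names if not NAME_PATTERN.fullmatch(name)]
    if invalid:
        raise ValueError(f"invalid database names: {invalid}")
    return {
        "database.names": ",".join(names),
        # One task per database; more tasks add no benefit.
        "tasks.max": str(len(names)),
    }


print(database_settings(["testDb", "inventory"]))
```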

database.encrypt

Controls SSL encryption for connections to a SQL Server database. The default value is false, indicating that the connector won’t force the server to support TLS encryption. When set to true, the connector requests to use TLS encryption with the server.

Type: boolean
Default: false
Importance: high

Output messages

output.data.format

Sets the output Kafka record value format. Valid entries are AVRO, JSON_SR, PROTOBUF, or JSON. Note that you need to have Confluent Cloud Schema Registry configured if using a schema-based message format like AVRO, JSON_SR, or PROTOBUF.

Type: string
Default: JSON
Importance: high

output.key.format

Sets the output Kafka record key format. Valid entries are AVRO, JSON_SR, PROTOBUF, STRING, or JSON. Note that you need to have Confluent Cloud Schema Registry configured if using a schema-based message format like AVRO, JSON_SR, or PROTOBUF.

Type: string
Default: JSON
Valid Values: AVRO, JSON, JSON_SR, PROTOBUF, STRING
Importance: high

after.state.only

Controls whether the generated Kafka record should contain only the state of the row after the event occurred.

Type: boolean
Default: false
Importance: low

tombstones.on.delete

Controls whether a delete event is followed by a tombstone event. The following values are possible:

true: For each delete operation, the connector emits a delete event and a subsequent tombstone event.

false: For each delete operation, the connector emits only a delete event.

After a source record is deleted, a tombstone event (the default behavior) enables Kafka to completely delete all events that share the key of the deleted row in topics that have log compaction enabled.

Type: boolean
Default: true
Importance: medium

How should we name your topic(s)?

topic.prefix

Topic prefix that provides a namespace (logical server name) for the particular SQL Server database server or cluster in which Debezium is capturing changes. The prefix should be unique across all other connectors, since it is used as a topic name prefix for all Kafka topics that receive records from this connector. Use only alphanumeric characters, hyphens, dots, and underscores. The connector automatically creates Kafka topics using the naming convention: <topic.prefix>.<databaseName>.<schemaName>.<tableName>.

Type: string
Importance: high
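The naming convention for topic.prefix can be expressed as a small helper, which is handy when pre-creating topics or configuring consumers. This is an illustrative sketch; the function name is hypothetical and the sample values are taken from the example configuration earlier in this document.

```python
import re

# Characters allowed in the prefix: alphanumerics, hyphens, dots, underscores.
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def change_topic(prefix: str, database: str, schema: str, table: str) -> str:
    """Build <topic.prefix>.<databaseName>.<schemaName>.<tableName>."""
    if not PREFIX_PATTERN.fullmatch(prefix):
        raise ValueError(f"invalid topic prefix: {prefix!r}")
    return f"{prefix}.{database}.{schema}.{table}"


print(change_topic("sql", "testDb", "dbo", "passengers"))  # sql.testDb.dbo.passengers
```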

schema.history.internal.kafka.topic

The name of the topic for the database schema history. A new topic with the provided name is created if it doesn’t already exist. If the topic already exists, ensure that it has a single partition, infinite retention period and is not in use by any other connector. If no value is provided, the name defaults to dbhistory.<topic-prefix>.<lcc-id>.

Type: string
Default: dbhistory.${topic.prefix}.{{.logicalClusterId}}
Importance: high

How should we configure the topic(s)?

topic.creation.topic_prefix_match.partitions

Number of partitions for Kafka topics auto-created by the connector for topics whose names start with the configured topic.prefix followed by a period. Kafka preserves message ordering only within a partition. The connector keys each change event by the changed record’s primary key, and Kafka routes all events with the same key to the same partition, so events sharing a key stay in order, while events with different keys may be spread across partitions and lose their relative order. Tables that have no primary key produce unkeyed events, which Kafka distributes across all partitions, so any value above 1 removes ordering guarantees for those tables entirely. Keep this at 1 (the default) if you need all change events in a topic delivered in strict order.

Type: int
Default: 1
Valid Values: [1,…]
Importance: high
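The effect of keys on ordering for topic.creation.topic_prefix_match.partitions can be seen in a small simulation. The sketch below is only a model: it uses CRC32 as a stand-in hash, whereas Kafka's default partitioner uses a different hash function, so the actual partition numbers will differ. What carries over is that equal keys always map to the same partition.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition with a stand-in hash (not Kafka's own)."""
    return zlib.crc32(key) % num_partitions


# Hypothetical change events: (serialized primary key, operation).
events = [
    (b'{"id":1}', "insert"),
    (b'{"id":2}', "insert"),
    (b'{"id":1}', "update"),
]

for key, operation in events:
    print(key.decode(), operation, "-> partition", partition_for(key, 3))
```

Both events for id 1 land on the same partition and therefore keep their relative order, while the event for id 2 may land elsewhere. With a single partition, every event maps to partition 0 and the whole topic stays ordered.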

topic.creation.topic_prefix_match.cleanup.policy

Cleanup policy applied to Kafka topics auto-created by the connector for topics whose names start with the configured topic.prefix followed by a period. The compact policy retains only the latest value per key (typical for change-data-capture topics); the delete policy ages records out based on retention settings.

Type: string
Default: delete
Valid Values: compact, compact,delete, delete
Importance: high

Storage

topic.creation.topic_prefix_match.retention.ms

Time-based retention, in milliseconds, applied to Kafka topics auto-created by the connector for topics whose names start with the configured topic.prefix followed by a period. Use -1 for infinite retention.

Type: long
Default: 604800000 (7 days)
Valid Values: [-1,…]
Importance: high
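Because topic.creation.topic_prefix_match.retention.ms is expressed in milliseconds, it can help to compute the value from a duration rather than type it by hand. A minimal sketch, with a hypothetical helper name:

```python
from datetime import timedelta


def retention_ms(period: timedelta) -> int:
    """Convert a duration to whole milliseconds for a retention setting."""
    return period // timedelta(milliseconds=1)


print(retention_ms(timedelta(days=7)))  # 604800000
```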

topic.creation.topic_prefix_match.retention.bytes

Size-based retention, in bytes, applied to Kafka topics auto-created by the connector for topics whose names start with the configured topic.prefix followed by a period. Use -1 for unlimited size.

Type: long
Default: -1
Valid Values: [-1,…]
Importance: high

Database config

signal.data.collection

Fully-qualified name of the data collection that needs to be used to send signals to the connector. Use the following format to specify the fully-qualified collection name: databaseName.schemaName.tableName.

Type: string
Importance: medium

Connector config

snapshot.mode

Specifies the criteria for running a snapshot when the connector starts. Possible settings are: initial, initial_only, schema_only (deprecated), no_data, and recovery.

initial - Takes a snapshot of structure and data of captured tables; useful if topics should be populated with a complete representation of the data from the captured tables.

initial_only - Takes a snapshot of structure and data like initial, but does not transition into streaming changes once the snapshot has completed.

schema_only - Deprecated, use no_data instead.

no_data - Takes a snapshot of the structure of captured tables only; useful if only changes happening from now onwards should be propagated to topics.

recovery - A recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database schema history topic. You might set it periodically to “clean up” a database schema history topic that has been growing unexpectedly. Database schema history topics require infinite retention.

Type: string
Default: initial
Valid Values: initial, initial_only, no_data, recovery, schema_only
Importance: medium

driver.applicationIntent

Defines the connection property applicationIntent within the connection string. Possible settings are: ReadWrite and ReadOnly.

Type: string
Default: ReadWrite
Valid Values: ReadOnly, ReadWrite
Importance: medium

snapshot.isolation.mode

Controls which transaction isolation level is used and how long the connector locks tables that are designated for capture. Possible settings are: read_uncommitted, read_committed, repeatable_read, snapshot, and exclusive. The snapshot, read_committed, and read_uncommitted modes do not prevent other transactions from updating table rows during the initial snapshot. The exclusive and repeatable_read modes do prevent concurrent updates. The mode also affects data consistency. Only the exclusive and snapshot modes guarantee full consistency, that is, the initial snapshot and the streaming logs constitute a linear history. With the repeatable_read and read_committed modes, a newly added record might appear twice: once in the initial snapshot and once in the streaming phase. That consistency level is nonetheless usually sufficient for data mirroring. The read_uncommitted mode provides no data consistency guarantees at all (some data might be lost or corrupted).

Type: string
Default: repeatable_read
Valid Values: exclusive, read_committed, read_uncommitted, repeatable_read, snapshot
Importance: low
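The trade-offs described for snapshot.isolation.mode can be summarized as a lookup table. The sketch below only restates that description in code form; the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class IsolationTraits(NamedTuple):
    blocks_concurrent_updates: bool  # other transactions cannot update rows during the snapshot
    full_consistency: bool           # snapshot and streaming form a linear history


ISOLATION_MODES = {
    "read_uncommitted": IsolationTraits(False, False),
    "read_committed": IsolationTraits(False, False),
    "repeatable_read": IsolationTraits(True, False),
    "snapshot": IsolationTraits(False, True),
    "exclusive": IsolationTraits(True, True),
}


def modes_with(blocks_updates: Optional[bool] = None,
               full_consistency: Optional[bool] = None) -> list[str]:
    """List the modes matching the requested traits; None means no preference."""
    return [
        mode for mode, traits in ISOLATION_MODES.items()
        if (blocks_updates is None or traits.blocks_concurrent_updates == blocks_updates)
        and (full_consistency is None or traits.full_consistency == full_consistency)
    ]


# Full consistency without blocking writers during the initial snapshot.
print(modes_with(blocks_updates=False, full_consistency=True))  # ['snapshot']
```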

table.include.list

An optional, comma-separated list of regular expressions that match fully-qualified table identifiers for tables whose changes you want to capture. When this property is set, the connector captures changes only from the specified tables. Each identifier is of the form schemaName.tableName. By default, the connector captures changes in every non-system table in each schema whose changes are being captured.

To match the name of a table, Debezium applies the regular expression that you specify as an anchored regular expression. That is, the specified expression is matched against the entire identifier for the table; it does not match substrings that might be present in a table name.

If you include this property in the configuration, do not also set the table.exclude.list property.

Type: list
Importance: medium
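The anchored matching described for table.include.list can be illustrated as follows. This sketch uses Python's re.fullmatch as a stand-in for the anchored matching that Debezium applies; the connector itself evaluates Java regular expressions, whose syntax differs from Python's in some cases. The table names are hypothetical.

```python
import re


def captured(tables: list[str], include_patterns: list[str]) -> list[str]:
    """Return the tables whose entire identifier matches at least one pattern."""
    compiled = [re.compile(pattern) for pattern in include_patterns]
    return [
        table for table in tables
        if any(pattern.fullmatch(table) for pattern in compiled)
    ]


tables = ["dbo.passengers", "dbo.passengers_archive", "sales.orders"]

# The expression must match the whole identifier, so the archive table is excluded.
print(captured(tables, ["dbo.passengers"]))      # ['dbo.passengers']

# A trailing .* is needed to also match longer names.
print(captured(tables, [r"dbo\.passengers.*"]))  # ['dbo.passengers', 'dbo.passengers_archive']
```

Note that an unescaped dot matches any character, so escaping it, as in the second pattern, makes the expression stricter.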

table.exclude.list

An optional, comma-separated list of regular expressions that match fully-qualified table identifiers for tables whose changes you do not want to capture. Each identifier is of the form schemaName.tableName. When this property is set, the connector captures changes from every table that you do not specify.

To match the name of a table, Debezium applies the regular expression that you specify as an anchored regular expression. That is, the specified expression is matched against the entire identifier for the table; it does not match substrings that might be present in a table name.

If you include this property in the configuration, do not set the table.include.list property.

Type: list
Importance: medium

column.exclude.list

An optional, comma-separated list of regular expressions that match the fully-qualified names of columns to exclude from change event record values. Fully-qualified names for columns are of the form schemaName.tableName.columnName.

To match the name of a column, Debezium applies the regular expression that you specify as an anchored regular expression. That is, the specified expression is matched against the entire name string of the column; it does not match substrings that might be present in a column name.

Type: list
Importance: medium

event.processing.failure.handling.mode

Specifies how the connector should react to exceptions during processing of events. Possible settings are fail, skip, and warn.

fail: Propagates the exception, indicates the offset of the problematic event, and causes the connector to stop.

warn: Logs the offset of the problematic event, skips that event, and continues processing.

skip: Skips the problematic event and continues processing.

Type: string
Default: fail
Valid Values: fail, skip, warn
Importance: low
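
The difference between the three modes can be sketched as follows. This is an illustrative model only, not the connector's code: `process_events` and the `(offset, event)` pairs are hypothetical, and Python's `logging` module stands in for the connector log.

```python
import logging

def process_events(events, handler, mode="fail"):
    """Apply handler to (offset, event) pairs using fail/warn/skip semantics."""
    if mode not in ("fail", "warn", "skip"):
        raise ValueError(f"unknown mode: {mode}")
    processed = []
    for offset, event in events:
        try:
            processed.append(handler(event))
        except Exception as exc:
            if mode == "fail":
                # Propagate the error and report the problematic offset.
                raise RuntimeError(f"failed at offset {offset}") from exc
            if mode == "warn":
                # Log the offset, then drop the event and continue.
                logging.warning("skipping event at offset %s: %s", offset, exc)
            # In skip mode the event is dropped without logging.
    return processed

events = [(1, "10"), (2, "oops"), (3, "30")]
print(process_events(events, int, mode="skip"))  # [10, 30]
```

Note that both warn and skip drop the problematic event, so data is lost in either mode; warn only adds a log entry that identifies where.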

schema.name.adjustment.mode

Specifies how schema names should be adjusted for compatibility with the message converter used by the connector. The following values are possible:

none: Does not apply any adjustment.

avro: Replaces the characters that cannot be used in the Avro type name with underscore.

avro_unicode: Replaces underscores and characters that cannot be used in an Avro type name with the corresponding Unicode escape, such as _uxxxx. Note that _ acts as an escape character, similar to the backslash in Java.

Type: string
Default: none
Valid Values: avro, avro_unicode, none
Importance: medium

field.name.adjustment.mode

Specifies how field names should be adjusted for compatibility with the message converter used by the connector. The following values are possible:

none: Does not apply any adjustment.

avro: Replaces the characters that cannot be used in the Avro type name with underscore.

avro_unicode: Replaces underscores and characters that cannot be used in an Avro type name with the corresponding Unicode escape, such as _uxxxx. Note that _ acts as an escape character, similar to the backslash in Java.

Type: string
Default: none
Valid Values: avro, avro_unicode, none
Importance: medium
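
The adjustment modes for schema.name.adjustment.mode and field.name.adjustment.mode can be approximated with the sketch below. It is a simplified model for a single name segment, not Debezium's implementation: `adjust_name` is a hypothetical helper, it ignores the Avro rule about the first character of a name, the lowercase four-digit hexadecimal form is an assumption based on the `_uxxxx` pattern above, and characters outside the Basic Multilingual Plane are not handled.

```python
import re

def adjust_name(name: str, mode: str = "none") -> str:
    """Approximate the none / avro / avro_unicode adjustments for one name."""
    if mode == "none":
        return name
    if mode == "avro":
        # Every character outside [A-Za-z0-9_] becomes an underscore.
        return re.sub(r"[^A-Za-z0-9_]", "_", name)
    if mode == "avro_unicode":
        # Underscores and invalid characters become _uxxxx escapes, which
        # keeps the original name recoverable.
        return re.sub(r"[^A-Za-z0-9]", lambda m: f"_u{ord(m.group()):04x}", name)
    raise ValueError(f"unknown mode: {mode}")

print(adjust_name("order-items", "avro"))          # order_items
print(adjust_name("order-items", "avro_unicode"))  # order_u002ditems
print(adjust_name("order_items", "avro_unicode"))  # order_u005fitems
```

The practical difference is that avro can map two distinct names (for example `order-items` and `order_items`) to the same adjusted name, while avro_unicode keeps them distinct.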

heartbeat.interval.ms

Controls how frequently the connector sends heartbeat messages to a Kafka topic. With the default value of 0, the connector does not send heartbeat messages. Heartbeat messages are useful for monitoring whether the connector is receiving change events from the database. They can also help decrease the number of change events that need to be re-sent when a connector restarts. To send heartbeat messages, set this property to a positive integer that specifies the number of milliseconds between heartbeat messages.

Type: int
Default: 0
Valid Values: [0,…]
Importance: low

snapshot.select.statement.overrides.data.map

A JSON object that maps fully-qualified table identifiers (schemaName.tableName) to custom SELECT statements. The connector uses these statements during snapshots instead of the default SELECT * statement. Use this property for large append-only tables to resume a snapshot from a specific point if a previous attempt was interrupted. These values are sensitive and are masked in the configuration.

Note the following:

Keys may use either schemaName.tableName or the fully-qualified dbName.schemaName.tableName form. When the same table name exists across multiple databases, use the fully-qualified form to avoid ambiguity.

Escape double quotes (") in table or schema names. Use a single backslash (\") in the UI and three backslashes (\\\") in the CLI.

Type: password
Importance: medium
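
One way to produce a well-formed value for this property is to build the map programmatically and let a JSON library handle the quoting. The sketch below assumes a hypothetical `dbo.orders` table and `order_id` filter, and `build_overrides_map` is an illustrative helper. It produces only the plain JSON object; any additional escaping required by the UI or CLI, as described above, still applies.

```python
import json

def build_overrides_map(overrides: dict) -> str:
    """Serialize a {table identifier: SELECT statement} mapping as a JSON object."""
    # json.dumps escapes embedded double quotes as \" automatically.
    return json.dumps(overrides)

# Hypothetical table and resume point for an append-only table.
data_map = build_overrides_map({
    "dbo.orders": 'SELECT * FROM "dbo"."orders" WHERE order_id > 500000',
})
print(data_map)
```

Round-tripping the result through `json.loads` is a quick check that the value is valid JSON before you paste it into the connector configuration.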

heartbeat.action.query

If specified, the connector executes this query on every heartbeat against the source database. The query must be a valid SQL DML statement, typically an INSERT or UPDATE, that targets a dedicated heartbeat table.

This configuration helps in scenarios where the connector is capturing changes only from low-traffic database tables. Extended periods of inactivity in these tables can prevent the connector from advancing the LSN position in its stored offsets. To address this, create a heartbeat table in the database, and set this property to a DML statement that periodically updates the table by either inserting a new row or repeatedly updating the same row. Additionally, enable CDC on the heartbeat table and include the heartbeat table in the connector’s capture set using the schema or table include configuration properties. This ensures the connector receives changes from the heartbeat table and can continue advancing the LSN in its stored offsets. The heartbeat query executes at regular intervals, as specified by the heartbeat.interval.ms configuration property.

Note: To uphold the principle of least privilege, grant the connector user write permissions exclusively to essential tables, such as those used for heartbeats. In line with data minimization, ensure the connector configuration is free of PII or sensitive data, as this information is not needed for system-level functions like heartbeat queries.

Type: string
Importance: low
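
As a rough sketch, the two heartbeat properties could be assembled as follows. The helper `heartbeat_settings`, the table name `dbo.debezium_heartbeat`, and its `id` and `last_beat` columns are all hypothetical; the table must already exist, have CDC enabled, and be part of the capture set as described above.

```python
def heartbeat_settings(interval_ms: int, heartbeat_table: str) -> dict:
    """Build the heartbeat-related configuration entries as string values."""
    if interval_ms <= 0:
        # A value of 0 disables heartbeats, so the action query would never run.
        raise ValueError("heartbeat.interval.ms must be positive to enable heartbeats")
    return {
        "heartbeat.interval.ms": str(interval_ms),
        # Repeatedly updates the same row so that the table keeps producing changes.
        "heartbeat.action.query": (
            f"UPDATE {heartbeat_table} SET last_beat = SYSUTCDATETIME() WHERE id = 1"
        ),
    }

print(heartbeat_settings(10000, "dbo.debezium_heartbeat"))
```

Updating a single row, rather than inserting a new row on each heartbeat, keeps the heartbeat table from growing over time.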

schema.history.internal.skip.unparseable.ddl

A Boolean value that specifies whether the connector should ignore malformed or unknown database statements (true), or stop processing so a human can fix the issue (false). Defaults to false. Consider setting this to true to ignore unparseable statements.

Type: boolean
Default: false
Importance: low

schema.history.internal.store.only.captured.tables.ddl

A Boolean value that specifies whether the connector records schema structures from all tables in a schema or database, or only from tables that are designated for capture. Defaults to false.

false - During a database snapshot, the connector records the schema data for all non-system tables in the database, including tables that are not designated for capture. It’s best to retain the default setting. If you later decide to capture changes from tables that you did not originally designate for capture, the connector can easily begin to capture data from those tables, because their schema structure is already stored in the schema history topic.

true - During a database snapshot, the connector records the table schemas only for the tables from which Debezium captures change events. If you change the default value, and you later configure the connector to capture data from other tables in the database, the connector lacks the schema information that it requires to capture change events from the tables.

Type: boolean
Default: false
Importance: low

Schema Config

schema.context.name

Add a schema context name. A schema context represents an independent scope in Schema Registry. It is a separate sub-schema tied to topics in different Kafka clusters that share the same Schema Registry instance. If not set, the connector uses the default schema context configured for Schema Registry in your Confluent Cloud environment.

Type: string
Default: default
Importance: medium

key.converter.reference.subject.name.strategy

Set the subject reference name strategy for key. Valid entries are DefaultReferenceSubjectNameStrategy or QualifiedReferenceSubjectNameStrategy. Note that the subject reference name strategy can be selected only for PROTOBUF format with the default strategy being DefaultReferenceSubjectNameStrategy.

Type: string
Default: DefaultReferenceSubjectNameStrategy
Importance: high

How should we handle data types?

decimal.handling.mode

Specifies how the connector should handle DECIMAL and NUMERIC columns. You can set one of the following options:

precise: Represents values precisely by using java.math.BigDecimal values, which are encoded in binary form in change events.

double: Represents values by using double values, which might result in a loss of precision but which is easier to use.

string: Encodes values as formatted strings, which are easy to consume, but semantic information about the real type is lost.

Type: string
Default: precise
Valid Values: double, precise, string
Importance: medium
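
The trade-offs between the three modes can be seen in a small sketch. It models the precise mode as an unscaled integer in two's-complement bytes plus a scale, which is how Kafka Connect's Decimal logical type is commonly described; `represent_decimal` is a hypothetical helper, and the byte encoding here is not guaranteed to be the minimal-length form that Java produces.

```python
from decimal import Decimal

def represent_decimal(value: Decimal, mode: str = "precise"):
    """Show how one DECIMAL value might look under each handling mode."""
    if mode == "precise":
        sign, digits, exponent = value.as_tuple()
        if exponent > 0:
            raise ValueError("this sketch assumes a non-negative scale")
        unscaled = int("".join(map(str, digits))) * (-1 if sign else 1)
        length = (unscaled.bit_length() + 8) // 8
        # Big-endian two's-complement bytes of the unscaled value, plus the scale.
        return unscaled.to_bytes(length, "big", signed=True), -exponent
    if mode == "double":
        return float(value)   # convenient, but may lose precision
    if mode == "string":
        return str(value)     # exact digits, but no numeric type information
    raise ValueError(f"unknown mode: {mode}")

big = Decimal("12345678901234567890.123456789")
print(represent_decimal(Decimal("123.45"), "precise"))  # (b'09', 2)
print(Decimal(represent_decimal(big, "double")) == big)  # False
```

The second line of output shows the precision loss that the double mode can introduce for values with more significant digits than a 64-bit floating-point number can hold.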

time.precision.mode

Time, date, and timestamp values can be represented with different kinds of precision:

adaptive captures the time and timestamp values exactly as in the database using either millisecond, microsecond, or nanosecond precision values based on the database column’s type.

connect always represents time and timestamp values by using Kafka Connect’s built-in representations for Time, Date, and Timestamp, which use millisecond precision regardless of the database columns’ precision.

Type: string
Default: adaptive
Valid Values: adaptive, connect
Importance: medium
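
The sketch below illustrates the effect of the two modes on a timestamp value expressed as an offset from the Unix epoch. The helper `epoch_value` is hypothetical, and the thresholds (up to 3 fractional digits as milliseconds, 4 to 6 as microseconds, 7 as nanoseconds) are an assumption about how adaptive mode chooses its unit; verify them against the Debezium data type mapping tables for your column types.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_value(ts: datetime, mode: str, fractional_digits: int) -> int:
    """Return ts as an integer offset from the epoch in the unit the mode implies."""
    if mode not in ("adaptive", "connect"):
        raise ValueError(f"unknown mode: {mode}")
    delta = ts - EPOCH
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    if mode == "connect" or fractional_digits <= 3:
        return micros // 1_000   # milliseconds; finer precision is truncated
    if fractional_digits <= 6:
        return micros            # microseconds
    return micros * 1_000        # nanoseconds

ts = datetime(1970, 1, 1, 0, 0, 1, 234567, tzinfo=timezone.utc)
print(epoch_value(ts, "connect", 6))   # 1234
print(epoch_value(ts, "adaptive", 6))  # 1234567
```

In this model, connect mode discards the sub-millisecond part of the value, which is the main reason to prefer adaptive when downstream consumers need the full column precision.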

Number of tasks for this connector

tasks.max

Maximum number of tasks that the connector can use to capture data from the database instance. If the database.names list contains more than one element, you can increase the value of this property to a number less than or equal to the number of elements in the list.

Type: int
Valid Values: [1,…,10]
Importance: high
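
Because each task handles one database, a reasonable value for tasks.max can be derived from the database.names list. The helper below is a hypothetical illustration; the limit of 10 comes from the valid range listed above.

```python
def recommended_tasks_max(database_names: str, limit: int = 10) -> int:
    """Return one task per listed database, capped at the connector's limit."""
    databases = [name.strip() for name in database_names.split(",") if name.strip()]
    if not databases:
        raise ValueError("database.names must list at least one database")
    # More tasks than databases would add no throughput.
    return min(len(databases), limit)

print(recommended_tasks_max("sales,inventory,hr"))  # 3
```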

Additional Configs

column.include.list

A comma-separated list of regular expressions that match the fully-qualified names of columns that should be included in change event record values. Fully-qualified names for columns are of the form schemaName.tableName.columnName. Do not set column.exclude.list if you set this property.

Type: list
Importance: low

column.propagate.source.type

A comma-separated list of regular expressions that match the fully-qualified names of columns whose original type and original length the connector adds as parameters to the corresponding field schemas in the emitted change records. When this property is set, the connector adds schema parameters with the prefix __debezium.source.column. These parameters propagate a column’s original type name and length (for variable-width types), respectively. Include ‘.*’ to match all columns.

Type: list
Importance: low

datatype.propagate.source.type

A comma-separated list of regular expressions that match the database-specific data type names whose original type and original length the connector adds as parameters to the corresponding field schemas in the emitted change records. When this property is set, the connector adds schema parameters with the prefix __debezium.source.column. These parameters propagate a column’s original type name and length (for variable-width types), respectively. Include ‘.*’ to match all data types.

Type: list
Importance: low

header.converter

The converter class for the headers. This is used to serialize and deserialize the headers of the messages.

Type: string
Importance: low

message.key.columns

A semicolon-separated list of expressions that match fully-qualified tables and column(s) to be used as message key. Each expression must match the pattern ‘<fully-qualified table name>:<key columns>’, where the fully qualified table name could be defined as <schemaName>.<tableName> and the key columns are a comma-separated list of columns representing the custom key. For any table without an explicit key configuration the table’s primary key column(s) will be used as message key. Example: dbserver1.inventory.orderlines:orderId,orderLineId;dbserver1.inventory.orders:id

Type: string
Importance: low
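
To make the expected format concrete, the following sketch parses a message.key.columns value into a mapping from table expression to key columns. `parse_message_key_columns` is a hypothetical helper for validating a value before you submit it; it checks only the overall shape, not whether the tables or columns exist.

```python
def parse_message_key_columns(value: str) -> dict:
    """Parse '<table>:<col>,<col>;<table>:<col>' into {table: [columns]}."""
    mapping = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        table, sep, columns = entry.partition(":")
        keys = [c.strip() for c in columns.split(",") if c.strip()]
        if not sep or not table.strip() or not keys:
            raise ValueError(f"expected '<table>:<key columns>', got: {entry!r}")
        mapping[table.strip()] = keys
    return mapping

example = "dbserver1.inventory.orderlines:orderId,orderLineId;dbserver1.inventory.orders:id"
print(parse_message_key_columns(example))
```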

notification.enabled.channels

A list of the names of the notification channels that are enabled. The following channels are available: log and sink. When sink is enabled, the connector sends notifications to the topic specified by the notification.sink.topic.name property.

Type: list
Importance: low

notification.sink.topic.name

The name of the topic for the notifications. This property is required if sink is in the list of enabled channels. If you set this name so that it starts with your topic prefix followed by a period, the topic uses the same partition, retention, and cleanup settings as your other topics with that prefix; otherwise, it is created with a single partition.

Type: string
Importance: low

producer.override.compression.type

The compression type for all data generated by the producer. Valid values are none, gzip, snappy, lz4, and zstd.

Type: string
Importance: low

producer.override.linger.ms

The producer groups together any records that arrive in between request transmissions into a single batched request. More details can be found in the documentation: https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html#linger-ms.

Type: long
Valid Values: [100,…,1000]
Importance: low

schema.exclude.list

A comma-separated list of regular expressions that match names of schemas for which you do not want to capture changes. Any schema whose name is not included in schema.exclude.list has its changes captured, with the exception of system schemas.

Type: list
Importance: low

schema.include.list

A comma-separated list of regular expressions that match names of schemas for which you want to capture changes. Any schema name not included in schema.include.list is excluded from having its changes captured. By default, all non-system schemas have their changes captured.

Type: list
Importance: low

signal.enabled.channels

A comma-separated list of channel names that are enabled for the connector. If not set, the connector enables only the source channel by default. Supported values are:

source (default): Signals are read from a signaling table in the source database.

kafka: Signals are consumed from a Kafka topic.

Type: list
Importance: low

signal.kafka.topic

The name of the Kafka topic that the connector monitors for ad hoc signals. Currently, you can send signal messages to this topic by using the Confluent CLI. The topic must have exactly one partition because the connector’s signal consumer reads only from partition 0. In a multi-partition topic, any signal messages routed to other partitions are silently ignored.

Type: string
Importance: low
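
For orientation, the sketch below builds the key and value of an ad hoc incremental snapshot signal. The message layout follows the Debezium signaling documentation as understood here (record key equal to the topic prefix, value with `type` and `data` fields); treat it as an assumption and confirm it against the Debezium version your connector uses. The helper name, the prefix `server1`, and the table `testDB.dbo.orders` are hypothetical.

```python
import json

def incremental_snapshot_signal(topic_prefix: str, tables: list) -> tuple:
    """Return the (key, value) pair for an execute-snapshot signal message."""
    value = {
        "type": "execute-snapshot",
        "data": {
            # Fully-qualified table names to snapshot.
            "data-collections": list(tables),
            "type": "INCREMENTAL",
        },
    }
    # The record key is assumed to have to match the connector's topic prefix.
    return topic_prefix, json.dumps(value)

key, value = incremental_snapshot_signal("server1", ["testDB.dbo.orders"])
print(key, value)
```

Whatever client produces the message, it needs to write to partition 0 of the signal topic, as described above.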

snapshot.include.collection.list

A comma-separated list of regular expressions that match the fully-qualified names (<dbName>.<schemaName>.<tableName>) of the tables to include in a snapshot. If not explicitly set, the connector defaults to snapshotting all tables listed in table.include.list. The specified items must be named in the connector’s table.include.list property. This property takes effect only if the connector’s snapshot.mode property is set to a value other than never.

Type: list
Importance: low

value.converter.allow.optional.map.keys

Allow optional string map key when converting from Connect Schema to Avro Schema. Applicable for Avro Converters.

Type: boolean
Importance: low

value.converter.auto.register.schemas

Specify if the Serializer should attempt to register the Schema.

Type: boolean
Importance: low

value.converter.connect.meta.data

Allow the Connect converter to add its metadata to the output schema. Applicable for Avro Converters.

Type: boolean
Importance: low

value.converter.enhanced.avro.schema.support

Enable enhanced schema support to preserve package information and Enums. Applicable for Avro Converters.

Type: boolean
Importance: low

value.converter.enhanced.protobuf.schema.support

Enable enhanced schema support to preserve package information. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.flatten.unions

Whether to flatten unions (oneofs). Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.generate.index.for.unions

Whether to generate an index suffix for unions. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.generate.struct.for.nulls

Whether to generate a struct variable for null values. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.int.for.enums

Whether to represent enums as integers. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.latest.compatibility.strict

Verify latest subject version is backward compatible when use.latest.version is true.

Type: boolean
Importance: low

value.converter.object.additional.properties

Whether to allow additional properties for object schemas. Applicable for JSON_SR Converters.

Type: boolean
Importance: low

value.converter.optional.for.nullables

Whether nullable fields should be specified with an optional label. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.optional.for.proto2

Whether proto2 optionals are supported. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.scrub.invalid.names

Whether to scrub invalid names by replacing invalid characters with valid characters. Applicable for Avro and Protobuf Converters.

Type: boolean
Importance: low

value.converter.use.latest.version

Use latest version of schema in subject for serialization when auto.register.schemas is false.

Type: boolean
Importance: low

value.converter.use.optional.for.nonrequired

Whether to set non-required properties to be optional. Applicable for JSON_SR Converters.

Type: boolean
Importance: low

value.converter.wrapper.for.nullables

Whether nullable fields should use primitive wrapper messages. Applicable for Protobuf Converters.

Type: boolean
Importance: low

value.converter.wrapper.for.raw.primitives

Whether a wrapper message should be interpreted as a raw primitive at root level. Applicable for Protobuf Converters.

Type: boolean
Importance: low

incremental.snapshot.chunk.size

The maximum number of rows that the connector fetches and reads into memory during an incremental snapshot chunk. Increasing the chunk size improves efficiency by running fewer, larger snapshot queries. However, larger chunk sizes also require more memory to buffer the snapshot data. Adjust the chunk size to a value that provides the best performance in your environment.

Type: int
Default: 1024
Valid Values: [1,…,1024]
Importance: medium
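
The relationship between chunk size and the number of snapshot queries is simple arithmetic, sketched below. `snapshot_chunks` is a hypothetical helper, and the row count is an example value.

```python
import math

def snapshot_chunks(row_count: int, chunk_size: int = 1024) -> int:
    """Estimate how many chunk queries an incremental snapshot of a table needs."""
    if not 1 <= chunk_size <= 1024:
        raise ValueError("incremental.snapshot.chunk.size must be between 1 and 1024")
    return math.ceil(row_count / chunk_size)

print(snapshot_chunks(1_000_000))       # 977
print(snapshot_chunks(1_000_000, 256))  # 3907
```

Quartering the chunk size roughly quadruples the number of queries, while each query buffers about a quarter as many rows in memory.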

binary.handling.mode

Specifies how binary (blob, binary, etc.) columns should be represented in change events. The following values are possible:

bytes: Represents binary data as a byte array (default).

base64: Represents binary data as a base64-encoded string.

base64-url-safe: Represents binary data as a base64-url-safe-encoded string.

hex: Represents binary data as a hex-encoded (base16) string.

Type: string
Default: bytes
Importance: low
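
The four representations can be compared with Python's standard library. This is an illustration of the encodings themselves, not of the connector's output format; `represent_binary` is a hypothetical helper, and the lowercase hexadecimal form is an assumption.

```python
import base64

def represent_binary(data: bytes, mode: str = "bytes"):
    """Encode a binary column value according to the chosen handling mode."""
    if mode == "bytes":
        return data
    if mode == "base64":
        return base64.b64encode(data).decode("ascii")
    if mode == "base64-url-safe":
        # Same as base64, but with '-' and '_' in place of '+' and '/'.
        return base64.urlsafe_b64encode(data).decode("ascii")
    if mode == "hex":
        return data.hex()
    raise ValueError(f"unknown mode: {mode}")

sample = b"\xfb\xff"
print(represent_binary(sample, "base64"))           # +/8=
print(represent_binary(sample, "base64-url-safe"))  # -_8=
print(represent_binary(sample, "hex"))              # fbff
```

The string modes are useful when consumers read schemaless JSON, where raw bytes cannot be represented directly.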

driver.multiSubnetFailover

Controls whether the driver uses multi-subnet failover for SQL Server Always On availability groups. When set to true, the driver can connect to any replica in the availability group and automatically fails over to another replica if the current one becomes unavailable.

This is useful for high-availability scenarios. When you use this configuration, you must use the listener endpoint (not the individual server endpoint) in the database.hostname field to enable proper failover functionality.

Type: boolean
Default: false
Importance: low

driver.sendTimeAsDatetime

Controls how TIME values are sent to the server. When set to true, TIME values are sent as DATETIME values. When set to false, TIME values are sent as TIME values. This affects how temporal data types are handled in the database connection.

Type: boolean
Default: true
Importance: low

errors.tolerance

Use this property to configure the connector’s error handling behavior. Warning: Use this property with caution for source connectors because it can lead to data loss. If you set this property to ‘all’, the connector does not fail on errant records; instead, it logs them (and, for sink connectors, sends them to the DLQ) and continues processing. If you set this property to ‘none’, the connector task fails on errant records.

Type: string
Default: none
Importance: low

include.schema.changes

Specifies whether the connector publishes changes in the database schema to a Kafka topic with the same name as the database server ID. Each schema change is recorded using a key that contains the database name and a value that includes a logical description of the new schema and, optionally, the DDL statements. This setting is independent of how the connector internally records database schema history.

Type: boolean
Default: false
Importance: low

incremental.snapshot.allow.schema.changes

Detects schema changes during an incremental snapshot and re-selects the current chunk to avoid locking DDLs. Note that changes to a primary key are not supported and can cause incorrect results if performed during an incremental snapshot. Another limitation is that if a schema change affects only columns’ default values, the change is not detected until the DDL is processed from the change stream. This does not affect the values of snapshot events, but the schema of snapshot events might contain outdated defaults.

Type: boolean
Default: false
Importance: low

incremental.snapshot.option.recompile

Controls whether to use the RECOMPILE query hint for incremental snapshot queries. When set to true, the connector adds the RECOMPILE hint to incremental snapshot queries, which can help with query plan optimization but may impact performance.

Type: boolean
Default: false
Importance: low

incremental.snapshot.watermarking.strategy

Specifies the strategy used for watermarking during an incremental snapshot. With ‘INSERT_INSERT’ (default), both the open and close signals are written to the signal data collection. With ‘INSERT_DELETE’, only the open signal is written to the signal data collection, and closing the window deletes the corresponding open signal.

Type: string
Default: INSERT_INSERT
Importance: low

key.converter.key.schema.id.serializer

The class name of the schema ID serializer for keys. This is used to serialize schema IDs in the message headers.

Type: string
Default: io.confluent.kafka.serializers.schema.id.PrefixSchemaIdSerializer
Importance: low

key.converter.key.subject.name.strategy

How to construct the subject name for key schema registration.

Type: string
Default: TopicNameStrategy
Importance: low

max.batch.size

Maximum size of each batch of events that the connector processes. Defaults to 2048. The allowed range is from 1 to 5000.

Type: int
Default: 2048
Valid Values: [1,…,5000]
Importance: low

max.iteration.transactions

Specifies the maximum number of transactions per iteration, which reduces the memory footprint when streaming changes from multiple tables in a database. When set to a value greater than zero, the connector uses the n-th LSN specified by this setting as the upper bound of the range to fetch changes from. A value of 0 makes the connector use the current maximum LSN as the upper bound, but note that 0 is outside the valid range listed below. Defaults to 500.

Type: int
Default: 500
Valid Values: [1,…,500]
Importance: low

poll.interval.ms

Time to wait for new change events to appear after receiving no events, given in milliseconds. Defaults to 500 ms.

Type: long
Default: 500
Valid Values: [200,…]
Importance: low

provide.transaction.metadata

Determines whether the connector generates events with transaction boundaries and enriches change event envelopes with transaction metadata. When enabled, the connector creates a dedicated transaction metadata topic. Its name starts with your topic prefix, so it uses the same partition, retention, and cleanup settings as your other topics with that prefix.

Type: boolean
Default: false
Importance: low

schema.history.internal.kafka.create.timeout.ms

The number of milliseconds to wait while creating the Kafka history topic using the Kafka admin client.

Type: long
Default: 30000 (30 seconds)
Importance: low

schema.history.internal.kafka.query.timeout.ms

The number of milliseconds to wait while fetching cluster information using Kafka admin client.

Type: long
Default: 3000 (3 seconds)
Importance: low

schema.history.internal.kafka.recovery.poll.interval.ms

The number of milliseconds to wait while polling for persisted data during recovery.

Type: long
Default: 100
Valid Values: [100,…]
Importance: low

schema.history.internal.store.only.captured.databases.ddl

Controls which DDL statements Debezium stores in the database schema history. By default (false), Debezium stores all incoming DDL statements. If set to true, only DDL that manipulates a table from a captured schema or database is stored.

Type: boolean
Default: false
Importance: low

skip.messages.without.change

Enable this property to skip publishing messages when there is no change in the included columns. That is, the connector filters out a message when none of the columns selected by column.include.list or column.exclude.list changed.

Type: boolean
Default: false
Importance: low

skipped.operations

The comma-separated list of operations to skip during streaming, defined as: ‘c’ for inserts/create; ‘u’ for updates; ‘d’ for deletes, ‘t’ for truncates, and ‘none’ to indicate nothing skipped. By default, only truncate operations will be skipped.

Type: list
Default: t
Importance: low
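
The effect of this property can be modeled as a filter on the operation code of each event. The sketch below is illustrative only; `should_emit` is a hypothetical helper that uses the operation codes listed above.

```python
def should_emit(op: str, skipped_operations: str = "t") -> bool:
    """Return True if an event with operation code op would be emitted."""
    skipped = {s.strip() for s in skipped_operations.split(",") if s.strip()}
    if "none" in skipped:
        return True  # nothing is skipped
    return op not in skipped

print(should_emit("t"))          # False: truncates are skipped by default
print(should_emit("d", "u,d"))   # False
print(should_emit("t", "none"))  # True
```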

snapshot.delay.ms

An interval in milliseconds that the connector should wait before performing a snapshot when the connector starts. Defaults to 0 ms.

Type: long
Default: 0
Valid Values: [0,…]
Importance: low

snapshot.lock.timeout.ms

The maximum number of milliseconds to wait for table locks at the beginning of a snapshot. If locks cannot be acquired in this time frame, the snapshot is aborted. Defaults to 10 seconds.

Type: long
Default: 10000 (10 seconds)
Importance: low

streaming.delay.ms

A delay period after the snapshot is completed and the streaming begins, given in milliseconds. This delay helps prevent re-snapshotting in case the connector fails during the transition to streaming. Defaults to 60000 ms.

Type: long
Default: 60000 (1 minute)
Valid Values: [0,…]
Importance: low

topic.heartbeat.prefix

Specifies the prefix of the heartbeat topic to which the connector sends heartbeat messages. The topic name has this pattern: <topic.heartbeat.prefix>.<topic.prefix>. Defaults to __debezium-heartbeat-{{.logicalClusterId}}. By default the heartbeat topic does not start with your topic prefix, so it is created with a single partition. If you change this prefix so that the heartbeat topic name starts with your topic prefix followed by a period, it instead uses the same partition, retention, and cleanup settings as your other topics with that prefix. Keep the default unless you have a specific reason to change it.

Type: string
Default: __debezium-heartbeat-{{.logicalClusterId}}
Importance: low

topic.transaction

Controls the name of the topic to which the connector sends transaction metadata messages. The final transaction topic name has this pattern: <topic.prefix>.<topic.transaction>. Defaults to {{.logicalClusterId}}.transaction. Because this topic’s name always starts with your topic prefix, it uses the same partition, retention, and cleanup settings as your other topics with that prefix.

Type: string
Default: {{.logicalClusterId}}.transaction
Importance: low
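
The naming patterns for the heartbeat and transaction topics, and the rule about which topics share the settings of your topic prefix, can be sketched as follows. The helpers and the example values (`server1`, `__debezium-heartbeat-lcc-example`, `transaction`) are hypothetical; the real defaults contain a cluster-specific ID that is filled in for you.

```python
def internal_topic_names(topic_prefix: str, heartbeat_prefix: str,
                         topic_transaction: str = "transaction") -> dict:
    """Compose the heartbeat and transaction topic names from their parts."""
    return {
        "heartbeat": f"{heartbeat_prefix}.{topic_prefix}",
        "transaction": f"{topic_prefix}.{topic_transaction}",
    }

def starts_with_topic_prefix(topic: str, topic_prefix: str) -> bool:
    """True if the topic name starts with the topic prefix followed by a period."""
    return topic.startswith(topic_prefix + ".")

names = internal_topic_names("server1", "__debezium-heartbeat-lcc-example")
print(names["heartbeat"])    # __debezium-heartbeat-lcc-example.server1
print(names["transaction"])  # server1.transaction
print(starts_with_topic_prefix(names["heartbeat"], "server1"))  # False
```

In this example the transaction topic starts with the topic prefix and the heartbeat topic does not, which matches the default behavior described above.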

value.converter.decimal.format

Specify the JSON/JSON_SR serialization format for Connect DECIMAL logical type values with two allowed literals:

BASE64 to serialize DECIMAL logical types as base64 encoded binary data and

NUMERIC to serialize Connect DECIMAL logical type values in JSON/JSON_SR as a number representing the decimal value.

Type: string
Default: BASE64
Importance: low

value.converter.flatten.singleton.unions

Whether to flatten singleton unions. Applicable for Avro and JSON_SR Converters.

Type: boolean
Default: false
Importance: low

value.converter.ignore.default.for.nullables

When set to true, this property ensures that the corresponding record in Kafka is NULL, instead of showing the default column value. Applicable for AVRO, PROTOBUF, and JSON_SR Converters.

Type: boolean
Default: false
Importance: low

value.converter.reference.subject.name.strategy

Set the subject reference name strategy for value. Valid entries are DefaultReferenceSubjectNameStrategy or QualifiedReferenceSubjectNameStrategy. Note that the subject reference name strategy can be selected only for PROTOBUF format with the default strategy being DefaultReferenceSubjectNameStrategy.

Type: string
Default: DefaultReferenceSubjectNameStrategy
Importance: low

value.converter.replace.null.with.default

Whether to replace null fields that have a default value with that default value. When set to true, the default value is used; otherwise, null is used. Applicable for JSON Converter.

Type: boolean
Default: true
Importance: low

value.converter.schemas.enable

Include schemas within each of the serialized values. Input messages must contain schema and payload fields and may not contain additional fields. For plain JSON data, set this to false. Applicable for JSON Converter.

Type: boolean
Default: false
Importance: low

value.converter.value.schema.id.serializer

The class name of the schema ID serializer for values. This is used to serialize schema IDs in the message headers.

Type: string
Default: io.confluent.kafka.serializers.schema.id.PrefixSchemaIdSerializer
Importance: low

value.converter.value.subject.name.strategy

Determines how to construct the subject name under which the value schema is registered with Schema Registry.

Type: string
Default: TopicNameStrategy
Importance: low

Auto-restart policy

auto.restart.on.user.error

Enable connector to automatically restart on user-actionable errors.

Type: boolean
Default: true
Importance: medium

Monitoring and Troubleshooting

Connector appears as RUNNING but is not processing data

After a task restart, the SQL Server CDC Source V2 connector performs schema history recovery before it resumes change data capture (CDC). During this phase:

The connector status in the Confluent Cloud Console appears as RUNNING.
However, no records will be emitted until schema history recovery is complete.
This is expected behavior and does not indicate an error.

To monitor the progress of schema history recovery, use the following Confluent Cloud metric:

sql_server_cdc_source_connector_schema_history_status

This is a gauge metric with the following values:

0 — Schema recovery is stopped
1 — Schema recovery is in progress
2 — Schema recovery is complete

Once the metric reaches 2, the connector resumes streaming change events to the Kafka topic.
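
When you build an alert or dashboard on this gauge, the three values can be interpreted as in the sketch below. The helper names are hypothetical, and retrieving the metric value from the Confluent Cloud Metrics API is outside the scope of this example.

```python
SCHEMA_HISTORY_STATUS = {
    0: "stopped",
    1: "in progress",
    2: "complete",
}

def describe_schema_history_status(value: int) -> str:
    """Translate the gauge value into a human-readable recovery state."""
    try:
        return SCHEMA_HISTORY_STATUS[value]
    except KeyError:
        raise ValueError(f"unexpected gauge value: {value}") from None

def is_streaming_expected(value: int) -> bool:
    """Records are emitted only after recovery completes (value 2)."""
    return value == 2

print(describe_schema_history_status(1))  # in progress
print(is_streaming_expected(1))           # False
```

A connector that reports RUNNING while this check returns False is most likely still recovering its schema history rather than stalled.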

Schema history recovery occurs on every task restart, including those triggered by configuration changes, scaling operations, or infrastructure events. While the connector shows as RUNNING, schema history recovery happens as a separate process after the task initialization.

For more information on available metrics, refer to the Confluent Cloud Metrics API reference.

Frequently asked questions

Find answers to frequently asked questions about the Microsoft SQL Server CDC Source V2 (Debezium) connector for Confluent Cloud.

What best practices should I follow before deploying this connector to production?

Before deploying the connector to production, follow these best practices:

Conduct a performance test in a non-production environment that mirrors your production database size, change rate, schema, and network topology, including any private-networking egress endpoints. This validates that the connector can handle your peak change volume before your customer data depends on it.
Monitor the connector’s streaming lag using the Confluent Cloud Metrics API and the sql_server_cdc_source_connector_lag_milliseconds metric. Add alerts to get notified and act on sustained lag growth. For tuning options when lag rises, see How can I optimize connector performance and reduce lag?.

How do I connect to a self-hosted SQL Server database from Confluent Cloud?

To connect the fully managed Microsoft SQL Server CDC Source V2 connector to a self-hosted SQL Server database over a private network, you can use one of the following options:

Egress PrivateLink endpoint: Create an egress PrivateLink endpoint in the Cloud Console that connects to your SQL Server database. Your database must be accessible through an AWS PrivateLink endpoint service, Azure Private Link service, or Google Cloud Private Service Connect.
Private Network Interface (PNI) on AWS only: If your database is reachable from your AWS VPC, you can use PNI egress connectivity. For details, see Use Private Network Interface on Confluent Cloud.

Important

The egress endpoint must be in the same region as your Kafka cluster. Cross-region egress is not supported.

For more information, see Manage Networking for Confluent Cloud Connectors.

Why is my connector failing with `CDC is not enabled` errors?

This error indicates that CDC is not enabled on the SQL Server database or specific tables.

The database 'YourDatabase' is not enabled for CDC

CDC enablement procedures vary depending on your SQL Server deployment type. For instructions to enable CDC, see the documentation for your deployment:

On-premises SQL Server: For detailed CDC setup instructions, see Setting up SQL Server in the Debezium documentation.
Amazon RDS for SQL Server: See Using change data capture in the AWS documentation.
Azure SQL Database: See Azure SQL Database change stream with Debezium for Azure SQL Database setup.

To verify that CDC is enabled on your database and tables, run the following queries:

-- Check if CDC is enabled on the database
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'YourDatabase';

-- Check if CDC is enabled on tables
SELECT name, is_tracked_by_cdc
FROM sys.tables
WHERE SCHEMA_NAME(schema_id) = 'dbo';

What SQL Server user privileges are required for the connector?

The SQL Server database user specified in database.user must have specific privileges to access CDC data and perform snapshots.

To configure the required privileges, complete the following steps:

Create the login and user:

USE YourDatabase;
GO
CREATE LOGIN connector_user WITH PASSWORD = 'YourPassword';
GO
CREATE USER connector_user FOR LOGIN connector_user;
GO

Grant SELECT on source tables: Grant SELECT on all tables to be captured.

GRANT SELECT ON SCHEMA::dbo TO connector_user;
GO

Alternatively, grant SELECT on specific tables.

GRANT SELECT ON dbo.YourTable TO connector_user;
GO

Grant CDC permissions:

GRANT SELECT ON SCHEMA::cdc TO connector_user;
GO
GRANT EXECUTE ON SCHEMA::cdc TO connector_user;
GO

Grant view permissions:

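GRANT VIEW DATABASE STATE TO connector_user;
GO
USE master; -- VIEW SERVER STATE is a server-level permission and must be granted from master
GO
GRANT VIEW SERVER STATE TO connector_user;
GO
USE YourDatabase; -- switch back for the remaining database-level grants
GO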

Grant write permission for incremental snapshots: To enable Debezium to perform incremental snapshots, you must grant the connector permission to write to the signaling table specified in signal.data.collection.
```
GRANT INSERT ON dbo.signal_table TO connector_user;
GO
```

Verify the granted permissions:

SELECT
  USER_NAME(grantee_principal_id) AS UserName,
  permission_name,
  state_desc
FROM sys.database_permissions
WHERE USER_NAME(grantee_principal_id) = 'connector_user';
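
The output of the verification query can also be checked programmatically. The following Python sketch is a minimal example that assumes the rows have already been fetched as (UserName, permission_name, state_desc) tuples; missing_permissions and REQUIRED_DB_PERMISSIONS are hypothetical names. It compares permission names only, so it cannot tell a grant on the dbo schema from a grant on the cdc schema. Because sys.database_permissions lists database-level permissions only, the server-level VIEW SERVER STATE grant is not covered by this check.

```python
from typing import Iterable, List, Tuple

# Database-level permissions granted in the steps above (assumed minimum set).
REQUIRED_DB_PERMISSIONS = frozenset({"SELECT", "EXECUTE", "VIEW DATABASE STATE"})

# state_desc values that represent an effective grant.
GRANTED_STATES = frozenset({"GRANT", "GRANT_WITH_GRANT_OPTION"})


def missing_permissions(rows: Iterable[Tuple[str, str, str]],
                        required=REQUIRED_DB_PERMISSIONS) -> List[str]:
    """Return the required permission names that are not granted in `rows`."""
    granted = {
        permission.strip().upper()
        for _user, permission, state in rows
        if state.strip().upper() in GRANTED_STATES
    }
    return sorted(set(required) - granted)
```

An empty list means every permission name in the required set appears with a granted state.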

Why is my connector failing with `SQL Server Agent is not running` errors?

This error occurs when CDC capture or cleanup jobs cannot run because SQL Server Agent is stopped.

SQL Server Agent is not currently running so it cannot be notified of this action

Review the following common causes and solutions:

SQL Server Agent service stopped: Start the SQL Server Agent service from SQL Server Configuration Manager or the Services console.
To verify that the service is running, run the following query:
```
EXEC master.dbo.xp_servicecontrol N'QUERYSTATE', N'SQLServerAGENT';
```
SQL Server Agent disabled: In the Services console, set the service startup type to Automatic.
Insufficient permissions: Verify that the SQL Server Agent service account has the necessary permissions to access the database.
CDC jobs not created: To verify that the CDC capture job exists, run the following query:
```
SELECT * FROM msdb.dbo.cdc_jobs;
```
If the capture job is missing, create it by running the following command:
```
EXEC sys.sp_cdc_add_job 'capture';
GO
```

How can I optimize connector performance and reduce lag?

Large transactions, network latency, or connector configuration can cause high lag or slow processing.

To optimize performance, review the following options:

Monitor lag metrics: Use the Metrics API to monitor connector lag.
Increase tasks: This connector only supports multiple tasks when you configure multiple databases. The task count should be less than or equal to the number of databases configured in the database.names property.
```
{
  "tasks.max": "4"
}
```
Each task operates at the database level and processes changes from a single database. For a single database, you can only achieve parallelism by creating more connectors and distributing tables among them. For a single database and single table, you cannot achieve further parallelism.
Filter unnecessary tables: Use table.include.list or table.exclude.list to capture only required tables and reduce processing overhead.
```
{
  "table.include.list": "dbo.orders,dbo.customers"
}
```
Filter unnecessary columns: Use column.include.list or column.exclude.list to filter columns and reduce data volume.
```
{
  "column.exclude.list": "dbo.orders.internal_notes,dbo.customers.legacy_data"
}
```
Monitor SQL Server performance: Monitor the following SQL Server metrics:
- CDC capture job performance: Monitor sys.dm_cdc_log_scan_sessions.
- Disk I/O: Verify adequate disk performance.
- CPU and memory: Check resource utilization.
Avoid large transactions: Large transactions can cause lag spikes. When possible, break bulk operations into smaller transactions.
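
The task-count rule described in the options above can be checked before you submit a configuration. The following Python sketch is illustrative only: check_task_count is a hypothetical helper, not part of any Confluent tool, and it assumes the configuration is available as a dictionary of string properties.

```python
def check_task_count(config: dict) -> int:
    """Return the number of configured databases.

    Raises ValueError if tasks.max exceeds the number of databases
    listed in database.names, because extra tasks would sit idle.
    """
    databases = [
        name.strip()
        for name in config.get("database.names", "").split(",")
        if name.strip()
    ]
    if not databases:
        raise ValueError("database.names must list at least one database")

    tasks_max = int(config.get("tasks.max", "1"))
    if tasks_max < 1:
        raise ValueError("tasks.max must be at least 1")
    if tasks_max > len(databases):
        raise ValueError(
            f"tasks.max={tasks_max} exceeds the {len(databases)} configured database(s)"
        )
    return len(databases)
```

For example, a configuration with "database.names": "sales,inventory" and "tasks.max": "4" is rejected, because only two tasks can do useful work.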

How do I configure SSL/TLS encryption for SQL Server connections?

The Microsoft SQL Server CDC Source V2 connector supports SSL/TLS encryption for secure database connections.

To configure SSL/TLS encryption, complete the following steps:

Enable encryption on SQL Server: Configure SQL Server to use SSL certificates.
Configure the connector for SSL: In the connector configuration, set database.encrypt to true to enable SSL/TLS encryption for the database connection.
```
{
  "database.encrypt": "true"
}
```

What should I do if my connector keeps restarting or failing?

Frequent connector restarts indicate configuration issues, resource constraints, or database connectivity problems.

Review the following common causes and solutions:

Check connector logs: In the Cloud Console, review error messages for the following issues:
- Authentication failures: Verify the username and password.
- Network timeouts: Check the egress endpoint status.
- CDC errors: Verify that CDC is enabled on the database and tables.
Verify SQL Server connectivity: Verify that SQL Server is reachable and accepting connections.
- Check firewall rules: Verify that traffic is allowed on the port configured in the database.port property of the connector.
- Verify egress endpoint status: Verify that the status is Ready.
- Test credentials manually: Connect using SQL Server Management Studio with the same credentials.

Review CDC configuration: To verify that CDC is properly configured, run the following queries:

SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'YourDatabase';

SELECT name, is_tracked_by_cdc
FROM sys.tables
WHERE SCHEMA_NAME(schema_id) = 'dbo';

Check SQL Server Agent status: For on-premises SQL Server, verify that SQL Server Agent is running.
```
EXEC master.dbo.xp_servicecontrol N'QUERYSTATE', N'SQLServerAGENT';
```
Verify schema compatibility: Verify that Schema Registry is available if you are using Avro, JSON Schema, or Protobuf formats.
Check for incompatible schema changes: Some SQL Server DDL changes can cause connector failures.
- Column type changes: Column type changes may cause deserialization errors.
- Primary key changes: Primary key changes may cause deserialization errors.
Check for large-transaction memory spikes: If restarts coincide with large transactions or a high volume of change events during streaming, the connector can experience memory spikes. For symptoms and mitigation options, see Microsoft SQL Server CDC Source V2 (Debezium) Connector.
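
For the connectivity checks above, a basic TCP probe can help separate firewall problems from credential problems. The following Python sketch is a minimal example under stated assumptions: can_reach is a hypothetical helper, and it must be run from a host on the same network path as the egress endpoint to be meaningful. It confirms only that the port accepts TCP connections, not that TLS negotiation or SQL Server login succeeds.

```python
import socket


def can_reach(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

If the probe succeeds but the connector still fails to connect, the cause is more likely to be authentication, encryption settings, or the CDC configuration than the network.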

For persistent failures, share logs with Confluent Support.

How do I handle schema changes in my SQL Server database?

By default, the connector handles schema evolution:

Adding columns: The connector automatically detects new columns and includes them in change events. For rows that existed before the change, a new nullable column holds NULL and a new column with a default holds its default value.
Dropping columns: Removed columns no longer appear in change events.
Renaming columns: The connector records the column rename.
Changing primary key: The connector uses the updated primary key.
Changing column data type: The connector uses the updated data type.

Serialization issues can occur when writing new records to the Kafka topic after a schema change if the change is incompatible with already registered schemas of the topic or with the existing Schema Registry settings of the environment.

To manage schema changes, follow these best practices:

Test schema changes: Test DDL changes in a non-production environment first.
Monitor for errors: After schema changes, check connector logs for issues.

Re-enable CDC after schema changes: Some schema changes require re-enabling CDC.

-- Disable CDC on table
EXEC sys.sp_cdc_disable_table
  @source_schema = N'dbo',
  @source_name = N'YourTable',
  @capture_instance = 'all';

-- Make schema changes
ALTER TABLE dbo.YourTable ADD NewColumn INT;

-- Re-enable CDC on table
EXEC sys.sp_cdc_enable_table
  @source_schema = N'dbo',
  @source_name = N'YourTable',
  @role_name = NULL;

Next Steps

For an example that shows fully-managed Confluent Cloud connectors in action with Confluent Cloud for Apache Flink, see the Cloud ETL Demo. This example also shows how to use Confluent CLI to manage your resources in Confluent Cloud.

Secret manager managed configuration	Type
`database.hostname`	`STRING`
`database.port`	`INT`
`database.user`	`STRING`
`database.password`	`PASSWORD`
