Configuration Properties
The MySQL Source Connector can be configured using a variety of configuration properties.
database.hostname
IP address or hostname of the MySQL database server.
- Type: String
- Importance: High
database.port
Integer port number of the MySQL database server.
- Type: Integer
- Importance: Low
- Default:
3306
database.user
Username to use when connecting to the MySQL database server.
- Type: String
- Importance: High
database.password
Password to use when connecting to the MySQL database server.
- Type: Password
- Importance: High
database.server.name
Logical name that identifies and provides a namespace for the particular MySQL database server/cluster being monitored. The logical name should be unique across all other connectors, since it is used as a prefix for all Kafka topic names emanating from this connector. Defaults to host:port, where host is the value of the database.hostname property and port is the value of the database.port property. Confluent recommends changing the default to a meaningful name.
- Type: String
- Importance: Low
- Default:
database.hostname:database.port
database.server.id
A numeric ID of this database client, which must be unique across all currently-running database processes in the MySQL cluster. This connector joins the MySQL database cluster as another server (with this unique ID) so it can read the binlog. By default, a random number is generated between 5400 and 6400. Confluent recommends setting a value.
- Type: Integer
- Importance: Low
- Default: random
database.history.kafka.topic
The full name of the Kafka topic where the connector will store the database schema history.
- Type: String
- Importance: High
database.history.kafka.bootstrap.servers
A list of host/port pairs that the connector will use for establishing an initial connection to the Kafka cluster. This connection will be used for retrieving database schema history previously stored by the connector, and for writing each DDL statement read from the source database. This should point to the same Kafka cluster used by the Kafka Connect process.
- Type: List of Strings
- Importance: High
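As a rough illustration of how the connection and history properties above fit together, the following Python sketch assembles a connector configuration and serializes it as JSON of the kind typically submitted to the Kafka Connect REST API. The connector name, host names, credentials, server ID, and topic name are hypothetical placeholders, and the list of required keys simply mirrors the properties marked Importance: High above.

```python
import json

# All values below are hypothetical placeholders; substitute your own.
connector_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.internal",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.server.name": "inventory_server",
    "database.server.id": "5501",
    "database.history.kafka.topic": "schema-changes.inventory",
    "database.history.kafka.bootstrap.servers": "broker1:9092,broker2:9092",
}

# Properties listed with "Importance: High" in this section.
REQUIRED = [
    "database.hostname",
    "database.user",
    "database.password",
    "database.history.kafka.topic",
    "database.history.kafka.bootstrap.servers",
]


def missing_required(config):
    """Return the high-importance keys that are absent or empty."""
    return [key for key in REQUIRED if not config.get(key)]


# Wrap the settings in the {"name": ..., "config": ...} envelope.
payload = json.dumps(
    {"name": "mysql-source-example", "config": connector_config}, indent=2
)
print(payload)
```

Checking for missing keys before submitting the configuration gives an earlier and clearer error than waiting for the connector to fail at startup.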
database.whitelist
An optional comma-separated list of regular expressions that match database names to be monitored. Any database name not included in the whitelist will be excluded from monitoring. By default all databases will be monitored. May not be used with database.blacklist.
- Type: List of Strings
- Importance: Low
- Default: empty string
database.blacklist
An optional comma-separated list of regular expressions that match database names to be excluded from monitoring. Any database name not included in the blacklist will be monitored. May not be used with database.whitelist.
- Type: List of Strings
- Importance: Low
- Default: empty string
table.whitelist
An optional comma-separated list of regular expressions that match fully-qualified table identifiers for tables to be monitored. Any table not included in the whitelist will be excluded from monitoring. Each identifier is of the form databaseName.tableName. By default the connector will monitor every non-system table in each monitored database. May not be used with table.blacklist.
- Type: List of Strings
- Importance: Low
- Default: empty string
table.blacklist
An optional comma-separated list of regular expressions that match fully-qualified table identifiers for tables to be excluded from monitoring. Any table not included in the blacklist will be monitored. Each identifier is of the form databaseName.tableName. May not be used with table.whitelist.
- Type: List of Strings
- Importance: Low
- Default: empty string
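The whitelist and blacklist semantics described above can be sketched in a few lines of Python. This is only an approximation: the connector evaluates Java regular expressions, and the sketch assumes that each pattern must match the whole identifier. The function names and the sample identifiers are hypothetical.

```python
import re
from typing import List, Optional


def parse_patterns(value: Optional[str]) -> List["re.Pattern"]:
    """Split a comma-separated property value into compiled regexes."""
    if not value:
        return []
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]


def is_monitored(identifier: str,
                 whitelist: Optional[str] = None,
                 blacklist: Optional[str] = None) -> bool:
    """Decide whether a database or table identifier would be monitored."""
    if whitelist and blacklist:
        # The two properties are mutually exclusive.
        raise ValueError("whitelist and blacklist may not be used together")
    if whitelist:
        # Assumption: a pattern must match the entire identifier.
        return any(p.fullmatch(identifier) for p in parse_patterns(whitelist))
    if blacklist:
        return not any(p.fullmatch(identifier) for p in parse_patterns(blacklist))
    # With neither property set, everything is monitored.
    return True


print(is_monitored("inventory.orders", whitelist=r"inventory\..*"))
print(is_monitored("inventory.orders", blacklist=r"inventory\..*"))
```

Note that the dot in a table identifier is a regex metacharacter, so a literal dot is escaped as `\.` in the patterns.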
column.blacklist
An optional comma-separated list of regular expressions that match the fully-qualified names of columns that should be excluded from change event message values. Fully-qualified names for columns are of the form databaseName.tableName.columnName, or databaseName.schemaName.tableName.columnName.
- Type: List of Strings
- Importance: Low
- Default: n/a
column.truncate.to.length.chars
An optional comma-separated list of regular expressions that match the fully-qualified names of character-based columns. The column values are truncated in the change event message values if the field values are longer than the specified number of characters. Multiple properties with different lengths can be used in a single configuration, although in each the length must be a positive integer. Fully-qualified names for columns are in the form databaseName.tableName.columnName, or databaseName.schemaName.tableName.columnName.
- Type: List of Strings
- Importance: Low
- Default: n/a
column.mask.with.length.chars
An optional comma-separated list of regular expressions that match the fully-qualified names of character-based columns. The column values are replaced in the change event message values with a field value consisting of the specified number of asterisk (*) characters. Multiple properties with different lengths can be used in a single configuration, although in each the length must be a positive integer. Fully-qualified names for columns are in the form databaseName.tableName.columnName, or databaseName.schemaName.tableName.columnName.
- Type: List of Strings
- Importance: Low
- Default: n/a
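The truncation and masking behavior described above can be illustrated with a short Python sketch. It assumes that the word length in the property names column.truncate.to.length.chars and column.mask.with.length.chars stands for a positive integer chosen by the user. The helper names and sample values are hypothetical and only model the effect on a single field value.

```python
def truncate_property(length: int) -> str:
    """Build the truncation property name for a given length."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return f"column.truncate.to.{length}.chars"


def mask_property(length: int) -> str:
    """Build the masking property name for a given length."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return f"column.mask.with.{length}.chars"


def truncate_value(value: str, length: int) -> str:
    """Shorten a value only if it is longer than the limit."""
    return value if len(value) <= length else value[:length]


def mask_value(length: int) -> str:
    """Replace a value with the given number of asterisks."""
    return "*" * length


# Hypothetical configuration using two different truncation lengths.
config = {
    truncate_property(20): r"inventory\.customers\.notes",
    truncate_property(5): r"inventory\.customers\.zip",
    mask_property(10): r"inventory\.customers\.ssn",
}
print(config)
print(truncate_value("abcdefgh", 5), mask_value(10))
```

Because the length is part of the property name, each distinct length gets its own property, and each property carries its own list of column patterns.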
column.propagate.source.type
An optional comma-separated list of regular expressions that match the fully-qualified names of columns whose original type and length should be added as a parameter to the corresponding field schemas in the emitted change messages. The schema parameters __debezium.source.column.type, __debezium.source.column.length and __debezium.source.column.scale are used to propagate the original type name, length (for variable-width types), and scale, respectively. Useful to properly size corresponding columns in sink databases. Fully-qualified names for columns are in the form databaseName.tableName.columnName, or databaseName.schemaName.tableName.columnName.
- Type: List of Strings
- Importance: Low
- Default: n/a
time.precision.mode
Time, date, and timestamps can be represented with different kinds of precision. Settings include the following:
adaptive_time_microseconds (the default) captures the date, datetime and timestamp values exactly as they are in the database. It uses either millisecond, microsecond, or nanosecond precision values that are based on the database column’s type. The exception is TIME type fields, which are always captured as microseconds.
adaptive (deprecated) captures the time and timestamp values exactly as they are in the database using either millisecond, microsecond, or nanosecond precision values. These values are based on the database column type.
connect represents time and timestamp values using Kafka Connect’s built-in representations for Time, Date, and Timestamp. It uses millisecond precision regardless of database column precision.
- Type: String
- Importance: Low
- Default:
adaptive_time_microseconds
decimal.handling.mode
Specifies how the connector should handle values for DECIMAL and NUMERIC columns. Settings include the following:
precise (the default) represents them precisely using java.math.BigDecimal values, which are represented in change events in a binary form.
double represents them using double values, which may result in a loss of precision but is far easier to use.
string encodes values as formatted strings, which are easy to consume, but semantic information about the real type is lost.
- Type: String
- Importance: Low
- Default:
precise
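The trade-off between the three decimal.handling.mode settings can be seen with a small Python sketch. Python's Decimal and float stand in here for java.math.BigDecimal and a Java double; the sample value is hypothetical, and the sketch does not reproduce the connector's actual wire encoding.

```python
from decimal import Decimal

# Hypothetical value from a DECIMAL(29, 9) column.
raw = "12345678901234567890.123456789"

as_precise = Decimal(raw)  # analogous to precise: exact decimal value
as_double = float(raw)     # analogous to double: nearest binary floating-point value
as_string = raw            # analogous to string: exact text, numeric type information lost

# The text form converts back to the exact value.
print(Decimal(as_string) == as_precise)

# The double is only an approximation of the original value.
print(Decimal(as_double) == as_precise)
print(Decimal(as_double))
```

Values with few significant digits often survive the conversion to double unchanged, which is why the loss of precision can go unnoticed until large or high-scale values appear.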
bigint.unsigned.handling.mode
Specifies how BIGINT UNSIGNED columns should be represented in change events. Settings include the following:
precise uses java.math.BigDecimal to represent values, which are encoded in the change events using a binary representation and Kafka Connect’s org.apache.kafka.connect.data.Decimal type.
long (the default) represents values using Java’s long, which may not offer the precision but will be far easier to use in consumers.
long is usually the preferable setting. The precise setting should only be used when working with values of 2^63 or larger (these values cannot be conveyed using long).
- Type: String
- Importance: Low
- Default:
long
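The 2^63 limit mentioned above follows from the range of a signed 64-bit integer. The Python sketch below checks whether a BIGINT UNSIGNED value fits into that range and shows what the same 64 bits would mean if read as a signed number. The helper names are hypothetical, and the sketch makes no claim about how the connector itself treats out-of-range values.

```python
JAVA_LONG_MIN = -(2 ** 63)
JAVA_LONG_MAX = 2 ** 63 - 1
BIGINT_UNSIGNED_MAX = 2 ** 64 - 1


def fits_in_java_long(value: int) -> bool:
    """True if the value can be represented as a signed 64-bit integer."""
    return JAVA_LONG_MIN <= value <= JAVA_LONG_MAX


def reinterpret_as_signed_64(value: int) -> int:
    """Read the 64-bit pattern of an unsigned value as a signed value."""
    raw = value.to_bytes(8, "big", signed=False)
    return int.from_bytes(raw, "big", signed=True)


for candidate in (42, JAVA_LONG_MAX, 2 ** 63, BIGINT_UNSIGNED_MAX):
    print(candidate, fits_in_java_long(candidate), reinterpret_as_signed_64(candidate))
```

Only the upper half of the BIGINT UNSIGNED range is affected, so columns that never hold such large values lose nothing with the long setting.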
include.schema.changes
Boolean value that specifies whether the connector should publish changes in the database schema to a Kafka topic with the same name as the logical server name (see database.server.name). Each schema change will be recorded using a key that contains the database name and whose value includes the DDL statement(s). This is independent of how the connector internally records database history.
- Type: String
- Importance: Low
- Default:
true
include.query
Boolean value that specifies whether the connector should include the original SQL query that generated the change event. Note: This option requires MySQL to be configured with the binlog_rows_query_log_events option set to ON. The query is not present for events generated by the snapshot process. WARNING: Enabling this option may expose tables or fields that are explicitly blacklisted or masked, because the original SQL statement is included in the change event.
- Type: String
- Importance: Low
- Default:
false
event.deserialization.failure.handling.mode
Specifies how the connector should react to exceptions during deserialization of binlog events.
fail propagates the exception (indicating the problematic event and its binlog offset), causing the connector to stop.
warn causes the problematic event to be skipped and the problematic event and its binlog offset to be logged (make sure that the logger is set to the WARN or ERROR level).
ignore causes the problematic event to be skipped.
- Type: String
- Importance: Low
- Default:
fail
inconsistent.schema.handling.mode
Specifies how the connector should react to binlog events that relate to tables that are not present in the internal schema representation (i.e. the internal representation is not consistent with the database).
fail throws an exception (indicating the problematic event and its binlog offset), causing the connector to stop.
warn causes the problematic event to be skipped and the problematic event and its binlog offset to be logged (make sure that the logger is set to the WARN or ERROR level).
ignore causes the problematic event to be skipped.
- Type: String
- Importance: Low
- Default:
fail
max.queue.size
Positive integer value that specifies the maximum size of the blocking queue into which change events read from the database log are placed before they are written to Kafka. This queue can provide backpressure to the binlog reader when, for example, writes to Kafka are slower or if Kafka is not available. Events that appear in the queue are not included in the offsets periodically recorded by this connector. Defaults to 8192, and should always be larger than the maximum batch size specified in the max.batch.size property.
- Type: Integer
- Importance: Low
- Default:
8192
max.batch.size
Positive integer value that specifies the maximum size of each batch of events that should be processed during each iteration of this connector. Defaults to 2048.
- Type: Integer
- Importance: Low
- Default:
2048
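The relationship between max.queue.size and max.batch.size lends itself to a simple pre-flight check. The Python sketch below is a hypothetical helper, not part of the connector: it applies the documented defaults when a property is absent and reports any violation of the rules stated above.

```python
DEFAULT_MAX_QUEUE_SIZE = 8192
DEFAULT_MAX_BATCH_SIZE = 2048


def check_queue_settings(config: dict) -> list:
    """Return a list of human-readable problems; empty means the settings look fine."""
    queue_size = int(config.get("max.queue.size", DEFAULT_MAX_QUEUE_SIZE))
    batch_size = int(config.get("max.batch.size", DEFAULT_MAX_BATCH_SIZE))
    problems = []
    if queue_size <= 0:
        problems.append("max.queue.size must be a positive integer")
    if batch_size <= 0:
        problems.append("max.batch.size must be a positive integer")
    if queue_size <= batch_size:
        problems.append("max.queue.size should be larger than max.batch.size")
    return problems


print(check_queue_settings({}))
print(check_queue_settings({"max.queue.size": "1024"}))
```

The second call reports a problem because lowering only the queue size to 1024 leaves it below the default batch size of 2048.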
poll.interval.ms
Positive integer value that specifies the number of milliseconds the connector should wait during each iteration for new change events to appear. Defaults to 1000 milliseconds, or 1 second.
- Type: Integer
- Importance: Low
- Default:
1000
connect.timeout.ms
A positive integer value that specifies the maximum time in milliseconds this connector should wait after trying to connect to the MySQL database server before timing out. Defaults to 30000 milliseconds, or 30 seconds.
- Type: Integer
- Importance: Low
- Default:
30000
gtid.source.includes
A comma-separated list of regular expressions that match source UUIDs in the GTID set used to find the binlog position in the MySQL server. Only the GTID ranges that have sources matching one of these include patterns will be used. May not be used with gtid.source.excludes.
- Type: List of Strings
- Importance: Low
gtid.source.excludes
A comma-separated list of regular expressions that match source UUIDs in the GTID set used to find the binlog position in the MySQL server. Only the GTID ranges that have sources matching none of these exclude patterns will be used. May not be used with gtid.source.includes.
- Type: List of Strings
- Importance: Low
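The include and exclude filtering of GTID sources described above can be approximated in Python as follows. The sketch assumes a GTID set written as comma-separated entries of the form source_uuid:interval, and that a pattern must match the whole source UUID. The function name and the UUIDs are hypothetical, and the connector itself evaluates Java regular expressions.

```python
import re
from typing import Optional


def filter_gtid_set(gtid_set: str,
                    includes: Optional[str] = None,
                    excludes: Optional[str] = None) -> str:
    """Keep only the GTID entries whose source UUID passes the filter."""
    if includes and excludes:
        raise ValueError("includes and excludes may not be used together")

    def compile_all(value: str):
        return [re.compile(p.strip()) for p in value.split(",") if p.strip()]

    kept = []
    for entry in gtid_set.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Everything before the first colon is the source UUID.
        source_uuid = entry.split(":", 1)[0]
        if includes:
            keep = any(p.fullmatch(source_uuid) for p in compile_all(includes))
        elif excludes:
            keep = not any(p.fullmatch(source_uuid) for p in compile_all(excludes))
        else:
            keep = True
        if keep:
            kept.append(entry)
    return ",".join(kept)


gtids = ("3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5,"
         "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-20")
print(filter_gtid_set(gtids, includes="3e11fa47-.*"))
print(filter_gtid_set(gtids, excludes="3e11fa47-.*"))
```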
gtid.new.channel.position
When set to latest, and when the connector sees a new GTID channel, the connector starts consuming from the last executed transaction in that GTID channel. If set to earliest, the Debezium connector starts reading that channel from the first available (not purged) GTID position. earliest is useful when you have an active-passive MySQL setup where Debezium is connected to the master. In this case, during failover the slave with a new UUID (and GTID channel) starts receiving writes before Debezium is connected. These writes would be lost when using latest.
- Type: String
- Importance: Low
- Default:
latest
tombstones.on.delete
Controls whether a tombstone event should be generated after a delete event. When set to true, the delete operations are represented by a delete event and a subsequent tombstone event. When set to false, only a delete event is sent. Emitting the tombstone event (the default behavior) allows Kafka to completely delete all events pertaining to the given key once the source record is deleted.
- Type: String
- Importance: Low
- Default:
true
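Why the tombstone matters can be shown with a deliberately simplified model of Kafka log compaction. In the Python sketch below, a tombstone is a record whose value is None; the record keys and the event payloads are hypothetical, and real compaction runs asynchronously and retains tombstones for a configurable period before removing them.

```python
def compact(log):
    """Keep the latest value per key, then drop keys whose latest value is a tombstone."""
    latest = {}
    for key, value in log:
        latest[key] = value
    return {key: value for key, value in latest.items() if value is not None}


# tombstones.on.delete=true: delete event followed by a tombstone.
log_with_tombstone = [
    ("customer-1", {"op": "c", "name": "Anne"}),
    ("customer-1", {"op": "d"}),  # delete event
    ("customer-1", None),         # tombstone
]

# tombstones.on.delete=false: only the delete event is written.
log_without_tombstone = log_with_tombstone[:2]

print(compact(log_with_tombstone))
print(compact(log_without_tombstone))
```

With the tombstone, nothing remains for the key after compaction. Without it, the delete event itself stays in the topic as the latest record for that key.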
ddl.parser.mode
Controls which parser should be used for parsing DDL statements when building up the meta-model of the captured database structure. Can be set to legacy (for the legacy hand-written parser implementation) or antlr (for the ANTLR-based implementation introduced in Debezium 0.8.0). While the legacy parser remains the default for Debezium 0.8.x, please try out the new implementation and report back any issues you encounter. The new parser is the default as of 0.9, and the old implementation will be removed in a future version.
- Type: String
- Importance: Low
- Default:
legacy for Debezium 0.8.x and antlr for Debezium 0.9 (and later)
Additional advanced configuration properties and details can be found in the Debezium connector properties documentation.
Note
Portions of the information provided here derive from documentation originally produced by the Debezium Community. Work produced by Debezium is licensed under Creative Commons 3.0.