Limits for Fully-Managed Connectors for Confluent Cloud¶
Refer to the following for usage limitations.
Connector Configurations¶
Do not use a backslash (\) in passwords or secret keys when setting up the connector. It may lead to connection failures.
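A pre-flight check can catch a backslash before the connector is created. The following is a minimal sketch only: the function name and the field names passed to it are hypothetical and are not part of any Confluent API.

```python
def find_backslash_fields(config, secret_fields):
    """Return the secret field names whose values contain a backslash.

    `config` is a plain dict of connector settings; `secret_fields` is the
    set of keys to inspect. Both are illustrative assumptions.
    """
    return sorted(
        name for name in secret_fields
        if "\\" in str(config.get(name, ""))
    )


if __name__ == "__main__":
    example = {"connection.password": "pa\\ss", "kafka.api.secret": "abc123"}
    print(find_backslash_fields(example, {"connection.password", "kafka.api.secret"}))
```

An empty result means none of the inspected values contains a backslash.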
Schema Registry Enabled Environments¶
Schema Registry enabled environments support only the default schema subject naming
strategy TopicNameStrategy. Subject naming strategies RecordNameStrategy
and TopicRecordNameStrategy are not currently supported.
Sink connectors¶
You cannot add a sink connector’s dead letter queue (DLQ) topic to the list of topics consumed by the same sink connector (to prevent an infinite loop).
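The rule above can be checked before submitting a configuration. This is a sketch under assumptions: it treats the sink's topic list as the comma-separated string used by the Kafka Connect `topics` property, and the function name and the example DLQ topic name are hypothetical.

```python
def dlq_in_consumed_topics(topics, dlq_topic):
    """Return True if the DLQ topic appears in the sink's comma-separated topic list."""
    consumed = {name.strip() for name in topics.split(",") if name.strip()}
    return dlq_topic in consumed


if __name__ == "__main__":
    # Hypothetical topic names, for illustration only.
    print(dlq_in_consumed_topics("orders, payments", "dlq-example"))
    print(dlq_in_consumed_topics("orders, dlq-example", "dlq-example"))
```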
Source connectors¶
An internal Apache Kafka® configuration property (max.request.size) controls the
maximum producer request size. This size is set at 8 MB maximum for Basic, Standard, and Enterprise
clusters, and 20 MB for Dedicated clusters. If you want to run a source
connector making requests larger than 8 MB, you must run the connector in a
Dedicated cluster.
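The size limits above can be expressed as a simple lookup. This sketch assumes 1 MB means 1024 * 1024 bytes, which the text does not specify, and the lowercase cluster type strings and function name are hypothetical.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Maximum producer request size per cluster type, as stated in the text above.
MAX_REQUEST_BYTES = {
    "basic": 8 * MB,
    "standard": 8 * MB,
    "enterprise": 8 * MB,
    "dedicated": 20 * MB,
}


def request_fits(cluster_type, request_bytes):
    """Return True if a producer request of this size is within the cluster's limit."""
    return request_bytes <= MAX_REQUEST_BYTES[cluster_type.lower()]


if __name__ == "__main__":
    print(request_fits("standard", 9 * MB))
    print(request_fits("dedicated", 9 * MB))
```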
Automatic topic creation¶
If the Kafka broker configuration is set to auto.create.topics.enable=false
and you delete a topic that was automatically created, the connector does not
automatically create a new topic, even if the Connect worker has the property
topic.creation.enable=true. To work around this issue, make a minor
modification to the connector configuration so the connector can restart. The
connector creates the new topic when it restarts after the configuration change.
Connector-Specific Limitations¶
Supported connector limitations¶
See the following limitations for supported connectors.
ActiveMQ Source Connector¶
The following are the limitations for the ActiveMQ Source Connector for Confluent Cloud.
- The connector does not support Advanced Message Queuing Protocol (AMQP).
AlloyDB Sink Connector¶
The following are limitations for the AlloyDB Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- AlloyDB provides only a private IP address, so it cannot be accessed publicly. Set up the AlloyDB Auth Proxy and ensure that it is accessible to the connector.
- The database and Kafka cluster should be in the same region.
- For tombstone records, set delete.enabled to true.
Amazon CloudWatch Logs Source Connector¶
The following are limitations for the Amazon CloudWatch Logs Source Connector for Confluent Cloud.
- The connector can only read from the top 50 log streams (ordered alphabetically).
- The connector does not support Protobuf.
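To estimate which log streams fall inside the 50-stream window, the following sketch sorts names and keeps the first 50. It assumes plain lexicographic ordering of the stream names, and the service's actual ordering may differ. The function name is hypothetical.

```python
def readable_log_streams(stream_names, limit=50):
    """Return the first `limit` stream names in lexicographic order."""
    return sorted(stream_names)[:limit]


if __name__ == "__main__":
    names = [f"stream-{i:03d}" for i in range(60)]
    kept = readable_log_streams(names)
    print(len(kept), kept[0], kept[-1])
```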
Amazon CloudWatch Metrics Sink Connector¶
The Amazon CloudWatch Metrics region must be the same region where your Confluent Cloud cluster is located and where you are running the Amazon CloudWatch Metrics Sink Connector for Confluent Cloud.
Amazon DynamoDB Sink Connector¶
The following are limitations for the Amazon DynamoDB Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Amazon DynamoDB database and Kafka cluster should be in the same region.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
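The SMT list above can be checked against a connector configuration. This sketch relies on the standard Kafka Connect convention of a comma-separated `transforms` property plus one `transforms.<alias>.type` property per alias. The function name and the aliases in the example are hypothetical.

```python
UNSUPPORTED_SINK_SMTS = {
    "org.apache.kafka.connect.transforms.TimestampRouter",
    "io.confluent.connect.transforms.MessageTimestampRouter",
    "io.confluent.connect.transforms.ExtractTopic$Header",
    "io.confluent.connect.transforms.ExtractTopic$Key",
    "io.confluent.connect.transforms.ExtractTopic$Value",
    "io.confluent.connect.cloud.transforms.TopicRegexRouter",
}


def unsupported_smts(config):
    """Return the transform aliases whose type is in the unsupported list."""
    aliases = [a.strip() for a in config.get("transforms", "").split(",") if a.strip()]
    return [a for a in aliases if config.get(f"transforms.{a}.type") in UNSUPPORTED_SINK_SMTS]


if __name__ == "__main__":
    example = {
        "transforms": "route, mask",  # hypothetical aliases
        "transforms.route.type": "org.apache.kafka.connect.transforms.TimestampRouter",
        "transforms.mask.type": "org.apache.kafka.connect.transforms.MaskField$Value",
    }
    print(unsupported_smts(example))
```

The same set applies to the other sink connectors on this page that list these six transformations.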
Amazon DynamoDB CDC Source Connector¶
The Amazon DynamoDB CDC Source Connector for Confluent Cloud has the following limitations:
- A table can only be processed by one task at a given time. Thus configuring more tasks than tables results in some tasks remaining idle. The maximum possible number of tasks is equal to the total number of configured tables.
- The dynamodb.snapshot.max.poll.records configuration property tunes the maximum number of records that can be returned in a single DynamoDB read operation. Each read operation has a 1 MB size limit.
- The dynamodb.cdc.max.poll.records configuration property tunes the maximum number of records that can be returned in a single DynamoDB Streams GetRecords operation. Each GetRecords operation has a 1 MB size limit.
- You can run up to two connectors (or consumers) on a single DynamoDB table stream. Running more than two is not supported by AWS DynamoDB.
- DynamoDB Streams is not supported in PrivateLink. The connector can operate only in SNAPSHOT mode in PrivateLink.
- During SNAPSHOT mode, if the table’s pricing mode is provisioned with auto-scaling off, the connector throughput is limited by RCU constraints defined on the table.
- A task might fail if a SNAPSHOT for a table takes longer than 24 hours, because CDC items are removed after 24 hours. To work around this, ensure there are not too many tables assigned to each task, and that their table RCU is adequate.
- The connector requires setting StreamViewType on the stream to either NEW_IMAGE or NEW_AND_OLD_IMAGES to source data from DynamoDB Streams.
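Because one table is processed by one task at a time, the number of useful tasks is capped by the table count. A small sketch of that arithmetic follows. The function and parameter names are hypothetical.

```python
def task_allocation(configured_tasks, table_count):
    """Return (active_tasks, idle_tasks) when each table is handled by one task at a time."""
    active = min(configured_tasks, table_count)
    idle = max(0, configured_tasks - table_count)
    return active, idle


if __name__ == "__main__":
    # Five tasks configured for three tables: two tasks have no table to process.
    print(task_allocation(5, 3))
```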
Amazon Kinesis Source Connector¶
- The Amazon Kinesis Source Connector for Confluent Cloud does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.HoistField$Value
org.apache.kafka.connect.transforms.ReplaceField$Value
org.apache.kafka.connect.transforms.ExtractField$Value
io.confluent.connect.transforms.Filter
io.confluent.connect.transforms.ExtractTopic
Amazon Redshift Sink Connector¶
The following are limitations for the Amazon Redshift Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Confluent Cloud cluster and the target Redshift cluster must be in the same AWS region.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- The connector cannot consume data containing nested structs.
- The connector does not support the Array data type. For supported data types, see Data types.
- The connector does not support Avro schemas that contain decimal logical types. For a better understanding of numeric data types, see this blog post: Bytes, Decimals, Numerics and oh my.
Amazon SQS Source Connector¶
There are no current limitations for the Amazon SQS Source Connector for Confluent Cloud.
Amazon S3 Sink Connector¶
The following are limitations for the Amazon S3 Sink Connector for Confluent Cloud.
The data system the sink connector is connecting to should be in the same region as your Confluent Cloud cluster. If you use a different region or cloud platform, be aware that you may incur additional data transfer charges. Contact your Confluent account team or Confluent Support if you need to use Confluent Cloud and connect to a data system that is in a different region or on a different cloud platform.
One task can handle up to 100 partitions.
Partitioning (hourly or daily) is based on Kafka record time.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
- You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
- You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10-minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
Performing a compatible schema change may cause the connector to flush data prior to whatever is configured for flush.size.
When the connector observes a schema change in the value field of an enum type, the schema compatibility check fails in all modes except when schema.compatibility is set to NONE. In the event of this failure, the connector writes the record to the DLQ.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
The S3 Sink connector does not allow recursive schema types. Writing to Parquet output format with a recursive schema type results in a StackOverflowError.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
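The interaction between flush.size and rotate.schedule.interval.ms described in the note above can be illustrated with a deliberately simplified model. This is a sketch, not the connector's implementation: it assumes a single partition, takes the number of records that arrived in each rotation interval as input, and assumes any partially filled file is closed when the interval ends. The function name is hypothetical.

```python
def files_written(records_per_interval, flush_size):
    """Return the record count of each file written for one partition.

    A file is closed as soon as it holds `flush_size` records, or when the
    rotation interval ends, whichever comes first (simplified model).
    """
    if flush_size < 1:
        raise ValueError("flush_size must be at least 1")
    files = []
    for count in records_per_interval:
        while count >= flush_size:
            files.append(flush_size)  # flush.size condition met first
            count -= flush_size
        if count:
            files.append(count)       # rotation interval closes the partial file
    return files


if __name__ == "__main__":
    # The example from the note: 500 records in each of two 10-minute intervals.
    print(files_written([500, 500], flush_size=1000))
```

With 500 records in each of two intervals and flush.size=1000, the model yields two files of 500 records, which matches the example in the note.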
Amazon S3 Source Connector¶
The following are limitations for the Amazon S3 Source Connector for Confluent Cloud.
- For a new bucket, you need to create a new connector with an unused name. If you reconfigure an existing connector to source from the new bucket, or create a connector with a name that is used for another connector, the connector will not source from the beginning of data stored in the bucket. This is because the connector will maintain offsets tied to the connector name.
- The connector limits the number of objects it can index in a bucket to 1,000,000, as set by the bucket.listing.max.objects.threshold property. This limit helps manage large directories effectively. You must keep the object count under 1,000,000 to comply with this bucket limitation.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.HoistField$Value
org.apache.kafka.connect.transforms.HoistField$Key
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.Filter
io.confluent.connect.transforms.Filter$Key
io.confluent.connect.transforms.Filter$Value
AWS Lambda Sink Connector¶
The Confluent Cloud cluster and your AWS Lambda project must be in the same AWS region.
Azure Blob Storage Sink Connector¶
The following are limitations for the Azure Blob Storage Sink Connector for Confluent Cloud.
The Azure Blob Storage Container should be in the same region as your Confluent Cloud cluster. If you use a different region, be aware that you may incur additional data transfer charges. Contact Confluent Support if you need to use Confluent Cloud and Azure Blob storage in different regions.
You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
One task can handle up to 100 partitions.
Partitioning (hourly or daily) is based on Kafka record time.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
- You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
- You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10-minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
schema.compatibility is set to NONE.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Azure Blob Storage Source Connector¶
The following are limitations for the Azure Blob Storage Source Connector for Confluent Cloud.
You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
The connector ignores any object with a name that does not start with the configured topics.dir directory. This name is topics/ by default.
The connector uses the connector name to store offsets that identify how much of the container it has processed. If you delete a connector and then use the same connector name for a new connector, the new connector will not reprocess data from the beginning of the container. The progress for the deleted connector is saved and the new connector starts from where the original connector’s processing ended. The connector can start processing earlier container data if the corresponding entry in the offset topic is cleared.
The connector limits the number of objects it can index in a bucket to 1,000,000, as set by the bucket.listing.max.objects.threshold property. This limit helps manage large directories effectively. You must keep the object count under 1,000,000 to comply with this bucket limitation.
For a new container, you need to create a new connector. If you reconfigure an existing connector to source from the new container, the connector will not source from the beginning of data stored in the container.
The connector will not reload data during the following scenarios:
- Renaming a file that the connector has already read.
- Uploading a newer version of an existing file with a new record.
If a shared access signature (SAS) token is used, the connector requires an account-level SAS token. A service-level (container) SAS token will not work.
There are compatibility constraints for certain input data formats.
Output data format      Supported input formats
PROTOBUF, JSON_SR       BYTES, AVRO
JSON, AVRO, STRING      AVRO, JSON, BYTES, STRING
BYTES                   STRING, BYTES
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.HoistField$Value
org.apache.kafka.connect.transforms.HoistField$Key
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.Filter
io.confluent.connect.transforms.Filter$Key
io.confluent.connect.transforms.Filter$Value
Azure Cognitive Search Sink Connector¶
The following are limitations for Azure Cognitive Search Sink Connector for Confluent Cloud.
- Batching multiple metrics: The connector tries to batch metrics in a single payload. The maximum payload size is 16 megabytes for each API request. For additional details, refer to Size limits per API call.
- The Azure Cognitive Search service must be in the same region as your Confluent Cloud cluster.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Azure Cosmos DB Sink Connector¶
The following are limitations for Azure Cosmos DB Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Azure Cosmos DB must be in the same region as your Confluent Cloud cluster.
- You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
- This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
- The Kafka topic must not contain tombstone records. The connector does not handle tombstone or null values.
Azure Cosmos DB Source Connector¶
The following are limitations for the Azure Cosmos DB Source Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
- This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
- Azure Cosmos DB serverless mode is not supported.
- The Kafka record key is serialized by StringConverter.
Azure Data Lake Storage Gen2 Sink Connector¶
The following are limitations for the Azure Data Lake Storage Gen2 Sink Connector for Confluent Cloud.
Azure Data Lake storage should be in the same region as your Confluent Cloud cluster. If you use a different region, be aware that you may incur additional data transfer charges. Contact Confluent Support if you need to use Confluent Cloud and Azure Data Lake storage in different regions.
Input format JSON to output format AVRO does not work for the preview connector.
One task can handle up to 100 partitions.
Partitioning (hourly or daily) is based on Kafka record time.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
- You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
- You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10-minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
schema.compatibility is set to NONE.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
Using a recursive schema type is not allowed and will result in a StackOverflowError.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
Azure Event Hubs Source Connector¶
The following are limitations for the Azure Event Hubs Source Connector for Confluent Cloud.
- max.events: 499 is the maximum number of events allowed. Defaults to 50.
- You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
- This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.HoistField$Value
Azure Functions Sink Connector¶
The following are limitations for Azure Functions Sink Connector for Confluent Cloud.
- The target Azure Function must be in the same region as your Confluent Cloud cluster.
- The target Azure Function must be on a host using a public IP address. For more information, see Manage Networking for Confluent Cloud Connectors.
Azure Log Analytics Sink Connector¶
The following are limitations for Azure Log Analytics Sink Connector for Confluent Cloud.
- There is a 30 MB per post size limit when posting to the Azure Monitor Data Collector API. This size limit is for a single post.
- There is a 32 KB field value size limit. If a field value is greater than 32 KB, the data is truncated.
- The recommended maximum number of fields for a given type is 50. This is a practical limit based on usability testing.
- Tables in Azure Log Analytics workspaces can support 500 columns maximum.
- Column names can have a maximum of 45 characters.
- Table names can have a maximum of 100 characters. Table names must start with letters and can only contain letters, numbers, and the underscore character (_).
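The naming and size rules above lend themselves to a pre-flight check. The following is a sketch under stated assumptions: "letters" is taken to mean ASCII letters, 32 KB is taken to mean 32 * 1024 bytes measured on the UTF-8 encoding, and all function names are hypothetical.

```python
import re

# Starts with a letter, then letters, digits, or underscores; 100 characters maximum.
TABLE_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,99}")
MAX_COLUMN_NAME_CHARS = 45
MAX_FIELD_VALUE_BYTES = 32 * 1024  # assumption: binary kilobytes


def valid_table_name(name):
    """Return True if the table name satisfies the rules listed above."""
    return TABLE_NAME_RE.fullmatch(name) is not None


def valid_column_name(name):
    """Return True if the column name is non-empty and within the length limit."""
    return 0 < len(name) <= MAX_COLUMN_NAME_CHARS


def exceeds_field_limit(value):
    """Return True if the field value is larger than the assumed 32 KB limit."""
    return len(value.encode("utf-8")) > MAX_FIELD_VALUE_BYTES


if __name__ == "__main__":
    print(valid_table_name("app_logs_1"), valid_table_name("1_app_logs"))
```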
Azure Service Bus Source Connector¶
The following are limitations for the Azure Service Bus Source Connector for Confluent Cloud.
- For JSON, JSON_SR, AVRO, and PROTOBUF, the message body (messageBody) produced by the connector contains JSON or text in base64-encoded format.
- To use a private DNS zone or server to connect the Azure Service Bus Source connector to a private endpoint, see DNS Support in Manage Networking for Confluent Cloud Connectors.
- You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
- This connector does not support egress IP addresses when your Confluent Cloud cluster is running on Azure. Public inbound traffic access (0.0.0.0/0) must be allowed for this connector if connecting via public endpoint.
Azure Synapse Analytics Sink Connector¶
The following are limitations for Azure Synapse Analytics Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- This connector can only insert data into an Azure SQL data warehouse database. Azure Synapse Analytics does not support primary keys. Since updates, upserts, and deletes are all performed on the primary keys, these queries are not supported for this connector.
- When auto.evolve is enabled, if a new column with a default value is added, that default value is only used for new records. Existing records will have "null" as the value for the new column.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Databricks Delta Lake Sink¶
The following are limitations for the Databricks Delta Lake Sink Connector for Confluent Cloud.
- The connector is available only on Amazon Web Services (AWS).
- The Amazon S3 bucket (where data is staged), the Delta Lake instance, and the Kafka cluster must be in the same region.
- You cannot configure multiple connectors that consume from the same topic and that use the same Amazon S3 staging bucket.
- Exactly-once semantics functionality is not currently supported.
- To use multiple tasks, your Databricks workspace must be using Databricks Runtime version 10.5 or later. For earlier Databricks Runtime versions, the connector is limited to a single task per connector.
- The connector does not support Array, Map, or Struct field schemas.
- The connector appends data only.
- The connector uses the UTC timezone.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Datadog Metrics Sink Connector¶
- Batching multiple metrics: The connector tries to batch metrics in a single payload. The maximum payload size is 3.2 megabytes for each API request. For additional details, refer to Post timeseries points.
- Metrics Rate Limiting: The API endpoints are rate limited. The rate limit for metrics retrieval is 100 per hour, per organization. These limits can be modified by contacting Datadog support.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
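The payload limit described above implies that metrics must be grouped so that no single request exceeds 3.2 megabytes. The following greedy grouping is a sketch of that idea, not the connector's actual batching logic. It assumes 3.2 megabytes means 3,200,000 bytes and takes the serialized size of each metric as input. The function name is hypothetical.

```python
def batch_by_size(sizes, max_bytes=3_200_000):
    """Group item sizes, in order, into batches whose total stays within max_bytes."""
    batches, current, total = [], [], 0
    for size in sizes:
        if size > max_bytes:
            raise ValueError(f"single item of {size} bytes exceeds the payload limit")
        if current and total + size > max_bytes:
            batches.append(current)   # close the batch before it would overflow
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(batch_by_size([2_000_000, 1_500_000, 1_000_000]))
```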
Datagen Source Connector¶
There are no current limitations for the Datagen Source Connector for Confluent Cloud.
Elasticsearch Service Sink Connector¶
The following are limitations for the Elasticsearch Service Sink Connector for Confluent Cloud.
- The connector is tested to work with Elastic Cloud and the official Elasticsearch distribution from Elastic. Open source versions and derivatives like Amazon OpenSearch are not supported.
- The connector supports data stream types LOGS and METRICS only.
- The connector has been tested with versions up to Elasticsearch 8.x.
- The Confluent Cloud cluster and the target Elasticsearch deployment must be in the same region.
- The
batch.size
property is limited to a maximum of 4000 records.
GitHub Source Connector¶
The following is a limitation for the GitHub Source Connector for Confluent Cloud.
- Because of a GitHub API limitation, only one task per connector is supported.
Google BigTable Sink Connector¶
The following are limitations for the Google Cloud BigTable Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The database and the Kafka cluster should be in the same region.
Google BigQuery Sink (Legacy) Connector¶
The following are limitations for the Google Cloud BigQuery Sink [Deprecated] Connector for Confluent Cloud.
- The data system the sink connector is connecting to should be in the same region as your Confluent Cloud cluster. If you use a different region or cloud platform, be aware that you may incur additional data transfer charges. Contact your Confluent account team or Confluent Support if you need to use Confluent Cloud and connect to a data system that is in a different region or on a different cloud platform.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- Source topic names must comply with BigQuery naming conventions even if sanitizeTopics is set to true in the connector configuration.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- The connector does not support schemas with recursion.
- Auto schema update does not support column removal.
- Auto schema update does not support recursive schemas.
- DLQ routing does not work if Auto update schemas (auto.update.schemas) is enabled and the connector detects that the failure is due to schema mismatch.
- Topic names are mapped to BigQuery table names. For example, if you have a topic named pageviews, a topic named visitors, and a dataset named website, the result is two tables in BigQuery: one named pageviews and one named visitors under the website dataset.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Google BigQuery Sink V2 Connector¶
The following are limitations for the Google BigQuery Sink V2 Connector for Confluent Cloud.
- Primary keys are required to run the connector in UPSERT or UPSERT_DELETE mode. You must ensure that there aren’t any null primary keys as ingesting new and existing records with null primary keys may result in unreliable behavior. For more details, see the user responsibilities for primary and foreign keys with BigQuery. Note that only Kafka record keys are treated as primary keys, and BigQuery CDC limitations still apply.
- The data system the connector is connecting to should be in the same region as your Confluent Cloud cluster. If you use a different region or cloud platform, you may incur additional data transfer charges. Contact your Confluent account team or Confluent Support if you need to use Confluent Cloud and connect to a data system that is in a different region or on a different cloud platform.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- Source topic names must comply with BigQuery naming conventions even if sanitize.topics is set to true in the connector configuration.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- The connector does not support schemas with recursion.
- Auto schema update does not support column removal.
- Auto schema update does not support recursive schemas.
- DLQ routing does not work if Auto update schemas (auto.update.schemas) is enabled and the connector detects that the failure is due to schema mismatch.
- Topic names are mapped to BigQuery table names. For example, if you have a topic named pageviews, a topic named visitors, and a dataset named website, the result is two tables in BigQuery: one named pageviews and one named visitors under the website dataset.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Google Functions Sink Connector¶
The following are limitations for the Google Cloud Functions Sink [Deprecated] Connector for Confluent Cloud.
- The target Google Function should be in the same region as your Confluent Cloud cluster.
- The connector does not currently support Google Cloud Functions (2nd gen).
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Google Pub/Sub Source Connector¶
There are no current limitations for the Google Cloud Pub/Sub Source Connector for Confluent Cloud.
Google Cloud Spanner Sink Connector¶
The following are limitations for the Google Cloud Spanner Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Confluent Cloud cluster and the target Google Spanner cluster must be in the same Google Cloud region.
- A valid schema must be available in Confluent Cloud Schema Registry to use Avro, JSON Schema, or Protobuf.
- The connector does not support PostgreSQL dialect.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Google Cloud Storage Sink Connector¶
The following are limitations for the Google Cloud Storage Sink Connector for Confluent Cloud.
The data system the sink connector is connecting to should be in the same region as your Confluent Cloud cluster. If you use a different region or cloud platform, be aware that you may incur additional data transfer charges. Contact your Confluent account team or Confluent Support if you need to use Confluent Cloud and connect to a data system that is in a different region or on a different cloud platform.
One task can handle up to 100 partitions.
Partitioning (hourly or daily) is based on Kafka record time.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10-minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
schema.compatibility is set to NONE.
If output format BYTES is selected, the input message format must also be BYTES.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
Using a recursive schema type is not allowed and will result in a StackOverflowError.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
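The interaction between flush.size and rotate.schedule.interval.ms described in the note above can be reproduced with a small simulation. This is a simplified model for one topic partition, not the connector's implementation: it assumes scheduled rotations fall on fixed boundaries counted in minutes from 12:00, and the function name simulate_files and the arrival data are hypothetical.

```python
def simulate_files(arrival_minutes, flush_size, rotate_every_min):
    """Return the record count of each file written for one topic partition.

    A file is closed when it reaches flush_size records or when a scheduled
    rotation boundary has passed, whichever happens first.
    """
    files = []
    current = 0
    next_rotation = rotate_every_min
    for minute in sorted(arrival_minutes):
        # Close the open file at every scheduled rotation that has passed.
        while minute > next_rotation:
            if current:
                files.append(current)
                current = 0
            next_rotation += rotate_every_min
        current += 1
        if current == flush_size:
            files.append(current)
            current = 0
    # The next scheduled rotation writes whatever is still buffered.
    if current:
        files.append(current)
    return files


# 500 records between 12:01 and 12:10, then 500 more between 12:11 and 12:20.
first_wave = [1 + i % 10 for i in range(500)]
second_wave = [11 + i % 10 for i in range(500)]

print(simulate_files(first_wave + second_wave, flush_size=1000, rotate_every_min=10))
```

With these inputs the model yields two files of 500 records each, which matches the example in the note: the 10-minute rotation triggers before either file reaches 1000 records.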
Google Cloud Storage Source Connector¶
The following are limitations for the Google Cloud Storage Source Connector for Confluent Cloud.
The connector ignores any GCS object with a name that does not start with the configured topics.dir directory. This name is topics/ by default.
The connector uses the connector name to store offsets that identify how much of the bucket it has processed. If you delete a connector and then use the same connector name for a new connector, the new connector will not reprocess data from the beginning of the bucket. The progress for the deleted connector is saved and the new connector starts from where the original connector’s processing ended. The connector can start processing earlier bucket data if the corresponding entry in the offset topic is cleared.
The connector limits the number of objects it can index in a bucket to 1,000,000, as set by the bucket.listing.max.objects.threshold property. This limit helps manage large directories effectively. You must keep the object count under 1,000,000 to comply with this bucket limit.
For a new bucket, you need to create a new connector. If you reconfigure an existing connector to source from the new bucket, the connector will not source from the beginning of data stored in the bucket.
The connector will not reload data during the following scenarios:
- Renaming a file that the connector has already read.
- Uploading a newer version of an existing file with a new record.
There are compatibility constraints for certain input data formats.
Output data format | Supported input formats
PROTOBUF, JSON_SR | BYTES, AVRO
JSON, AVRO, STRING | AVRO, JSON, BYTES, STRING
BYTES | STRING, BYTES
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.HoistField$Value
org.apache.kafka.connect.transforms.HoistField$Key
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.Filter
io.confluent.connect.transforms.Filter$Key
io.confluent.connect.transforms.Filter$Value
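The input and output format constraints listed earlier in this section can be expressed as a lookup. The following is a minimal sketch, assuming the compatibility table reads as three rows: PROTOBUF and JSON_SR output accept BYTES and AVRO input; JSON, AVRO, and STRING output accept AVRO, JSON, BYTES, and STRING input; BYTES output accepts STRING and BYTES input. The names SUPPORTED_INPUTS and is_supported are hypothetical.

```python
# Output data format -> supported input formats, as read from the
# compatibility table in this section (an assumption about its layout).
SUPPORTED_INPUTS = {
    "PROTOBUF": {"BYTES", "AVRO"},
    "JSON_SR": {"BYTES", "AVRO"},
    "JSON": {"AVRO", "JSON", "BYTES", "STRING"},
    "AVRO": {"AVRO", "JSON", "BYTES", "STRING"},
    "STRING": {"AVRO", "JSON", "BYTES", "STRING"},
    "BYTES": {"STRING", "BYTES"},
}


def is_supported(input_format, output_format):
    """Return True if the input format is listed for the given output format."""
    allowed = SUPPORTED_INPUTS.get(output_format.upper(), set())
    return input_format.upper() in allowed


print(is_supported("JSON", "AVRO"))      # listed combination
print(is_supported("JSON", "PROTOBUF"))  # not listed
```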
HTTP Sink Connector¶
There is one limitation for the HTTP Sink Connector for Confluent Cloud.
- The Confluent Cloud Kafka consumer configuration property max.poll.interval.ms is set to 300000 milliseconds (5 minutes). This is a hard-coded property. If the sink connector takes longer than five minutes to complete processing and overshoots the poll interval, the connector is kicked out of the consumer group. This results in a failure to commit offsets and duplicate messages.
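A rough estimate can show whether a workload is likely to stay within the five-minute poll interval. The following is a back-of-the-envelope sketch, not a description of how the connector batches requests: it assumes records in a batch are sent one request at a time, and the function name, the batch size, the latency figures, and the safety factor are all hypothetical.

```python
MAX_POLL_INTERVAL_MS = 300_000  # fixed value stated above (5 minutes)


def fits_poll_interval(records_per_batch, ms_per_request, safety_factor=0.8):
    """Estimate whether one batch finishes within the poll interval.

    Assumes sequential requests; safety_factor leaves headroom for retries
    and slow responses.
    """
    estimated_ms = records_per_batch * ms_per_request
    return estimated_ms <= MAX_POLL_INTERVAL_MS * safety_factor


print(fits_poll_interval(records_per_batch=500, ms_per_request=200))
print(fits_poll_interval(records_per_batch=500, ms_per_request=700))
```

In this model, 500 records at 200 ms each take an estimated 100 seconds and fit within the budget, while 500 records at 700 ms each take an estimated 350 seconds and do not.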
HTTP Source Connector¶
The following are limitations for the HTTP Source Connector for Confluent Cloud.
- The connector does not support APIs that rely on timestamp range-based queries.
- The connector cannot parse responses in any format other than JSON.
HTTP Sink V2 Connector¶
The HTTP Sink V2 Connector for Confluent Cloud has the following limitations.
- The Confluent Cloud user interface currently does not support adding, removing, or updating APIs for the OpenAPI specification. In this case, you must create a new connector. To work around this limitation, use the Switch to JSON feature to configure the connector with raw JSON.
- The connector only supports HTTP/1.x protocol and RESTful APIs.
- If CSFLE is enabled, the actual payload won’t be available in the headers of the error topic.
HTTP Source V2 Connector¶
The HTTP Source V2 Connector for Confluent Cloud has the following limitations.
- The Confluent Cloud user interface currently does not support adding, removing, or updating APIs for the OpenAPI specification. In this case, you must create a new connector. To work around this limitation, use the Switch to JSON feature to configure the connector with raw JSON.
- The connector only supports HTTP/1.x protocol and RESTful APIs.
IBM MQ Source Connector¶
There are no current limitations for the IBM MQ Source Connector for Confluent Cloud.
Jira Source Connector¶
The following are limitations for the Jira Source Connector for Confluent Cloud.
- To use a schema-based output format, you must set schema compatibility to NONE in Schema Registry.
- For Schema Registry-based output formats, the connector attempts to deduce the schema based on the source API response returned. The connector registers a new schema for every NULL and NOT NULL value of an optional field in the API response. For this reason, the connector may register schema versions at a much higher rate than expected.
- Resources that do not support fetching records by datetime will have duplicate records and will be fetched repeatedly at the interval specified by the request.interval.ms configuration property.
- The connector is not able to detect data deletion on Jira.
- The connector does not guarantee accurate record order in the Apache Kafka® topic.
- The time zone set by the user (defined in the jira.username configuration property) must match the time zone in the Jira general settings used for the connector.
InfluxDB 2 Sink Connector¶
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
InfluxDB Source Connector¶
There are no current limitations for the InfluxDB 2 Source Connector for Confluent Cloud.
Microsoft SQL Server Sink Connector¶
The following are limitations for the Microsoft SQL Server Sink (JDBC) Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
- Active Directory authentication is not currently supported.
- The database and Kafka cluster should be in the same region. If you use a different region, you may incur additional data transfer charges.
- For tombstone records, set delete.enabled to true.
Microsoft SQL Server CDC Source Connector (Debezium) [Legacy]¶
The following are limitations for the Microsoft SQL Server CDC Source (Debezium) [Deprecated] Connector for Confluent Cloud.
- Change data capture (CDC) is only available in the Enterprise, Developer, Enterprise Evaluation, and Standard editions.
- Active Directory authentication is not currently supported.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
Microsoft SQL Server CDC Source V2 (Debezium) Connector¶
The following are limitations for the Microsoft SQL Server CDC Source Connector V2 (Debezium) for Confluent Cloud.
- Change data capture (CDC) is only available in the Enterprise, Developer, Enterprise Evaluation, and Standard editions.
- Active Directory authentication is not currently supported.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- Signaling does not work when the connector is configured with multiple databases. In such cases, the connector cannot execute incremental snapshots, because they require signals to be produced into the signaling table.
Microsoft SQL Server Source (JDBC) Connector¶
The following are limitations for the Microsoft SQL Server Source (JDBC) Connector for Confluent Cloud.
Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
You cannot use public egress IP addresses (IP address allowlisting) for the connector. Azure provides secure and direct private service endpoints to Azure services. For more information, see Service and gateway endpoints.
Active Directory authentication is not currently supported.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
A timestamp column must not be nullable for the timestamp or timestamp+incrementing mode and should be datetime2.
If the connector is making numerous parallel insert operations in a large source table, insert transactions can commit out of order (this is typical). What this means is that a “greater” auto_increment ID (for example, 101) is committed earlier and a “smaller” ID (for example, 100) is committed later. The time difference here may only be a few milliseconds, but the commits are out of order nevertheless.
Note that using incrementing mode to load data from such tables always results in some data loss. This happens because when the source connector worker reads (polls) the table, the connector gets the row with a greater offset value (with the smaller offset row remaining uncommitted). In the next iteration, although the uncommitted row is committed, the offset position has moved beyond that value, so the row is skipped. Using timestamp+incrementing mode is not a good choice either, because the tables may be very large (five to eight million rows added daily) and there is a high cost for any indexing approach, with the exception of PK indexing.
The connector doesn’t allow the use of special characters in table names (for example, $), which are not allowed in Kafka topic names. To work around this limitation, change the name of the target topic using the ExtractTopic Single Message Transformation (SMT). Note that this SMT helps you extract the topic name from the key or value of the message; therefore, you must include the desired topic name in the payload.
The connector does not accept zero date values, for example, 0000-00-00 when using Date fields. You can convert them to NULL by using the driver property tableName?zeroDateTimeBehavior=convertToNull in the connection string.
The connector can skip records in timestamp+incrementing mode when multiple transactions write to the source database table, and some transactions commit later. If a transaction with a higher timestamp and incrementing ID commits first and is read by the connector, the connector will ignore any later transactions with older timestamps and incrementing IDs. As a workaround, set the timestamp.delay.interval.ms in the connector to match the transaction timeout on the database side. This ensures the connector considers records that are committed later, as long as they are committed within the timestamp.delay.interval.ms timeframe.
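The out-of-order commit problem described above for incrementing mode can be illustrated with a short simulation. This is a simplified model, not the connector's code: it assumes each poll returns every committed row whose ID is greater than the last ID already delivered. The function name poll_incrementing, the row IDs, and the time units are hypothetical.

```python
def poll_incrementing(rows, poll_times):
    """Simulate incrementing mode.

    rows is a list of (row_id, commit_time) pairs. Each poll delivers the
    committed rows with an ID above the stored offset, then advances the
    offset to the highest ID delivered.
    """
    last_id = 0
    delivered = []
    for now in poll_times:
        visible = sorted(
            row_id
            for row_id, commit_time in rows
            if commit_time <= now and row_id > last_id
        )
        delivered.extend(visible)
        if visible:
            last_id = visible[-1]
    return delivered


# Row 101 commits at t=10, before row 100 commits at t=12.
rows = [(99, 0), (100, 12), (101, 10)]

print(poll_incrementing(rows, poll_times=[11, 20]))
```

In this model the poll at t=11 delivers rows 99 and 101 and moves the offset to 101. When row 100 commits at t=12, its ID is already below the offset, so the poll at t=20 never returns it.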
MongoDB Atlas Sink Connector¶
The following are limitations for the MongoDB Atlas Sink Connector for Confluent Cloud.
This connector supports MongoDB Atlas only. This connector will not work with a self-managed MongoDB database or MongoDB Atlas Serverless.
If your MongoDB database username or password includes any of the following characters: $ : / ? # [ ] @, you must convert the characters using percent encoding.
Document post processing configuration properties are not supported. These include:
post.processor.chain
key.projection.type
value.projection.type
field.renamer.mapping
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
The MongoDB database service endpoint and the Kafka cluster must be in the same region.
Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
You cannot use a dot in a field name (for example, Client.Email). The error shown below is displayed if a field name includes a dot. You should also not use $ in a field name. For additional information, see Field Names.
Your record has an invalid BSON field name. You can check the MongoDB documentation for details.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
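Two of the limitations above, percent encoding of credentials and the restriction on dots and $ in field names, lend themselves to a quick pre-check. The following is a minimal sketch using the Python standard library. The helper names encode_credential and invalid_field_names are hypothetical, and the sample password and document are made up for illustration.

```python
from urllib.parse import quote


def encode_credential(value):
    """Percent-encode a username or password, including $ : / ? # [ ] @."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    return quote(value, safe="")


def invalid_field_names(document, prefix=""):
    """Return the paths of field names that contain a dot or a dollar sign."""
    bad = []
    for key, value in document.items():
        path = f"{prefix}{key}"
        if "." in key or "$" in key:
            bad.append(path)
        if isinstance(value, dict):
            # Recurse into nested documents, keeping a readable path.
            bad.extend(invalid_field_names(value, prefix=f"{path} > "))
    return bad


print(encode_credential("p@ss:w/rd?"))

record = {"Client.Email": "a@example.com", "profile": {"$id": 1, "name": "x"}}
print(invalid_field_names(record))
```

The first call prints the encoded password p%40ss%3Aw%2Frd%3F. The second lists Client.Email and the nested $id field as names that would need to be renamed before the record is written.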
MongoDB Atlas Source Connector¶
The following are limitations for the MongoDB Atlas Source Connector for Confluent Cloud.
- This connector supports MongoDB Atlas only. This connector will not work with a self-managed MongoDB database or MongoDB Atlas Serverless.
- MongoDB Time Series Collections are not supported with this connector.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The connector supports running a single task.
MQTT Sink Connector¶
There are no current limitations for the MQTT Sink Connector for Confluent Cloud.
MQTT Source Connector¶
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.HoistField$Value
MySQL Sink Connector¶
The following are limitations for the MySQL Sink (JDBC) Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The database and Kafka cluster should be in the same region.
- For tombstone records, set delete.enabled to true.
MySQL CDC Source Connector (Debezium) [Legacy]¶
The following are limitations for the MySQL CDC Source (Debezium) [Deprecated] Connector for Confluent Cloud.
- MariaDB is not currently supported. See the Debezium docs for more information.
- Amazon Aurora doesn’t support binary logging using a multi-master cluster as the binlog master or worker. You can’t use binlog-based CDC tools with multi-master clusters.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- The additional-condition option for the signaling feature of incremental snapshots is not supported in v1. If you are using a legacy version of the connector and need this option, upgrade to the latest version of the connector.
MySQL CDC Source V2 (Debezium) Connector¶
The following are limitations for the MySQL CDC Source Connector V2 (Debezium) for Confluent Cloud.
- MariaDB is not currently supported. For more information, see the Debezium docs.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
MySQL Source (JDBC) Connector¶
The following are limitations for the MySQL Source (JDBC) Connector for Confluent Cloud.
Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
A timestamp column must not be nullable for the timestamp or timestamp+incrementing mode.
A query produced by the connector may take a very long time to execute when using timestamp+incrementing, as opposed to the query speed when using incrementing.
If the connector is making numerous parallel insert operations in a large source table, insert transactions can commit out of order (this is typical). What this means is that a “greater” auto_increment ID (for example, 101) is committed earlier and a “smaller” ID (for example, 100) is committed later. The time difference here may only be a few milliseconds, but the commits are out of order nevertheless.
Note that using incrementing mode to load data from such tables always results in some data loss. This happens because when the source connector worker reads (polls) the table, the connector gets the row with a greater offset value (with the smaller offset row remaining uncommitted). In the next iteration, although the uncommitted row is committed, the offset position has moved beyond that value, so the row is skipped. Using timestamp+incrementing mode is not a good choice either, because the tables may be very large (five to eight million rows added daily) and there is a high cost for any indexing approach, with the exception of PK indexing.
The connector doesn’t allow the use of special characters in table names (for example, $), which are not allowed in Kafka topic names. To work around this limitation, change the name of the target topic using the ExtractTopic Single Message Transformation (SMT). Note that this SMT helps you extract the topic name from the key or value of the message; therefore, you must include the desired topic name in the payload.
The connector cannot capture tables with the same name across different schemas, even if you configure the schema pattern with the correct schema name or use fully qualified names (FQNs) for tables. As a workaround, you should create the connector with the required table names and restrict user access at the schema level.
If the records contain zero date values, for example, 0000-00-00 when using Date fields, you can convert them to NULL by using the driver property tableName?zeroDateTimeBehavior=CONVERT_TO_NULL in the connection string.
The connector can skip records in timestamp+incrementing mode when multiple transactions write to the source database table, and some transactions commit later. If a transaction with a higher timestamp and incrementing ID commits first and is read by the connector, the connector will ignore any later transactions with older timestamps and incrementing IDs. As a workaround, set the timestamp.delay.interval.ms in the connector to match the transaction timeout on the database side. This ensures the connector considers records that are committed later, as long as they are committed within the timestamp.delay.interval.ms timeframe.
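The effect of the timestamp.delay.interval.ms workaround above can be sketched with a small simulation of timestamp+incrementing mode. This is a simplified model, not the connector's query logic: it assumes each poll returns committed rows whose timestamp is older than the current time minus the delay and whose (timestamp, ID) pair is greater than the last pair delivered. The function name, row values, and time units are hypothetical.

```python
def poll_timestamp_incrementing(rows, poll_times, delay):
    """Simulate timestamp+incrementing mode with a delay window.

    rows is a list of (timestamp, row_id, commit_time) tuples. Returns the
    row IDs delivered across all polls, in delivery order.
    """
    last = (0, 0)  # (timestamp, row_id) of the last delivered row
    delivered = []
    for now in poll_times:
        end = now - delay  # only rows older than this are considered
        visible = sorted(
            (ts, row_id)
            for ts, row_id, commit_time in rows
            if commit_time <= now and ts < end and (ts, row_id) > last
        )
        delivered.extend(row_id for _, row_id in visible)
        if visible:
            last = visible[-1]
    return delivered


# Row 100 gets timestamp 5 but commits late (t=12); row 101 gets
# timestamp 6 and commits first (t=7).
rows = [(5, 100, 12), (6, 101, 7)]
polls = [8, 16, 24]

print(poll_timestamp_incrementing(rows, polls, delay=0))
print(poll_timestamp_incrementing(rows, polls, delay=8))
```

Without a delay, the model delivers only row 101 and then skips row 100, because its timestamp and ID are older than the stored offset. With a delay that covers the commit lag, both rows are visible in the same poll and are delivered in order.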
New Relic Metrics Sink Connector¶
The following are limitations for the New Relic Metrics Sink Connector for Confluent Cloud.
- The connector is limited by New Relic Metric API limits and restricted attributes.
OpenSearch Sink Connector¶
The following are limitations for the OpenSearch Sink Connector for Confluent Cloud:
- The connector only allows you to create and manage up to 5 indexes.
- Batch inserts are not currently supported. The connector can only insert records one at a time.
- The connector only supports HTTP POST requests.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
org.apache.kafka.connect.transforms.RegexRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic
io.confluent.connect.cloud.transforms.TopicRegexRouter
Oracle CDC Source Connector¶
The following are limitations for the Oracle CDC Source Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- If you change the connector configuration property oracle.date.mapping from date to timestamp, the connector will not work, since this results in a breaking schema change. You must create a new connector if you want to change to the timestamp option.
Oracle Database Sink Connector¶
The following are limitations for the Oracle Database Sink (JDBC) Connector for Confluent Cloud.
- The Oracle Database version must be 11.2.0.4 or later.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Oracle database and Kafka cluster should be in the same region.
- See Database considerations for additional information.
Oracle Database Source (JDBC) Connector¶
The following are limitations for the Oracle Database Source (JDBC) Connector for Confluent Cloud.
- The Oracle Database version must be 11.2.0.4 or later.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- A timestamp column must not be nullable for the timestamp or timestamp+incrementing mode.
- Configuration properties that are not shown in the Cloud Console use the default values. See JDBC Connector Source Connector Configuration Properties for property definitions and default values.
- The connector doesn’t allow the use of special characters in table names (for example, $), which are not allowed in Kafka topic names. To work around this limitation, change the name of the target topic using the ExtractTopic Single Message Transformation (SMT). Note that this SMT helps you extract the topic name from the key or value of the message; therefore, you must include the desired topic name in the payload.
- The connector does not accept zero date values, for example, 0000-00-00 when using Date fields. You can convert them to NULL by using the driver property tableName?zeroDateTimeBehavior=convertToNull in the connection string.
- The connector can skip records in timestamp+incrementing mode when multiple transactions write to the source database table, and some transactions commit later. If a transaction with a higher timestamp and incrementing ID commits first and is read by the connector, the connector will ignore any later transactions with older timestamps and incrementing IDs. As a workaround, set the timestamp.delay.interval.ms in the connector to match the transaction timeout on the database side. This ensures the connector considers records that are committed later, as long as they are committed within the timestamp.delay.interval.ms timeframe.
PostgreSQL Sink Connector¶
The following are limitations for the PostgreSQL Sink (JDBC) Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The database and Kafka cluster should be in the same region. If you use a different region, be aware that you may incur additional data transfer charges.
- For tombstone records, set delete.enabled to true.
PagerDuty Sink Connector¶
There are no current limitations for the PagerDuty Sink Connector [Deprecated] for Confluent Cloud.
PostgreSQL CDC Source (Debezium) [Legacy] Connector¶
The following are limitations for the PostgreSQL CDC Source Connector (Debezium) [Deprecated] for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- For Azure, you must use a general purpose or memory-optimized PostgreSQL database. You cannot use a basic database.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- CockroachDB is not supported. To learn about other unsupported PostgreSQL databases, contact Confluent Support.
- Clients from Azure Virtual Networks are not allowed to access the server by default. Make sure your Azure Virtual Network is correctly configured and that Allow access to Azure Services is enabled.
- The following are the default partition and replication factor properties:
topic.creation.default.partitions=1
topic.creation.default.replication.factor=3
- See the After-state only output limitation if you are planning to use the optional property After-state only.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- The pgoutput logical decoding output plug-in does not capture values for generated columns, resulting in missing data for these columns in the connector’s output.
- RDS Proxy does not support streaming replication mode. If you configure the connector to connect to an RDS Proxy, it cannot create the replication slot, leading to failure. However, if the replication slot exists before starting the connector with RDS Proxy, the connector will successfully complete the snapshotting phase but will fail during the streaming phase.
PostgreSQL CDC Source V2 (Debezium) Connector¶
The following are limitations for the PostgreSQL CDC Source Connector V2 (Debezium) for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- For Azure, you must use a general purpose or memory-optimized PostgreSQL database. You cannot use a basic database.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- CockroachDB is not supported. To learn about other unsupported PostgreSQL databases, contact Confluent Support.
- Clients from Azure Virtual Networks are not allowed to access the server by default. Make sure your Azure Virtual Network is correctly configured and that Allow access to Azure Services is enabled.
- The following are the default partition and replication factor properties:
topic.creation.default.partitions=1
topic.creation.default.replication.factor=3
- See the After-state only output limitation if you are planning to use the optional property after.state.only.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- The pgoutput logical decoding output plug-in does not capture values for generated columns, resulting in missing data for these columns in the connector’s output.
- RDS Proxy does not support streaming replication mode. If you configure the connector to connect to an RDS Proxy, it cannot create the replication slot, leading to failure. However, if the replication slot exists before starting the connector with RDS Proxy, the connector will successfully complete the snapshotting phase but will fail during the streaming phase.
PostgreSQL Source (JDBC) Connector¶
The following are limitations for the PostgreSQL Source (JDBC) Connector for Confluent Cloud.
Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
Clients from Azure Virtual Networks are not allowed to access the server by default. Make sure your Azure Virtual Network is correctly configured and enable “Allow access to Azure Services”.
CockroachDB is not supported. To learn about other unsupported PostgreSQL databases, contact Confluent Support.
A timestamp column must not be nullable for the timestamp or timestamp+incrementing mode.
If the connector is making numerous parallel insert operations in a large source table, insert transactions can commit out of order (this is typical). This means that a “greater” auto_increment ID (for example, 101) is committed earlier and a “smaller” ID (for example, 100) is committed later. The time difference here may only be a few milliseconds, but the commits are out of order nevertheless.
Note that using incrementing mode to load data from such tables always results in some data loss. This happens because when the source connector worker reads (polls) the table, the connector gets the row with a greater offset value (with the smaller offset row remaining uncommitted). In the next iteration, although the uncommitted row is committed, the offset position has moved beyond that value, so the row is skipped. Using timestamp+incrementing mode is not a good choice either, because the tables may be very large (five to eight million rows added daily) and there is a high cost for any indexing approach, with the exception of PK indexing.
The connector does not support all data types. When it encounters an unknown data type, the data type is dropped from the source output. The following lists known unsupported data types.
Known unsupported data types
_abc
box
cardinal_number
character_data
cidr
circle
citext
hstore
inet
line
lseg
macaddr
macaddr8
money
path
pg_lsn
pg_snapshot
point
polygon
refcursor
sql_identifier
tsquery
tsvector
txid_snapshot
uuid
yes_or_no
The connector doesn’t allow the use of special characters in table names (for example, $), which are not allowed in Kafka topic names. To work around this limitation, change the name of the target topic using the ExtractTopic Single Message Transformation (SMT). Note that this SMT helps you extract the topic name from the key or value of the message; therefore, you must include the desired topic name in the payload.
The connector does not accept zero date values, for example, 0000-00-00 when using Date fields. You can convert them to NULL by using the driver property tableName?zeroDateTimeBehavior=convertToNull in the connection string.
The connector can skip records in timestamp+incrementing mode when multiple transactions write to the source database table, and some transactions commit later. If a transaction with a higher timestamp and incrementing ID commits first and is read by the connector, the connector will ignore any later transactions with older timestamps and incrementing IDs. As a workaround, set the timestamp.delay.interval.ms in the connector to match the transaction timeout on the database side. This ensures the connector considers records that are committed later, as long as they are committed within the timestamp.delay.interval.ms timeframe.
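The out-of-order commit behavior described for the PostgreSQL Source (JDBC) connector can be illustrated with a small simulation. The following is a minimal sketch, not the connector's implementation: the Row class, the run_polls function, and the delay_ms parameter (standing in for timestamp.delay.interval.ms) are hypothetical names, and the poll loop is a simplified model that tracks a single (timestamp, ID) offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    id: int            # incrementing column value
    ts: int            # timestamp column value (ms)
    committed_at: int  # when the transaction becomes visible (ms)


def run_polls(rows, poll_times, delay_ms=0):
    """Toy poll loop; returns row IDs in the order they are read."""
    offset = (-1, -1)  # last (ts, id) pair read so far
    seen = []
    for now in poll_times:
        batch = sorted(
            (r for r in rows
             if r.committed_at <= now      # visible to the query
             and r.ts <= now - delay_ms    # old enough to be considered
             and (r.ts, r.id) > offset),   # beyond the stored offset
            key=lambda r: (r.ts, r.id),
        )
        if batch:
            offset = (batch[-1].ts, batch[-1].id)
            seen.extend(r.id for r in batch)
    return seen


rows = [
    Row(id=100, ts=1000, committed_at=1060),  # commits late
    Row(id=101, ts=1010, committed_at=1020),  # commits first
]
polls = [1030, 1100, 1200]

print(run_polls(rows, polls))                # [101]
print(run_polls(rows, polls, delay_ms=100))  # [100, 101]
```

Without a delay, the first poll reads row 101 and moves the offset past row 100, so row 100 is skipped once it commits. With a delay that is at least as long as the commit lag (60 ms in this toy data), both rows are read in order.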
RabbitMQ Sink Connector¶
The following are limitations for the RabbitMQ Sink Connector for Confluent Cloud.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Redis Sink Connector¶
The following are limitations for the Redis Sink Connector for Confluent Cloud.
- Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
- The Redis instance and Kafka cluster should be in the same region.
- This connector does not support the ValueToKey (org.apache.kafka.connect.transforms.ValueToKey) Single Message Transform.
Salesforce Bulk API Source Connector¶
The following are limitations for the Salesforce Bulk API Source Connector for Confluent Cloud.
Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
Restarting: When the connector operates, it periodically records the last query time in the Connect offset topic. When the connector is restarted, it may fetch Salesforce objects with a LastModifiedDate that is later than the last queried time.
API limits: The Salesforce Bulk API Source connector is limited to non-compound fields. For example, Bulk Query doesn’t support address or location fields. The connector discards address and geolocation fields.
The following Salesforce object (SObject) error message may be displayed when you are using the Salesforce Bulk API Source connector:
Entity 'Order' is not supported to use PKChunking.
For these SObjects, set the configuration property Enable Batching to false (CLI property batch.enable=false).
Unsupported SObjects: See Supported and Unsupported SObjects for a list of supported and unsupported SObjects.
Salesforce Bulk API 2.0 Sink Connector¶
The following are limitations for the Salesforce Bulk API 2.0 Sink Connector for Confluent Cloud.
- Salesforce imposes API limits over a 24-hour window. Exceeding Salesforce API limits will result in connector failure.
- The connector is subject to daily limits on the number of records handled, the number of batches created (internal to Salesforce), and the total size of data. For detailed limitations, see Bulk API Limits.
- There are Salesforce data and file storage limitations based on the type of organization used.
Salesforce Bulk API 2.0 Source Connector¶
The following are limitations for the Salesforce Bulk API 2.0 Source Connector for Confluent Cloud.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- Restarting: When the connector operates, it periodically records the last query time in the Connect offset topic. When the connector is restarted, it may fetch Salesforce objects with a LastModifiedDate that is later than the last queried time.
- API limits: The Salesforce Bulk API Source connector is limited to non-compound fields. For example, Bulk Query doesn’t support address or location fields. The connector discards address and geolocation fields.
Salesforce CDC Source Connector¶
The following are limitations for the Salesforce CDC Source Connector for Confluent Cloud.
- The Salesforce user account configured for the connector must have permission to View All Data. For details, see Required Permissions for Change Events Received by CometD Subscribers.
- A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
- When you pause a connector, the connector continues to fetch records from the Salesforce endpoint. These records are not sent to the Kafka topic until the connector resumes.
- The connector does not support parsing fields where Salesforce sends the data differences of fields in updated records. For details, see Sending Data Differences for Fields of Updated Records.
Salesforce Platform Event Sink Connector¶
The following are limitations for the Salesforce Platform Event Sink Connector for Confluent Cloud.
- There are Salesforce streaming allocations and limits that apply to this connector. For example, the number of API calls that can occur within a 24-hour period is capped for free developer org accounts.
- There are data and file storage limits that are based on the type of organization you use.
Salesforce Platform Event Source Connector¶
The following are limitations for the Salesforce Platform Event Source Connector for Confluent Cloud.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- When you pause a connector, the connector continues to fetch records from the Salesforce endpoint. These records are not sent to the Kafka topic until the connector resumes.
- The connector does not support parsing fields where Salesforce sends the data differences of fields in updated records. For details, see Sending Data Differences for Fields of Updated Records.
Salesforce PushTopic Source Connector¶
The following are limitations for the Salesforce PushTopic Source Connector for Confluent Cloud.
- Organizations can run multiple connectors with a limit of one task per connector (that is, "tasks.max": "1").
- Note the following limitations for at least once delivery:
- When the connector operates, it periodically records the replay ID of the last record written to Kafka. When the connector is stopped and then restarted within 24 hours, the connector continues consuming the PushTopic where it stopped, with no missed events. However, if the connector stops for more than 24 hours, some events are discarded in Salesforce before the connector can read them.
- If the connector stops unexpectedly due to a failure, it may not record the replay ID of the last record successfully written to Kafka. When the connector restarts, it resumes from the last recorded replay ID. This means that some events may be duplicated in Kafka.
- When you pause a connector, the connector continues to fetch records from the Salesforce endpoint. These records are not sent to the Kafka topic until the connector resumes.
Salesforce SObject Sink Connector¶
The following are limitations for the Salesforce SObject Sink Connector for Confluent Cloud.
- There are Salesforce streaming allocations and limits that apply to this connector. For example, the number of API calls that can occur within a 24-hour period is capped for free developer org accounts.
- There are data and file storage limits that are based on the type of organization you use.
- The connector doesn’t currently support external lookup relationships.
ServiceNow Sink Connector¶
There is one limitation for the ServiceNow Sink Connector for Confluent Cloud.
- The connector does not support reporter topics with CSFLE.
ServiceNow Source Connector¶
The following are limitations for the ServiceNow Source Connector for Confluent Cloud.
- The connector does not support the following table types:
- Sys Audit (sys_audit)
- Audit Relationship Change (sys_audit_relation)
SFTP Sink Connector¶
The following are limitations for the SFTP Sink Connector for Confluent Cloud.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10 minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
schema.compatibility is set to NONE.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
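The interaction between flush.size and rotate.schedule.interval.ms described for the SFTP Sink connector can be reproduced with a short simulation. This is a minimal sketch under stated assumptions, not the connector's code: simulate_files and clock are hypothetical helpers, only a single topic partition is modeled, and scheduled rotations are assumed to fall on wall-clock multiples of the interval.

```python
def simulate_files(arrivals_ms, end_ms, flush_size, rotate_schedule_interval_ms):
    """Return the record count of each file written for one topic partition."""
    interval = rotate_schedule_interval_ms
    arrivals = sorted(arrivals_ms)
    files, buffered = [], 0
    if not arrivals:
        return files
    # Assumption: rotations fall on wall-clock multiples of the interval.
    next_rotation = (arrivals[0] // interval + 1) * interval

    def rotate_until(now):
        # Close the open file at every scheduled rotation that has passed.
        nonlocal buffered, next_rotation
        while now >= next_rotation:
            if buffered:
                files.append(buffered)
                buffered = 0
            next_rotation += interval

    for t in arrivals:
        rotate_until(t)
        buffered += 1
        if buffered >= flush_size:  # the flush.size condition is met first
            files.append(buffered)
            buffered = 0
    rotate_until(end_ms)
    return files


def clock(hour, minute):
    """Milliseconds since midnight for a wall-clock time."""
    return (hour * 60 + minute) * 60_000


# 500 records starting at 12:01 and 500 more starting at 12:11, one per second.
first = [clock(12, 1) + i * 1000 for i in range(500)]
second = [clock(12, 11) + i * 1000 for i in range(500)]
print(simulate_files(first + second, clock(12, 20), 1000, 600_000))  # [500, 500]
```

In this model the 10 minute rotation closes each file at 500 records because the flush.size=1000 condition is never reached within an interval. If 1000 records arrive inside one interval, the flush.size condition closes the file first.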
SFTP Source Connector¶
The following are limitations for the SFTP Source Connector for Confluent Cloud.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.HoistField$Value
Currently, the SFTP source connector reads and moves a file only once while processing files from a specific SFTP directory. If a file with the same name is produced later in the same SFTP directory, the connector will not process it.
If the following error occurs, you must provide WRITE access to the SFTP server directory where the connector is accessing files.
There were some errors with your configuration:\ninput.path: Could not write to the specified location configured in %s config. Check that that the user has write permissions for the specified location
When the schema.generation.enabled configuration is enabled, the connector offers limited schema detection capabilities and does not support nested or array schemas.
Snowflake Sink Connector¶
The following are limitations for the Snowflake Sink Connector for Confluent Cloud.
The connector doesn’t support Snowflake’s Client Redirect feature.
Depending on the service environment, certain network access limitations may exist. Make sure the connector can reach your service. For details, see Networking and DNS.
The data system the sink connector is connecting to should be in the same region as your Confluent Cloud cluster. If you use a different region or cloud platform, be aware that you may incur additional data transfer charges. Contact your Confluent account team or Confluent Support if you need to use Confluent Cloud and connect to a data system that is in a different region or on a different cloud platform.
The connector does not remove Snowflake pipes when a connector is deleted. For instructions to manually clean up Snowflake pipes, see Dropping Pipes.
Note that Snowpipe Streaming for Kafka supports insert-only operations. For additional information, see the Snowflake documentation.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
Each task is limited to a number of topic partitions based on the buffer.size.bytes property value. For example, a 10 MB buffer is limited to 50 topic partitions, a 20 MB buffer is limited to 25 topic partitions, a 50 MB buffer is limited to 10 topic partitions, and a 100 MB buffer is limited to 5 topic partitions.
In snowflake.ingestion.method=SNOWPIPE_STREAMING mode, the connector does not support custom offsets.
In SNOWPIPE_STREAMING mode, each connector requires a unique channel name. For example, the combination [channelA = <database_name>.<schema_name>.<table_name>.<topic_name>] must be unique across all created connectors.
Note
This restriction applies only when connectors are created in SNOWPIPE_STREAMING mode.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
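The per-task partition limits listed for the Snowflake Sink connector imply a minimum task count for a given number of topic partitions. The following is a rough sketch of that arithmetic. The min_tasks function and the lookup table name are hypothetical, and the table contains only the four buffer sizes given above. Each of those four pairs multiplies to 500, but treating that as a general formula would be an assumption, so the sketch does not extrapolate to other sizes.

```python
import math

# Per-task topic partition limits keyed by buffer size in MB,
# copied from the four example values listed for buffer.size.bytes.
PARTITION_LIMIT_BY_BUFFER_MB = {10: 50, 20: 25, 50: 10, 100: 5}


def min_tasks(topic_partitions: int, buffer_mb: int) -> int:
    """Smallest number of tasks that keeps every task within its partition limit."""
    # Raises KeyError for buffer sizes outside the four listed values.
    limit = PARTITION_LIMIT_BY_BUFFER_MB[buffer_mb]
    return math.ceil(topic_partitions / limit)


print(min_tasks(60, 10))   # 2
print(min_tasks(60, 100))  # 12
```

For example, 60 topic partitions fit in two tasks with a 10 MB buffer but need twelve tasks with a 100 MB buffer.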
Solace Sink Connector¶
The following are limitations for the Solace Sink Connector for Confluent Cloud.
- The connector can create queues, but not durable topic endpoints.
- A valid schema must be available in Schema Registry to use a Schema Registry-based format, like Avro.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
Splunk Sink Connector¶
The following are limitations for the Splunk Sink Connector for Confluent Cloud.
- If an invalid index is specified, the connector posts the event to Splunk successfully, but it appears to be discarded in Splunk.
Zendesk Source Connector¶
The following are limitations for the Zendesk Source Connector for Confluent Cloud.
- There is a limit of one task per connector instance.
- For Schema Registry-based output formats, the connector attempts to deduce the schema based on the source API response returned. The connector registers a new schema for every NULL and NOT NULL value of an optional field in the API response. For this reason, the connector may register schema versions at a much higher rate than expected.
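To see why the Zendesk Source connector may register more schema versions than expected, consider a toy model of schema deduction from API responses. This is an illustrative sketch only: infer_schema and distinct_schemas are hypothetical functions, the field names are made up for the example, and the real connector's deduction logic is not shown in this document.

```python
def infer_schema(record: dict) -> tuple:
    """Toy deduction: pair each field name with the type name of its value."""
    return tuple(sorted((name, type(value).__name__) for name, value in record.items()))


def distinct_schemas(responses) -> list:
    """Collect each deduced schema the first time it appears."""
    schemas = []
    for record in responses:
        schema = infer_schema(record)
        if schema not in schemas:
            schemas.append(schema)
    return schemas


# Two optional fields (hypothetical names), each either NULL or NOT NULL.
responses = [
    {"id": 1, "due_at": None, "assignee_id": None},
    {"id": 2, "due_at": "2024-01-01T00:00:00Z", "assignee_id": None},
    {"id": 3, "due_at": None, "assignee_id": 42},
    {"id": 4, "due_at": "2024-01-02T00:00:00Z", "assignee_id": 42},
]
print(len(distinct_schemas(responses)))  # 4
```

In this model, every NULL and NOT NULL combination of the optional fields produces a different deduced schema, so two optional fields already yield four schemas.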
Preview connector limitations¶
See the following limitations for preview connectors.
Caution
Preview connectors are not currently supported and are not recommended for production use.
Google Cloud Dataproc Sink Connector¶
The following are limitations for the Google Cloud Dataproc Sink Connector [Deprecated] for Confluent Cloud.
The Confluent Cloud cluster and the target Dataproc cluster must be in a VPC peering configuration.
Note
For a non-VPC peered environment, public inbound traffic access (0.0.0.0/0) must be allowed to the VPC where the Dataproc cluster is located. You must also make configuration changes to allow public access to the Dataproc cluster while retaining the private IP addresses for the Dataproc master and worker nodes (HDFS NameNode and DataNodes). For configuration details, see Configuring a non-VPC peering environment. For more information about public Internet access to resources, see Networking and DNS.
The Dataproc image version must be 1.4 (or later). See Cloud Dataproc Image version list.
One task can handle up to 100 partitions.
Input format JSON to output format AVRO does not work for the preview connector.
Partitioning (hourly or daily) is based on Kafka record time.
flush.size defaults to 1000. The value can be increased if needed. The value can be lowered (1 minimum) if you are running a Dedicated Confluent Cloud cluster. The minimum value is 1000 for non-dedicated clusters.
The following scenarios describe a couple of ways records may be flushed to storage:
You use the default setting of 1000 and your topic has six partitions. Files start to be created in storage after more than 1000 records exist in each partition.
You use the default setting of 1000 and the partitioner is set to Hourly. 500 records arrive at one partition from 2:00pm to 3:00pm. At 3:00pm, an additional 5 records arrive at the partition. You will see 500 records in storage at 3:00pm.
Note
The properties rotate.schedule.interval.ms and rotate.interval.ms can be used with flush.size to determine when files are created in storage. These parameters kick in and files are stored based on which condition is met first.
For example: You have one topic partition. You set flush.size=1000 and rotate.schedule.interval.ms=600000 (10 minutes). 500 records arrive at the topic partition from 12:01 to 12:10. 500 additional records arrive from 12:11 to 12:20. You will see two files in the storage bucket with 500 records in each file. This is because the 10 minute rotate.schedule.interval.ms condition tripped before the flush.size=1000 condition was met.
schema.compatibility is set to NONE.
A valid schema must be available in Confluent Cloud Schema Registry to use a schema-based message format, like Avro.
For Confluent Cloud and Confluent Cloud Enterprise, organizations are limited to one task and one connector. Use of this connector is free for a limited time.
The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.TimestampRouter
io.confluent.connect.transforms.MessageTimestampRouter
io.confluent.connect.transforms.ExtractTopic$Header
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
io.confluent.connect.cloud.transforms.TopicRegexRouter
RabbitMQ Source Connector¶
The following are limitations for the RabbitMQ Source Connector for Confluent Cloud.
- When paused, this connector continues to consume messages from RabbitMQ until the consumer times out. These messages remain in system memory while the connector is paused. There is no data loss when the connector resumes, since messages are acknowledged after they are flushed from memory and sent to Kafka. However, if you plan to keep this connector paused for an extended time, consider removing the connector, since messages will continue to accumulate in system memory.
- The connector does not currently support the following Single Message Transformations (SMTs):
org.apache.kafka.connect.transforms.ValueToKey
org.apache.kafka.connect.transforms.HoistField$Value