HTTP Sink Connector for Confluent Cloud

The fully-managed HTTP Sink connector for Confluent Cloud integrates Apache Kafka® with an API via HTTP or HTTPS.

The connector consumes records from Kafka topic(s) and converts each record value to STRING or JSON format before sending it, in the request body, to the configured http.api.url. The API URL can reference a record key or topic name using substitution variables ${topic} and ${key} in the URL property. You can also use fields from the Kafka record. The targeted API must support either a POST, PATCH, or PUT request.

The connector batches records up to the set Batch max size (batch.max.size) before sending the batched request to the API. Each record is converted to its String representation or its JSON representation with Request Body Format (request.body.format=json) and then separated with the Batch separator (batch.separator). See Configuration Properties for configuration property descriptions.

The HTTP Sink connector supports connecting to APIs using SSL along with Basic Authentication, OAuth2, or a Proxy Authentication Server.

Note

This is a Quick Start for the fully-managed cloud connector. If you are installing the connector locally for Confluent Platform, see HTTP Sink Connector for Confluent Platform.

Features

The HTTP Sink connector supports the following features:

  • At least once delivery: This connector guarantees that records from the Kafka topic are delivered at least once.

  • Supports multiple tasks: The connector supports running one or more tasks. More tasks may improve performance (that is, consumer lag is reduced with multiple tasks running).

  • Automatically creates topics: The following three topics are automatically created when the connector starts:

    The suffix for each topic name is the connector’s logical ID. In the example below, there are the three connector topics and one pre-existing Kafka topic named pageviews.

    HTTP Sink Connector Topics

    Connector Topics

    If the records sent to the topic are not in the correct format, or if important fields are missing in the record, the errors are recorded in the error topic, and the connector continues to run.

  • Supported data formats: The connector supports Avro, JSON Schema (JSON-SR), Protobuf, JSON (schemaless), and Bytes formats. Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON Schema, or Protobuf). See Schema Registry Enabled Environments for additional information.

  • Template parameters: The connector allows you to specify fields from the Kafka record, other than {$topic} and {$key} and constructs a unique URL using these parameters.

  • Regex Replacements: The connector can take a number of regex patterns and replacement strings that are applied to a record before it is submitted to the destination API. To do this, the connector uses the configuration options regex.patterns, regex.replacements, and regex.separator.

  • Supports Batching: The connector batches requests submitted to HTTP APIs for efficiency. Batches can be built with the configuration options batch.prefix, batch.suffix and batch.separator. All regex options apply when batching and are applied to individual records before being submitted to the batch.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Managed and Custom Connectors section.

Limitations

Be sure to review the following information.

Template Parameters

The connector forwards the message (record) value to the HTTP API. You can add parameters to have the connector construct a unique HTTP API URL containing the record key and topic name. For example, you enter http://eshost1:9200/api/messages/${topic}/${key} to have the HTTP API URL contain the topic name and record key.

In addition to the ${topic} and ${key} parameters, you can also refer to fields from the Kafka record. As shown in the following example, you may want the connector to construct a URL that uses the Order ID and Customer ID.

The Avro format that the producer uses to generate records in the Apache Kafka® topic order is shown below:

{
  "name": "MyClass",
  "type": "record",
  "namespace": "com.acme.avro",
  "fields": [
    {
      "name": "customerId",
      "type": "int"
    },
    {
      "name": "order",
      "type": {
        "name": "order",
        "type": "record",
        "fields": [
          {
            "name": "id",
            "type": "int"
          },
          {
            "name": "amount",
            "type": "int"
          }
        ]
      }
    }
  ]
}

To send the Order ID and Customer ID, you would use the following URL in the HTTP API URL (http.api.url) configuration property:

"http.api.url" : "http://eshost1:9200/api/messages/order/${order.id}/customer/${customerId}/"

Assuming the data in the Kafka topic contains the following values:

{
  "customerId": 123,
  "order": {
    "id": 1,
    "amount": 12345
  }
}

The connector constructs the following URL:

http://eshost1:9200/api/messages/order/1/customer/123/

Note

  • The maximum depth for added parameters is 10. For example, connector validation fails if you were to use the URL https://eshost1:9200/api/messages/order/${a.b.c.d.e.f.g.h.i.j.k}.
  • When you add parameters to the HTTP API URL, each record can result in a unique URL. For this reason, batching is disabled when using additional URL parameters.
  • The connector throws a runtime exception if fields referred to in the HTTP API URL do not exist in the Kafka record.

Quick Start

Use this quick start to get up and running with the Confluent Cloud HTTP Sink connector. The quick start provides the basics of selecting the connector and configuring it to stream events to an HTTP endpoint.

Prerequisites
  • Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud (Google Cloud).
  • The Confluent CLI installed and configured for the cluster. See Install the Confluent CLI.
  • Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See Schema Registry Enabled Environments for additional information.
  • At least one source Kafka topic must exist in your Confluent Cloud cluster before creating the sink connector.

Using the Confluent Cloud Console

Step 1: Launch your Confluent Cloud cluster

See the Quick Start for Confluent Cloud for installation instructions.

Step 2: Add a connector

In the left navigation menu, click Connectors. If you already have connectors in your cluster, click + Add connector.

Step 3: Select your connector

Click the HTTP Sink connector card.

HTTP Sink Connector Card

Step 4: Enter the connector details

Note

  • Ensure you have all your prerequisites completed.
  • An asterisk ( * ) designates a required entry.

At the Add HTTP Sink Connector screen, complete the following:

If you’ve already populated your Kafka topics, select the topic(s) you want to connect from the Topics list.

To create a new topic, click +Add new topic.

Step 5: Check for records

Verify that records are being produced at the endpoint.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Managed and Custom Connectors section.

Tip

When you launch a connector, a Dead Letter Queue topic is automatically created. See Confluent Cloud Dead Letter Queue for details.

Using the Confluent CLI

To set up and run the connector using the Confluent CLI, complete the following steps.

Note

Make sure you have all your prerequisites completed.

Step 1: List the available connectors

Enter the following command to list available connectors:

confluent connect plugin list

Step 2: List the connector configuration properties

Enter the following command to show the connector configuration properties:

confluent connect plugin describe <connector-plugin-name>

The command output shows the required and optional configuration properties.

Step 3: Create the connector configuration file

Create a JSON file that contains the connector configuration properties. The following example shows the required connector properties.

{
  "connector.class": "HttpSink",
  "input.data.format": "JSON",
  "name": "HttpSinkConnector_0",
  "kafka.auth.mode": "KAFKA_API_KEY",
  "kafka.api.key": "<my-kafka-api-key>",
  "kafka.api.secret": "<my-kafka-api-secret>",
  "http.api.url": "http:://eshost1:9200/api/messages",
  "request.method": "POST",
  "tasks.max": "1",
  "topics": "orders",
}

Note the following property definitions:

  • "connector.class": Identifies the connector plugin name.

  • "input.data.format": Sets the input Kafka record value format (data coming from the Kafka topic). Valid entries are AVRO, JSON_SR, PROTOBUF, JSON, or BYTES. You must have Confluent Cloud Schema Registry configured if using a schema-based message format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).

    Tip

    Select schemaless JSON to consume STRING data.

  • "name": Sets a name for your new connector.

  • "kafka.auth.mode": Identifies the connector authentication mode you want to use. There are two options: SERVICE_ACCOUNT or KAFKA_API_KEY (the default). To use an API key and secret, specify the configuration properties kafka.api.key and kafka.api.secret, as shown in the example configuration (above). To use a service account, specify the Resource ID in the property kafka.service.account.id=<service-account-resource-ID>. To list the available service account resource IDs, use the following command:

    confluent iam service-account list
    

    For example:

    confluent iam service-account list
    
       Id     | Resource ID |       Name        |    Description
    +---------+-------------+-------------------+-------------------
       123456 | sa-l1r23m   | sa-1              | Service account 1
       789101 | sa-l4d56p   | sa-2              | Service account 2
    
  • "http.api.url": Use an HTTP or HTTPS connection URL. For example, http://eshost1:9200/api/messages or https://eshost3:9200/api/messages. The connector sends the record value to the API as part of the request body. You can specify a static URL (for example, http://eshost1:9200/api/messages) or a dynamic URL (for example, http://eshost1:9200/api/messages/${topic}/${key}). You can also specify a dynamic URL using fields from the Kafka record.

    Note

    • Note that if the connection URL is HTTPS, HTTPS is used for all connections. A URL with no protocol is considered HTTP.
    • For additional information, see HTTP Sink Connector limitations.
  • "request.method": Enter an HTTP API Request Method: PUT, POST, or PATCH. Defaults to POST.

  • "tasks.max": Enter the maximum number of tasks for the connector to use. More tasks may improve performance (that is, consumer lag is reduced with multiple tasks running).

  • "topics": Enter the topic name or a comma-separated list of topic names.

Single Message Transforms: See the Single Message Transforms (SMT) documentation for details about adding SMTs using the CLI.

See Configuration Properties for all property values and descriptions.

Step 3: Load the properties file and create the connector

Enter the following command to load the configuration and start the connector:

confluent connect cluster create --config-file <file-name>.json

For example:

confluent connect cluster create --config-file http-sink-config.json

Example output:

Created connector HttpSinkConnector_0 lcc-do6vzd

Step 4: Check the connector status.

Enter the following command to check the connector status:

confluent connect cluster list

Example output:

ID           |             Name              | Status  | Type | Trace
+------------+-------------------------------+---------+------+-------+
lcc-do6vzd   | HttpSinkConnector_0           | RUNNING | sink |       |

Step 5: Check for records

Verify that records are populating the endpoint.

For more information and examples to use with the Confluent Cloud API for Connect, see the Confluent Cloud API for Managed and Custom Connectors section.

Tip

When you launch a connector, a Dead Letter Queue topic is automatically created. See Confluent Cloud Dead Letter Queue for details.

Configuration Properties

Use the following configuration properties with the fully-managed connector. For self-managed connector property definitions and other details, see the connector docs in Self-managed connectors for Confluent Platform.

Note

These are properties for the fully-managed cloud connector. If you are installing the connector locally for Confluent Platform, see HTTP Sink Connector for Confluent Platform.

Which topics do you want to get data from?

topics

Identifies the topic name or a comma-separated list of topic names.

  • Type: list
  • Importance: high

Schema Config

schema.context.name

Add a schema context name. A schema context represents an independent scope in Schema Registry. It is a separate sub-schema tied to topics in different Kafka clusters that share the same Schema Registry instance. If not used, the connector uses the default schema configured for Schema Registry in your Confluent Cloud environment.

  • Type: string
  • Default: default
  • Importance: medium

Input messages

input.data.format

Sets the input Kafka record value format. Valid entries are AVRO, JSON_SR, PROTOBUF, JSON or BYTES. Note that you need to have Confluent Cloud Schema Registry configured if using a schema-based message format like AVRO, JSON_SR, and PROTOBUF.

  • Type: string
  • Importance: high

How should we connect to your data?

name

Sets a name for your connector.

  • Type: string
  • Valid Values: A string at most 64 characters long
  • Importance: high

Kafka Cluster credentials

kafka.auth.mode

Kafka Authentication mode. It can be one of KAFKA_API_KEY or SERVICE_ACCOUNT. It defaults to KAFKA_API_KEY mode.

  • Type: string
  • Default: KAFKA_API_KEY
  • Valid Values: KAFKA_API_KEY, SERVICE_ACCOUNT
  • Importance: high
kafka.api.key

Kafka API Key. Required when kafka.auth.mode==KAFKA_API_KEY.

  • Type: password
  • Importance: high
kafka.service.account.id

The Service Account that will be used to generate the API keys to communicate with Kafka Cluster.

  • Type: string
  • Importance: high
kafka.api.secret

Secret associated with Kafka API key. Required when kafka.auth.mode==KAFKA_API_KEY.

  • Type: password
  • Importance: high

HTTP server details

http.api.url
  • Type: string
  • Importance: high
request.method
  • Type: string
  • Default: POST
  • Importance: high
headers

HTTP headers to be included in all requests. Individual headers should be separated by the Header Separator

  • Type: string
  • Importance: high
header.separator

Separator character used in headers

  • Type: string
  • Importance: high
sensitive.headers

Sensitive HTTP headers (eg: credentials) to be included in all requests. Individual headers should be separated by the Header Separator

  • Type: password
  • Importance: high
behavior.on.null.values

How to handle records with a non-null key and a null value (i.e. Kafka tombstone records). Valid options are ignore, delete and fail

  • Type: string
  • Default: ignore
  • Importance: low

HTTP server error handling

behavior.on.error

Error handling behavior config for handling error responses from HTTP requests

  • Type: string
  • Default: ignore
  • Importance: medium
report.errors.as

Dictates the content of records produced to the error topic. If set to error_string, the value would be a human readable string describing the failure. The value will include some or all of the following information if available: http response code, reason phrase, submitted payload, url, response content, exception and error message. If set to http_response, the value would be the plain response content for the request which failed to write the record. In both modes, any information about the failure will also be included in the error record’s headers

  • Type: string
  • Default: error_string
  • Importance: medium

HTTP server batches

request.body.format

Used to produce request body in either JSON or String format

  • Type: string
  • Default: string
  • Importance: medium
batch.key.pattern

Pattern used to build the key for a given batch. ${key} and ${topic} can be used to include message attributes here

  • Type: string
  • Importance: high
batch.max.size

The number of records accumulated in a batch before the HTTP API is invoked

  • Type: int
  • Default: 1
  • Importance: high
batch.prefix

Prefix added to record batches. This is applied once at the beginning of the batch of records

  • Type: string
  • Importance: high
batch.suffix

Suffix added to record batches. This is applied once at the end of the batch of records

  • Type: string
  • Importance: high
batch.separator

Separator for records in a batch

  • Type: string
  • Importance: high
batch.json.as.array

Whether or not to use an array to bundle json records. Only used when request.body.format is set to json. This can be disabled only when batch.max.size is set to 1.

  • Type: boolean
  • Importance: high

HTTP server authentication

auth.type
  • Type: string
  • Default: NONE
  • Importance: high
connection.user

The username to be used with an endpoint requiring authentication

  • Type: string
  • Importance: high
connection.password

The password to be used with an endpoint requiring authentication

  • Type: password
  • Importance: high
oauth2.token.url

The URL to be used for fetching OAuth2 token. Client Credentials is the only supported grant type.

  • Type: string
  • Importance: high
oauth2.client.id

The client id used when fetching OAuth2 token

  • Type: string
  • Importance: high
oauth2.client.secret

The secret used when fetching OAuth2 token

  • Type: password
  • Importance: high
oauth2.token.property

The name of the property containing the OAuth2 token returned by the http proxy.

  • Type: string
  • Default: access_token
  • Importance: high
oauth2.client.auth.mode

Specifies how to encode client_id and client_secret in the OAuth2 authorization request. If set to ‘header’, the credentials are encoded as an 'Authorization: Basic <base-64 encoded client_id:client_secret>' HTTP header. If set to ‘url’, then client_id and client_secret are sent as URL encoded parameters.

  • Type: string
  • Default: header
  • Importance: low
oauth2.client.scope

The scope used when fetching OAuth2 token. If empty, this parameter is not set in the authorization request

  • Type: string
  • Default: any
  • Importance: low
oauth2.jwt.enabled

Whether to generate and add JWT token to request. If selected, JWT token will be added as ‘jwt_token’ request param

  • Type: boolean
  • Default: false
  • Importance: medium
oauth2.jwt.keystore.path

Keystore containing private key to use to sign JWT.

  • Type: password
  • Default: [hidden]
  • Importance: medium
oauth2.jwt.keystore.password

Password to access keystore

  • Type: password
  • Default: [hidden]
  • Importance: medium
oauth2.jwt.keystore.type

JWT keystore type

  • Type: string
  • Default: JKS
  • Importance: medium
oauth2.jwt.claimset

JSON containing JWT claims

  • Type: string
  • Default: “”
  • Importance: medium
oauth2.client.headers

HTTP headers to be included in the OAuth2 client endpoint. Individual headers should be separated by OAuth2 Client Headers Separator

  • Type: string
  • Importance: low
oauth2.client.header.separator

Separator character used in OAuth2 Client Headers

  • Type: string
  • Importance: low

HTTP server retries

retry.on.status.codes

The HTTP error codes to retry on. Comma-separated list of codes or range of codes to retry on. Ranges are specified with start and optional end code. Range boundaries are inclusive. For instance, ‘400-‘ includes all codes greater than or equal to 400. ‘400-500’ includes codes from 400 to 500, including 500. Multiple ranges and single codes can be specified together to achieve fine grained control over retry behavior. For example, ‘404,408,500-‘ will retry on 404 NOT FOUND, 408 REQUEST TIMEOUT, and all 5xx error codes

  • Type: string
  • Default: 400-
  • Importance: medium
max.retries

The maximum number of times to retry on errors before failing the task

  • Type: int
  • Default: 3
  • Importance: medium
retry.backoff.ms

The initial duration in milliseconds to wait following an error before a retry attempt is made. Subsequent backoff attempts will be exponentially larger than the first duration. Note that this value is the initial backoff before retrying. After that, the connector will retry using exponential jitter. Jitter adds randomness to the exponential backoff algorithm to prevent synchronized retries.

  • Type: int
  • Default: 3000 (3 seconds)
  • Valid Values: [100,…]
  • Importance: medium
http.connect.timeout.ms

The time in milliseconds to wait for a connection to be established

  • Type: int
  • Default: 30000 (30 seconds)
  • Importance: medium
http.request.timeout.ms

The time in milliseconds to wait for a request response from the server

  • Type: int
  • Default: 30000 (30 seconds)
  • Importance: medium
retry.backoff.policy

The backoff policy to use in terms of retry - CONSTANT_VALUE or EXPONENTIAL_WITH_JITTER

  • Type: string
  • Default: EXPONENTIAL_WITH_JITTER
  • Importance: medium

HTTP server regular expressions

regex.patterns

Regular expression patterns used for replacements in the message sent to the HTTP service. Multiple regular expression patterns can be specified, but must be separated by regex.separator

  • Type: string
  • Importance: medium
regex.replacements

Regex replacements to use with the patterns in regex.patterns. Multiple replacements can be specified, but must be separated by regex.separator. ${key} and ${topic} can be used here.

  • Type: string
  • Importance: medium
regex.separator

Separator character used in regex.patterns and regex.replacements property.

  • Type: string
  • Importance: medium

HTTP server SSL

https.ssl.key.password

The password of the private key in the key store file. This is optional for client

  • Type: password
  • Importance: high
https.ssl.keystorefile

The key store containing server certificate. Only required if using https

  • Type: password
  • Importance: low
https.ssl.keystore.password

The store password for the key store file. This is optional for a client and is only needed if https.ssl.keystore.location is configured

  • Type: password
  • Importance: high
https.ssl.truststorefile

The trust store containing server CA certificate. Only required if using https

  • Type: password
  • Importance: high
https.ssl.truststore.password

The trust store password containing server CA certificate. Only required if using https

  • Type: password
  • Importance: high
https.ssl.protocol

The protocol to use for SSL connections

  • Type: string
  • Default: TLSv1.3
  • Importance: medium
https.host.verifier.enabled

True if SSL host verification should be enabled

  • Type: boolean
  • Default: true
  • Importance: medium

Consumer configuration

max.poll.interval.ms

The maximum delay between subsequent consume requests to Kafka. This configuration property may be used to improve the performance of the connector, if the connector cannot send records to the sink system. Defaults to 300000 milliseconds (5 minutes).

  • Type: long
  • Default: 300000 (5 minutes)
  • Valid Values: [60000,…,1800000]
  • Importance: low
max.poll.records

The maximum number of records to consume from Kafka in a single request. This configuration property may be used to improve the performance of the connector, if the connector cannot send records to the sink system. Defaults to 500 records.

  • Type: long
  • Default: 500
  • Valid Values: [1,…,500]
  • Importance: low

Number of tasks for this connector

tasks.max

Maximum number of tasks for the connector.

  • Type: int
  • Valid Values: [1,…]
  • Importance: high

Next Steps

For an example that shows fully-managed Confluent Cloud connectors in action with Confluent Cloud ksqlDB, see the Cloud ETL Demo. This example also shows how to use Confluent CLI to manage your resources in Confluent Cloud.

../_images/topology.png