Confluent Hub Component Archive Specification

Components published on Confluent Hub can only be downloaded if an archive file is provided by the owner. That archive file must have specific content and organization, including a manifest file with information about the component.

Confluent has provided a Maven plugin that you can use in the Maven builds of your component's source code project to build the manifest and archive file. When you release your software, verify your component satisfies the requirements and submit your component.

The rest of this page describes the archive file, the manifest file, and the Maven plugin in more detail.

Terminology

Component
A packaged implementation of one or more Apache Kafka® or Confluent Platform APIs. Many components include a set of libraries, configuration files, documentation, and other resources that you install into your local system and then configure and use. Some components, however, cannot be downloaded and must be accessed from the component's owner as detailed on Confluent Hub.
Component archive
A ZIP file that contains the packaged component for easy download and installation. Not all components have an archive.
Component owner
The individual or organization providing a component. Owners develop and test a component, and optionally package it into a component archive for publishing on Confluent Hub. The owner name acts as a namespace for all components from that owner and forms part of each component's unique identifier. Owner names are also used as part of the directory name when the component is installed. The name of a component owner must contain two to 255 characters, and must contain only lowercase letters, numbers, and underscore (_) characters. An example is confluentinc.
Component name
Each component has a name that, combined with the owner name, provides a unique identifier for that component relative to all other components in Confluent Hub. Names should be meaningful to end users, and will be used as part of the directory name when the component is installed. The name of a component must contain only lowercase letters, numbers, minus sign (-), and underscore (_) characters. An example is kafka-connect-jdbc.
Component version
Components have versions that can be used to identify a specific release of that component, and should correspond to the same versioning scheme used in version control and releases. They may contain letters, numbers, minus sign (-), period (.), and underscore (_) characters. An example is 5.0.0.
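
The character rules above can be checked mechanically. The following is a minimal sketch, not an official validator: the function names are hypothetical, and it assumes that component names and versions must be non-empty and that versions may use both uppercase and lowercase letters, since the definitions above do not say either way.

```python
import re

# Patterns transcribed from the definitions above.
# Owner: 2 to 255 characters; lowercase letters, numbers, underscore.
OWNER_RE = re.compile(r"[a-z0-9_]{2,255}")
# Name: lowercase letters, numbers, minus sign, underscore (non-empty is an assumption).
NAME_RE = re.compile(r"[a-z0-9_-]+")
# Version: letters, numbers, minus sign, period, underscore (letter case is an assumption).
VERSION_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_owner(owner):
    """Return True if the owner name satisfies the stated rules."""
    return OWNER_RE.fullmatch(owner) is not None


def is_valid_name(name):
    """Return True if the component name uses only the allowed characters."""
    return NAME_RE.fullmatch(name) is not None


def is_valid_version(version):
    """Return True if the version uses only the allowed characters."""
    return VERSION_RE.fullmatch(version) is not None


if __name__ == "__main__":
    print(is_valid_owner("confluentinc"))
    print(is_valid_name("kafka-connect-jdbc"))
    print(is_valid_version("5.0.0"))
```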

Component Archive Format

A component archive is a ZIP file that contains the files that are required by the component, organized into a specific structure. This structure makes it easy to extract the contents of the archive into particular locations within your local installation.

Component Archive Name

The component archive name must follow this convention:

${componentOwner}-${componentName}-${componentVersion}.zip

where:

  • ${componentOwner} is the name of the owner or developer of the component that acts as a namespace and is unique across Confluent Hub.
  • ${componentName} is the name of the component that is unique within the scope of the owner name.
  • ${componentVersion} is the version of the component as chosen by the owner or developer.

For example, in the file named confluentinc-kafka-connect-elasticsearch-3.3.2.zip, the owner is confluentinc, the name is kafka-connect-elasticsearch, and the version is 3.3.2. These are somewhat analogous to the Maven groupId, artifactId, and version.
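
As an illustration of the naming convention, here is a small sketch with hypothetical helper names. It builds the file name from its three parts, and recovers the owner from an existing file name by relying on the rule that owner names cannot contain a minus sign. It does not try to split the name from the version, because both may contain minus signs and the boundary between them is ambiguous without the manifest.

```python
def archive_file_name(owner, name, version):
    """Build the archive file name: ${componentOwner}-${componentName}-${componentVersion}.zip"""
    return "{}-{}-{}.zip".format(owner, name, version)


def owner_from_archive_name(file_name):
    """Return the owner part of an archive file name.

    Owner names may not contain '-', so the owner is everything
    before the first minus sign.
    """
    owner, separator, _rest = file_name.partition("-")
    if not separator:
        raise ValueError("not a component archive name: " + file_name)
    return owner


if __name__ == "__main__":
    file_name = archive_file_name("confluentinc", "kafka-connect-elasticsearch", "3.3.2")
    print(file_name)
    print(owner_from_archive_name(file_name))
```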

Component Archive Structure

Here are the contents of the ZIP file:

confluentinc-kafka-connect-elasticsearch-3.3.2/
    manifest.json
    doc/
        LICENSE
        licenses
        notices
        README.md
        licenses.html
        version.txt
    etc/
        quickstart-elasticsearch.properties
    lib/
        commons-codec-1.9.jar
        commons-lang3-3.4.jar
        commons-logging-1.2.jar
        gson-2.4.jar
        guava-18.0.jar
        httpasyncclient-4.1.1.jar
        httpclient-4.5.1.jar
        httpcore-4.4.4.jar
        httpcore-nio-4.4.4.jar
        jest-2.0.0.jar
        jest-common-2.0.0.jar
        kafka-connect-elasticsearch-3.3.2.jar
        slf4j-api-1.7.25.jar
    assets/
        elastic_logo.png
        confluent_logo.jpg
        apache_logo.gif

The doc/ directory contains any sort of human-readable documents concerning the component, such as READMEs and licenses.

The etc/ directory contains any sample or default properties files that may be helpful to include with the component.

The lib/ directory contains all of the files necessary for running the component. In this example, that means the Elasticsearch connector JAR (kafka-connect-elasticsearch-3.3.2.jar) and all of its runtime dependencies, excluding those already provided by the Kafka Connect framework. Notice that no JAR for the Kafka Connect API (e.g., connect-api-1.0.1.jar) is included; it is assumed that the Kafka Connect framework provides it when the connector runs.

The assets/ directory includes any asset files that should be hosted by Confluent Hub. This is currently limited to logo files. Files in subdirectories of assets/ are not included.

The manifest.json file contains all of the metadata to be associated with the component and made available on Confluent Hub.
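
One way to inspect an archive against this layout is sketched below, using Python's standard zipfile module. The function names and the shape of the returned summary are hypothetical. The sketch assumes, based on the example listing, that the top-level directory is named after the archive file without its .zip extension. It only reports what it finds, because this page does not state which of the directories are mandatory.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Directories described above; which of them are mandatory is not stated.
KNOWN_DIRS = ("doc", "etc", "lib", "assets")


def summarize_entries(entry_names, root):
    """Summarize ZIP entry names relative to the expected root directory."""
    has_manifest = False
    dirs_present = set()
    unexpected = []
    for entry in entry_names:
        parts = PurePosixPath(entry).parts
        if not parts or parts[0] != root:
            unexpected.append(entry)  # outside the expected top-level directory
            continue
        rel = parts[1:]
        if rel == ("manifest.json",):
            has_manifest = True
        elif rel and rel[0] in KNOWN_DIRS:
            dirs_present.add(rel[0])
        elif rel:
            unexpected.append(entry)  # inside the root, but not a described location
    return {
        "has_manifest": has_manifest,
        "dirs_present": sorted(dirs_present),
        "unexpected": unexpected,
    }


def summarize_archive(path):
    """Open a component archive and summarize its layout."""
    root = Path(path).stem  # file name without the trailing .zip
    with zipfile.ZipFile(path) as archive:
        return summarize_entries(archive.namelist(), root)
```

Calling summarize_archive on a local archive file returns a dictionary that shows whether manifest.json was found, which of the described directories are present, and any entries that fall outside the described layout.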

Component Manifests

A sample manifest

Here are the contents of the manifest.json file:

{
  "component_types": [
    "sink"
  ],

  "description": "This is a connector for getting data out of Apache Kafka into Elasticsearch.\nIt is built off of the Kafka Connect framework, and therefore automatically supports pluggable encoding converters, single message transforms, and other useful features.\nElasticsearch 5.x is required (6.x is not supported due to a known issue).",

  "docker_image": {
    "tag": "3.3.2",
    "name": "kafka-connect-elasticsearch",
    "namespace": "confluentinc",
    "registry": "hub.docker.io"
  },

  "documentation_url": "https://docs.confluent.io/3.3.1/connect/connect-elasticsearch/docs/index.html",

  "features": {
    "confluent_control_center_integration": true,
    "delivery_guarantee": ["at_least_once"],
    "kafka_connect_api": true,
    "single_message_transforms": true,
    "supported_encodings": ["any"]
  },

  "license": [
    {
      "name": "Apache License, Version 2.0",
      "url": "http://www.apache.org/licenses/LICENSE-2.0",
      "logo": "assets/apache_logo.gif"
    }
  ],

  "logo": "assets/elastic_logo.png",

  "name": "kafka-connect-elasticsearch",

  "owner": {
    "logo": "assets/confluent_logo.jpg",
    "name": "Confluent, Inc.",
    "type": "organization",
    "url": "https://confluent.io/",
    "username": "confluentinc"
  },

  "requirements": [
    "Elasticsearch 5.x"
  ],

  "support": {
    "logo": "assets/confluent_logo.jpg",
    "provider_name": "Confluent, Inc.",
    "summary": "We provide full support for this connector, alongside community members who contribute to it as an open-source project.",
    "url": "https://confluent.io/"
  },

  "tags": [
    "analytics",
    "elasticsearch",
    "elastic"
  ],

  "title": "Kafka Connect Elasticsearch",

  "version": "3.3.2"
}

Manifest Field Descriptions

Here’s what each of the fields in the provided manifest.json file should contain:

component_types
Required. Confluent Hub currently contains four different categories of component: sink connector, source connector, transformer, and converter. This field is a list of one or more of the following values: “sink”, “source”, “transform”, and “converter”. A component can qualify for several different types. For example, one might contain a sink connector, a source connector, and multiple transformations.
description
Required. The description is one of the most essential elements of the manifest, and this is among the most prominent content shown on Confluent Hub. It provides a summary of the component’s functionality and features, as well as any information that can’t be specified in any other part of the manifest. We recommend the first short paragraph provide a good overview, and that subsequent paragraphs list features, capabilities, limitations, and external requirements in detailed, human-readable sentences. This is where you convince users to use your component rather than other components that serve the same or a similar purpose.
docker_image
Information on a Docker image available for the component.
docker_image.namespace
The namespace that the Docker image can be found under.
docker_image.name
The name of the Docker image.
docker_image.tag
The tag of the Docker image that corresponds to this version of the component.
docker_image.registry
The registry that contains the Docker image.
documentation_url
A link to documentation for the component. Can be anything from a custom web page to a GitHub Wiki.
features
These are specific features that are tracked by Confluent and used to help compare components. Any other features that make a component noteworthy should be detailed in the description field.
features.supported_encodings
The serialized encodings the component supports. If it is a Kafka Connect connector, the availability of pluggable converters means that effectively any encoding is supported, as in the sample manifest, which specifies "any". If, however, the component is not a Kafka Connect connector and is limited in which encodings it supports, they should be specified here.
features.delivery_guarantee
What delivery guarantees the component supports. Can be empty if no guarantees are made, or a list containing one or both of “at_least_once” and “exactly_once”.
features.kafka_connect_api
Whether the component is built on top of the Kafka Connect API.
features.confluent_control_center_integration
Whether the component supports integration with Confluent Control Center.
features.single_message_transforms
Whether the component supports single message transforms. If it is a Kafka Connect connector, this is automatically true.
logo
A logo to display on Confluent Hub for the component. If the component is a Kafka Connect connector, this should generally be a logo for the source or sink system of the connector, since the relation to Apache Kafka® is already implied.
license
A list of licenses associated with the component. The example only contains one, but several can be specified if need be. It is strongly recommended that at least one license be provided.
license.name
The human-readable name of the license.
license.url
A link to the actual text of the license.
license.logo
A logo for the license.
name
Required. This is the ${componentName} portion of the component's archive filename. It is also the name used to refer to the component via the back-end REST API; because of this, the characters it may contain are restricted, and the result is often less readable than a human-friendly name would be.
owner
Information regarding the owner of the component.
owner.username
Required. This is the ${componentOwner} portion of the component’s archive filename. Similar to the name field, it is restricted to work as part of a resource for a REST API.
owner.type
Required. The type of owner for the component; either "organization" or "user".
owner.name
Required. As the title field relates to the name field, the owner.name field relates to the owner.username field. This is a friendlier, human-readable way to present who owns the component, and will be used on Confluent Hub. If left unspecified, it defaults to the value of the owner.username field.
owner.url
A website or web page to associate with the owner. Useful for attracting traffic from people who want to know more about who provided the component, possibly to a company’s website (as in this example) or a public GitHub profile (for an individual user who would like recognition for their work).
owner.logo
An image to associate with the component owner.
requirements
A list of requirements given in a concise, bullet-listable format.
support
Information on who, if anyone, provides support for the component, and what kind of support is provided.
support.provider_name
A friendly, human-readable name for who supports the component. Can be Community if the component is fully open-sourced and maintained by voluntary contributors.
support.summary
What exactly users should expect in terms of support for this component. Should be as detailed as possible with regard to how to file bug reports, ask questions, etc.
support.url
Where the users should go for support. If a public Jira is used for filing bug reports, for example, this should link there. Another option would be a link to a GitHub issues page.
support.logo
An image to associate with the support provider.
tags
A list of relevant search tags for the component.
title
Required. The human-readable name for the component, which will be displayed for it on Confluent Hub. This can essentially be thought of as a less restrictive version of the name field and, if not specified, will default to the name field.
version
Required. The version of the component. This is used as the ${componentVersion} portion of the component’s archive filename. Version is an important attribute since multiple versions of the same component can be uploaded, and Confluent Hub displays the most recent version of each component by default.
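
The field descriptions above can be turned into a simple pre-submission check. The sketch below is illustrative only and is not the validation that Confluent Hub performs: the function names and message wording are hypothetical. It treats every field marked Required as mandatory, even though owner.name and title are also described as having defaults, and it checks the enumerated values given for component_types, owner.type, and features.delivery_guarantee.

```python
# Fields marked "Required" in the descriptions above, as dotted paths.
REQUIRED_FIELDS = [
    "component_types", "description", "name", "title", "version",
    "owner.username", "owner.type", "owner.name",
]
COMPONENT_TYPES = ("sink", "source", "transform", "converter")
OWNER_TYPES = ("organization", "user")
DELIVERY_GUARANTEES = ("at_least_once", "exactly_once")


def lookup(manifest, dotted_path):
    """Return the value at a dotted path such as 'owner.type', or None."""
    node = manifest
    for key in dotted_path.split("."):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def manifest_problems(manifest):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = lookup(manifest, field)
        if value is None or value == "" or value == []:
            problems.append("missing required field: " + field)

    types = lookup(manifest, "component_types")
    if isinstance(types, list):
        for component_type in types:
            if component_type not in COMPONENT_TYPES:
                problems.append("unknown component type: " + str(component_type))

    owner_type = lookup(manifest, "owner.type")
    if owner_type not in (None, "") and owner_type not in OWNER_TYPES:
        problems.append("unknown owner type: " + str(owner_type))

    guarantees = lookup(manifest, "features.delivery_guarantee")
    if isinstance(guarantees, list):
        for guarantee in guarantees:
            if guarantee not in DELIVERY_GUARANTEES:
                problems.append("unknown delivery guarantee: " + str(guarantee))
    return problems
```

To use it, load manifest.json with json.load from the standard library and pass the resulting dictionary to manifest_problems. An empty list means that none of the checks in this sketch failed.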

Logo Files

Logos (the logo, support.logo, owner.logo, and license.logo fields) are specified as references to files contained in the assets/ directory of the component archive. Confluent Hub hosts these images and displays them on its site on the relevant pages for the component. At this time, all logo files must be provided in the assets/ directory and not one of its subdirectories. External URLs or files placed in other locations will not be recognized.

Any valid image format is allowed for a logo file, as long as the corresponding extension is contained in its filename. Logos must be at least 400 pixels wide and 200 pixels tall, and can be no more than 45 MB in size.
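
The path rules for logo references can be checked with a short sketch like the following. The function names are hypothetical. It verifies only that each reference points directly into assets/ and that the filename has an extension. It does not check pixel dimensions or file size, which would require reading the image files themselves.

```python
from pathlib import PurePosixPath


def collect_logo_references(manifest):
    """Gather the logo, owner.logo, support.logo, and license.logo values."""
    references = []
    if "logo" in manifest:
        references.append(("logo", manifest["logo"]))
    for section in ("owner", "support"):
        value = manifest.get(section)
        if isinstance(value, dict) and "logo" in value:
            references.append((section + ".logo", value["logo"]))
    for entry in manifest.get("license", []):
        if isinstance(entry, dict) and "logo" in entry:
            references.append(("license.logo", entry["logo"]))
    return references


def logo_reference_problems(manifest):
    """Return problems with logo references; empty if none were found."""
    problems = []
    for field, reference in collect_logo_references(manifest):
        path = PurePosixPath(str(reference))
        # Must be exactly assets/<file>, with no subdirectory and no external URL.
        if len(path.parts) != 2 or path.parts[0] != "assets":
            problems.append(field + ": must be a file directly in assets/: " + str(reference))
        elif not path.suffix:
            problems.append(field + ": filename has no extension: " + str(reference))
    return problems
```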

Verification Level

The confluent_verified nested document is added by Confluent Hub based upon the owner’s participation in Confluent’s Partner Program, in which Confluent verifies the functionality of the components. Although it will appear in manifests hosted by Confluent Hub, you should not specify a value for it in a manifest you submit; Confluent will add it automatically.

Maven Packaging Plugin

Confluent provides a Maven plugin for automatically packaging your component in the archive format, including asset files, an auto-generated manifest, config files, READMEs, licenses, and JAR files. If you use Maven to build your project and would like to integrate our packaging plugin into your build process, get the Maven plugin from Maven Central, and see its documentation. Check out Confluent's Elasticsearch connector for an example of how it's used.