Confluent Hub Component Archive Specification¶
Components published on Confluent Hub can only be downloaded if an archive file is provided by the owner. That archive file must have specific content and organization, including a manifest file with information about the component.
Confluent has provided a Maven plugin that you can use in the Maven builds of your component’s source code project to build the manifest and archive file. When you release your software, verify your component satisfies the requirements and submit your component.
The rest of this page describes the archive file, the manifest file, and the Maven plugin in more detail.
Terminology¶
- Component
- A packaged implementation of one or more Apache Kafka® or Confluent Platform APIs. Many components include a set of libraries, configuration files, documentation, and other resources that you install into your local system and then configure and use. Some components, however, cannot be downloaded and must be accessed from the component’s owner as detailed on Confluent Hub.
- Component archive
- A ZIP file that contains the packaged component for easy download and installation. Not all components have an archive.
- Component owner
- The individual or organization providing a component. Owners develop and test a component,
and optionally package it into a component archive for publishing on Confluent Hub.
Owner names namespace all components from that owner into a unique identifier.
Owner names are used as part of the directory name when the component is installed.
The name of a component owner must contain two to 255 characters, and must contain only lowercase letters, numbers, and underscore (_) characters. An example is confluentinc.
- Component name
- Each component has a name that, combined with the owner name, provides a unique identifier for that component relative to all other components in Confluent Hub. Names should be meaningful to end users, and will be used as part of the directory name when the component is installed. The name of a component must contain only lowercase letters, numbers, minus sign (-), and underscore (_) characters. An example is kafka-connect-jdbc.
- Component version
- Components have versions that can be used to identify a specific release of that component, and should correspond to the same versioning scheme used in version control and releases. They may contain letters, numbers, minus sign (-), period (.), and underscore (_) characters. An example is 5.0.0.
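The naming rules above can be checked mechanically before you publish. The following is a minimal sketch, not an official validator: the function names and regular expressions are illustrative and simply restate the rules in this section. It assumes that version strings may use both uppercase and lowercase letters and that component names must be non-empty, neither of which is spelled out above.

```python
import re

# Patterns restating the rules in the Terminology section (illustrative only).
OWNER_RE = re.compile(r"[a-z0-9_]{2,255}")   # 2 to 255 chars: lowercase, digits, underscore
NAME_RE = re.compile(r"[a-z0-9_-]+")         # lowercase, digits, minus sign, underscore
VERSION_RE = re.compile(r"[A-Za-z0-9._-]+")  # assumption: letters of either case are allowed


def is_valid_owner(owner: str) -> bool:
    """Return True if the owner name follows the stated character and length rules."""
    return OWNER_RE.fullmatch(owner) is not None


def is_valid_name(name: str) -> bool:
    """Return True if the component name uses only the permitted characters."""
    return NAME_RE.fullmatch(name) is not None


def is_valid_version(version: str) -> bool:
    """Return True if the version uses only the permitted characters."""
    return VERSION_RE.fullmatch(version) is not None


if __name__ == "__main__":
    print(is_valid_owner("confluentinc"))
    print(is_valid_name("kafka-connect-jdbc"))
    print(is_valid_version("5.0.0"))
```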
Component Archive Format¶
A component archive is a ZIP file that contains the files that are required by the component, organized into a specific structure. This structure makes it easy to extract the contents of the archive into particular locations within your local installation.
Component Archive Name¶
The component archive name must follow this convention:
${componentOwner}-${componentName}-${componentVersion}.zip
where:
- ${componentOwner} is the name of the owner or developer of the component that acts as a namespace and is unique across Confluent Hub.
- ${componentName} is the name of the component that is unique within the scope of the owner name.
- ${componentVersion} is the version of the component as chosen by the owner or developer.
For example, in the file named confluentinc-kafka-connect-elasticsearch-3.3.2.zip, the owner is confluentinc, the name is kafka-connect-elasticsearch, and the version is 3.3.2. These are somewhat analogous to the Maven groupId, artifactId, and version.
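As a small illustration of the convention, the sketch below assembles an archive filename from its three parts. The helper names are hypothetical. Going the other way is only partly possible: because owner names cannot contain a minus sign, the owner is everything before the first one, but component names and versions may both contain minus signs, so the boundary between them cannot be recovered from the filename alone.

```python
def archive_name(owner: str, name: str, version: str) -> str:
    """Build the archive filename: ${componentOwner}-${componentName}-${componentVersion}.zip"""
    return f"{owner}-{name}-{version}.zip"


def owner_of(archive_filename: str) -> str:
    """Extract the owner, relying on the rule that owner names never contain '-'."""
    return archive_filename.split("-", 1)[0]


if __name__ == "__main__":
    filename = archive_name("confluentinc", "kafka-connect-elasticsearch", "3.3.2")
    print(filename)
    print(owner_of(filename))
```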
Component Archive Structure¶
Here are the contents of the ZIP file:
confluentinc-kafka-connect-elasticsearch-3.3.2/
    manifest.json
    doc/
        LICENSE
        licenses
        notices
        README.md
        licenses.html
        version.txt
    etc/
        quickstart-elasticsearch.properties
    lib/
        commons-codec-1.9.jar
        commons-lang3-3.4.jar
        commons-logging-1.2.jar
        gson-2.4.jar
        guava-18.0.jar
        httpasyncclient-4.1.1.jar
        httpclient-4.5.1.jar
        httpcore-4.4.4.jar
        httpcore-nio-4.4.4.jar
        jest-2.0.0.jar
        jest-common-2.0.0.jar
        kafka-connect-elasticsearch-3.3.2.jar
        slf4j-api-1.7.25.jar
    assets/
        elastic_logo.png
        confluent_logo.jpg
        apache_logo.gif
The doc/ directory contains any sort of human-readable documents concerning the component, such as READMEs and licenses.
The etc/ directory contains any sample or default properties files that may be helpful to include with the component.
The lib/ directory contains all of the files necessary for actually running the component. In this example, that means the Elasticsearch connector JAR (kafka-connect-elasticsearch-3.3.2.jar) and all of its runtime dependencies, excluding those already provided by the Kafka Connect framework. Notice that there is no JAR for the Kafka Connect API included (e.g., connect-api-1.0.1.jar); it’s assumed that this will already be provided by the Kafka Connect framework when the connector is run.
The assets/ directory includes any asset files that should be hosted by Confluent Hub. This is currently limited to logo files. Subdirectories are not included.
The manifest.json file contains all of the metadata to be associated with the component and made available on Confluent Hub.
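A quick layout check can catch packaging mistakes before submission. The sketch below uses Python's standard zipfile module and is only an illustration: the function name is hypothetical, and the specific checks (a single top-level directory, a manifest.json inside it, at least one JAR under lib/, and no files in subdirectories of assets/) are assumptions drawn from the example layout above rather than the rules Confluent Hub itself enforces.

```python
import zipfile


def check_archive_layout(zip_path: str) -> list:
    """Return a list of layout problems found in the archive (empty if none)."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()

    # Every entry is expected to live under one top-level directory.
    roots = {entry.split("/", 1)[0] for entry in names}
    if len(roots) != 1:
        return [f"expected one top-level directory, found: {sorted(roots)}"]
    root = next(iter(roots))

    problems = []
    if f"{root}/manifest.json" not in names:
        problems.append("missing manifest.json")

    # Assumption: a runnable component ships at least one JAR in lib/.
    lib_prefix = f"{root}/lib/"
    if not any(e.startswith(lib_prefix) and e.endswith(".jar") for e in names):
        problems.append("no JAR files under lib/")

    # Files in subdirectories of assets/ are not picked up, so flag them.
    assets_prefix = f"{root}/assets/"
    for entry in names:
        if entry.startswith(assets_prefix):
            rest = entry[len(assets_prefix):].rstrip("/")
            if "/" in rest:
                problems.append(f"asset in a subdirectory: {entry}")
    return problems
```

Calling check_archive_layout with the path to a built archive returns an empty list when none of these problems are found.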
Component Manifests¶
A sample manifest¶
Here are the contents of the manifest.json file:
{
  "component_types": [
    "sink"
  ],
  "description": "This is a connector for getting data out of Apache Kafka into Elasticsearch.\nIt is built off of the Kafka Connect framework, and therefore automatically supports pluggable encoding converters, single message transforms, and other useful features.\nElasticsearch 5.x is required (6.x is not supported due to a known issue).",
  "docker_image": {
    "tag": "3.3.2",
    "name": "kafka-connect-elasticsearch",
    "namespace": "confluentinc",
    "registry": "hub.docker.io"
  },
  "documentation_url": "https://docs.confluent.io/3.3.1/connect/connect-elasticsearch/docs/index.html",
  "features": {
    "confluent_control_center_integration": true,
    "delivery_guarantee": ["at_least_once"],
    "kafka_connect_api": true,
    "single_message_transforms": true,
    "supported_encodings": ["any"]
  },
  "license": [
    {
      "name": "Apache License, Version 2.0",
      "url": "http://www.apache.org/licenses/LICENSE-2.0",
      "logo": "assets/apache_logo.gif"
    }
  ],
  "logo": "assets/elastic_logo.png",
  "name": "kafka-connect-elasticsearch",
  "owner": {
    "logo": "assets/confluent_logo.jpg",
    "name": "Confluent, Inc.",
    "type": "organization",
    "url": "https://confluent.io/",
    "username": "confluentinc"
  },
  "requirements": [
    "Elasticsearch 5.x"
  ],
  "support": {
    "logo": "assets/confluent_logo.jpg",
    "provider_name": "Confluent, Inc.",
    "summary": "We provide full support for this connector, alongside community members who contribute to it as an open-source project.",
    "url": "https://confluent.io/"
  },
  "tags": [
    "analytics",
    "elasticsearch",
    "elastic"
  ],
  "title": "Kafka Connect Elasticsearch",
  "version": "3.3.2"
}
Manifest Field Descriptions¶
Here’s what each of the fields in the provided manifest.json file should contain:
component_types
- Required. Confluent Hub currently contains four different categories of component: sink connector, source connector, transformer, and converter. This field is a list of one or more of the following values: "sink", "source", "transform", and "converter". A component can qualify for several different types. For example, one might contain a sink connector, a source connector, and multiple transformations.
description
- Required. The description is one of the most essential elements of the manifest, and this is among the most prominent content shown on Confluent Hub. It provides a summary of the component’s functionality and features, as well as any information that can’t be specified in any other part of the manifest. Confluent recommends the first short paragraph provide a good overview, and that subsequent paragraphs list features, capabilities, limitations, and external requirements in detailed, human-readable sentences. This is where you convince users to use your component rather than other components that serve the same or a similar purpose.
docker_image
- Information on a Docker image available for the component.
docker_image.namespace
- The namespace that the Docker image can be found under.
docker_image.name
- The name of the Docker image.
docker_image.tag
- The tag of the Docker image that corresponds to this version of the component.
docker_image.registry
- The registry that contains the Docker image.
documentation_url
- A link to documentation for the component. Can be anything from a custom web page to a GitHub Wiki.
features
- These are specific features that are tracked by Confluent and used to help compare components. Any other features that make a component noteworthy should be detailed in the description field.
features.supported_encodings
- The serialized encodings the component supports. If it is a Kafka Connect connector, the availability of pluggable converters means that basically any encoding is supported. If, however, the component is not a Kafka Connect connector and is limited in which encodings it supports, they should be specified here.
features.delivery_guarantee
- What delivery guarantees the component supports. Can be empty if no guarantees are made, or a list containing one or both of “at_least_once” and “exactly_once”.
features.kafka_connect_api
- Whether the component is built on top of the Kafka Connect API.
features.confluent_control_center_integration
- Whether the component supports integration with Confluent Control Center.
features.single_message_transforms
- Whether the component supports single message transforms. If it is a Kafka Connect connector, this is automatically true.
logo
- A logo to display on Confluent Hub for the component. If the component is a Kafka Connect connector, this should probably be a logo for the source/sink of the connector, since the relation to Apache Kafka® will be implied already.
license
- A list of licenses associated with the component. The example only contains one, but several can be specified if need be. It is strongly recommended that at least one license be provided.
license.name
- The human-readable name of the license.
license.url
- A link to the actual text of the license.
license.logo
- A logo for the license.
name
- Required. This is the ${componentName} portion of the component’s archive filename. It is also the name used to refer to the component via the back-end REST API; because of this, restrictions are placed on what characters it can consist of, and the result is often slightly less than what may be desired for human readability.
owner
- Information regarding the owner of the component.
owner.username
- Required. This is the ${componentOwner} portion of the component’s archive filename. Similar to the name field, it is restricted to work as part of a resource for a REST API.
owner.type
- Required. The type of owner for the component; either "organization" or "user".
owner.name
- Required. As the title field relates to the name field, the owner.name field relates to the owner.username field. This is a friendlier, human-readable way to present who owns the component, and will be used on Confluent Hub. If left unspecified, will default to the owner.username field.
owner.url
- A website or web page to associate with the owner. Useful for attracting traffic from people who want to know more about who provided the component, possibly to a company’s website (as in this example) or a public GitHub profile (for an individual user who would like recognition for their work).
owner.logo
- An image to associate with the component owner.
requirements
- A list of requirements given in a concise, bullet-listable format.
support
- Information on who, if anyone, provides support for the component, and what kind of support is provided.
support.provider_name
- A friendly, human-readable name for who supports the component. Can be Community if the component is fully open-sourced and maintained by voluntary contributors.
support.summary
- What exactly users should expect in terms of support for this component. Should be as detailed as possible with regards to how to file bug reports, ask questions, etc.
support.url
- Where the users should go for support. If a public Jira is used for filing bug reports, for example, this should link there. Another option would be a link to a GitHub issues page.
support.logo
- An image to associate with the support provider.
tags
- A list of relevant search tags for the component.
title
- Required. The human-readable name for the component, which will be displayed for it on Confluent Hub. This can essentially be thought of as a less restrictive version of the name field and, if not specified, will default to the name field.
version
- Required. The version of the component. This is used as the ${componentVersion} portion of the component’s archive filename. Version is an important attribute since multiple versions of the same component can be uploaded, and Confluent Hub displays the most recent version of each component by default.
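The field descriptions above translate fairly directly into a pre-submission check. The following sketch is illustrative only and is not the validation that Confluent Hub performs: the function and constant names are hypothetical, and it treats every field marked Required as mandatory, even though the text notes that title and owner.name fall back to other fields when unspecified.

```python
import json

# Fields marked "Required" in the descriptions above.
REQUIRED_TOP_LEVEL = ("component_types", "description", "name", "title", "version")
REQUIRED_OWNER = ("username", "type", "name")

# Permitted values listed in the descriptions above.
COMPONENT_TYPES = {"sink", "source", "transform", "converter"}
OWNER_TYPES = {"organization", "user"}
DELIVERY_GUARANTEES = {"at_least_once", "exactly_once"}


def check_manifest(manifest: dict) -> list:
    """Return a list of problems found in a parsed manifest (empty if none)."""
    problems = []
    for field in REQUIRED_TOP_LEVEL:
        if not manifest.get(field):
            problems.append(f"missing required field: {field}")

    owner = manifest.get("owner") or {}
    for field in REQUIRED_OWNER:
        if not owner.get(field):
            problems.append(f"missing required field: owner.{field}")

    unknown_types = set(manifest.get("component_types") or []) - COMPONENT_TYPES
    if unknown_types:
        problems.append(f"unknown component_types: {sorted(unknown_types)}")

    if owner.get("type") and owner["type"] not in OWNER_TYPES:
        problems.append(f"unknown owner.type: {owner['type']}")

    guarantees = (manifest.get("features") or {}).get("delivery_guarantee") or []
    unknown_guarantees = set(guarantees) - DELIVERY_GUARANTEES
    if unknown_guarantees:
        problems.append(f"unknown delivery_guarantee: {sorted(unknown_guarantees)}")
    return problems


def check_manifest_file(path: str) -> list:
    """Load manifest.json from disk and run the same checks."""
    with open(path, encoding="utf-8") as handle:
        return check_manifest(json.load(handle))
```

Run against the sample manifest shown earlier, check_manifest should report no problems.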
Logo Files¶
Logo fields in the manifest (logo, support.logo, owner.logo, and license.logo) are specified as references to files contained in the assets/ directory of the component archive. Confluent Hub hosts these images and displays them on its site on the relevant pages for the component. At this time, all logo files must be provided in the assets/ directory and not one of its subdirectories. External URLs or files placed in other locations will not be recognized.
Any valid image format is allowed for a logo file, as long as the corresponding extension is contained in its filename. Logos must be at least 400 pixels wide and 200 pixels tall, and can be no more than 45 MB in size.
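The path rules for logos can be checked without opening any image. The sketch below is an illustration with hypothetical helper names: it gathers the logo references from a parsed manifest and verifies only that each one points directly into assets/ and has a file extension. It does not check pixel dimensions or file size, which would require reading the image files themselves.

```python
import posixpath


def logo_references(manifest: dict) -> dict:
    """Collect every logo reference in a parsed manifest, keyed by field path."""
    refs = {}
    if manifest.get("logo"):
        refs["logo"] = manifest["logo"]
    for section in ("owner", "support"):
        logo = (manifest.get(section) or {}).get("logo")
        if logo:
            refs[f"{section}.logo"] = logo
    for index, entry in enumerate(manifest.get("license") or []):
        if entry.get("logo"):
            refs[f"license[{index}].logo"] = entry["logo"]
    return refs


def is_valid_logo_path(path: str) -> bool:
    """True if the path is a file directly inside assets/ with a file extension."""
    directory, filename = posixpath.split(path)
    _, extension = posixpath.splitext(filename)
    return directory == "assets" and extension != ""


if __name__ == "__main__":
    manifest = {
        "logo": "assets/elastic_logo.png",
        "owner": {"logo": "assets/confluent_logo.jpg"},
        "license": [{"logo": "assets/apache_logo.gif"}],
    }
    for field, path in logo_references(manifest).items():
        print(field, path, is_valid_logo_path(path))
```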
Verification Level¶
The confluent_verified nested document is added by Confluent Hub based upon the owner’s participation in Confluent’s Partner Program, in which Confluent verifies the functionality of the components. Although it will appear in manifests hosted by Confluent Hub, you should not specify a value for it in a manifest you submit; Confluent will add it automatically.
Maven Packaging Plugin¶
Confluent provides a Maven plugin for automatically packaging your component in the archive format, including asset files, an auto-generated manifest, config files, READMEs, licenses, and JAR files. If you use Maven to build your project and would like to integrate our packaging plugin into your build process, get the Maven plugin from Maven Central, and see its documentation. Check out Confluent’s Elasticsearch connector for an example of how it’s used.